add a method to access Google Cloud voice API with credential file by thzjy · Pull Request #2598 · zhayujie/chatgpt-on-wechat

thzjy · 2025-05-28T12:37:55Z

No description provided.

safe1ine · 2026-02-06T09:43:45Z

@MonkeyCode-AI please review this.

MonkeyCode-AI · 2026-02-06T09:44:09Z

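MonkeyCode-AI is analyzing the task...

MonkeyCode-AI

I am MonkeyCode AI, a programming assistant. You can summon me in a GitHub repository PR by mentioning @MonkeyCode-AI.

For task execution details, see: https://monkeycode-ai.com/tasks/public?id=efa8295a-4056-437e-ba5b-9933898aba10

Code review results

The new Google Cloud STT/TTS adapter implements the basic functionality, but writing the credentials environment variable at module import time introduces security and deployment side effects, and reliability issues such as audio encoding consistency and temporary file cleanup should be fixed before merging.

✨ Code highlights

Both STT and TTS are wrapped as methods of a Voice subclass, and results are returned uniformly to callers via Reply/ReplyType
AMR/MP3 input is converted to WAV before recognition, a basic approach to format compatibility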

🚨 Critical	⚠️ Warning	💡 Suggestion
1	2	1

MonkeyCode-AI · 2026-02-06T09:46:50Z

voice/google/google_voice2.py

+cred_path = os.path.join(os.path.dirname(__file__), "google-credentials.json")
+os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = cred_path


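Caution
🚨 Importing the module sets GOOGLE_APPLICATION_CREDENTIALS and hard-depends on a credentials file inside the repository directory, which creates security and deployment risks and side effects

The module's top level builds the google-credentials.json path and writes it to os.environ["GOOGLE_APPLICATION_CREDENTIALS"]: 1) importing the module has a global side effect that affects other Google SDK clients and modules in the same process; 2) it hard-depends on a credentials file in the code directory, which easily breaks in containers, production environments, and read-only file systems; 3) it encourages placing the key file in the repository directory, with a high risk of credential leakage. A more reasonable approach is Application Default Credentials (ADC); if a key file must be specified, pass it explicitly through configuration or parameters and create the clients with explicit credentials at initialization, rather than changing a global environment variable.

Suggestion: remove the module-level environment variable write; let __init__ accept a key file path from an environment variable or configuration and create the clients with service_account.Credentials.from_service_account_file; if no path is provided, fall back to default credentials (ADC).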

Suggested change

-cred_path = os.path.join(os.path.dirname(__file__), "google-credentials.json")
-os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = cred_path
+import os
+import time
+import uuid
+from google.cloud import speech
+from google.cloud import texttospeech
+from google.api_core.exceptions import GoogleAPIError
+from google.oauth2 import service_account
+from pydub import AudioSegment
+from bridge.reply import Reply, ReplyType
+from common.log import logger
+from common.tmp_dir import TmpDir
+from voice.voice import Voice
+class GoogleVoice(Voice):
+    def __init__(self, credentials_path: str | None = None):
+        super().__init__()
+        credentials_path = credentials_path or os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
+        if credentials_path:
+            credentials = service_account.Credentials.from_service_account_file(credentials_path)
+            self.speech_client = speech.SpeechClient(credentials=credentials)
+            self.tts_client = texttospeech.TextToSpeechClient(credentials=credentials)
+        else:
+            self.speech_client = speech.SpeechClient()
+            self.tts_client = texttospeech.TextToSpeechClient()

MonkeyCode-AI · 2026-02-06T09:46:50Z

voice/google/google_voice2.py

+            )
+
+            # Run speech recognition
+            response = self.speech_client.recognize(config=config, audio=audio)


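Warning
⚠️ The recognition config is hard-coded to LINEAR16/16000, but the actual encoding of the input WAV may not match, which can cause recognition to fail or degrade

convert_audio_to_wav only sets the sample rate and channel count and exports a WAV; it does not explicitly guarantee 16-bit PCM (LINEAR16) output. When the input is already a WAV, it is sent to the API as LINEAR16/16000 without any conversion, so a different sample rate or a compressed encoding will cause recognition errors or lower quality.

Suggestion: convert all input to PCM16 mono 16000 Hz and keep the config consistent with the converted parameters; at minimum, run the conversion in the wav branch too so the result is consistent.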
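
As a minimal sketch of the consistency check described above, assuming only Python's standard wave module, a WAV header could be inspected before the file is sent with a LINEAR16/16000 config. The helper name wav_matches_linear16 is hypothetical and not part of the PR, and the sketch only detects a mismatch; the conversion itself would still be done by convert_audio_to_wav.

```python
import wave


def wav_matches_linear16(path, sample_rate=16000, channels=1):
    """Return True if `path` is uncompressed 16-bit PCM WAV with the expected
    sample rate and channel count (hypothetical helper, not from the PR)."""
    try:
        with wave.open(path, "rb") as wav:
            return (
                wav.getsampwidth() == 2              # 2 bytes per sample = 16-bit
                and wav.getframerate() == sample_rate
                and wav.getnchannels() == channels
            )
    except (wave.Error, EOFError):
        # wave.Error covers non-RIFF files and compressed WAV encodings;
        # EOFError covers empty or truncated files.
        return False
```

A caller could run the conversion whenever this returns False, including for inputs whose extension is already .wav.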

MonkeyCode-AI · 2026-02-06T09:46:50Z

voice/google/google_voice2.py

+        try:
+            file_ext = os.path.splitext(voice_file)[1].lower()
+            if file_ext in [".amr", ".mp3"]:
+                temp_wav_file = f"temp_audio_{uuid.uuid4().hex}.wav" 
+                voice_file = self.convert_audio_to_wav(voice_file, temp_wav_file)
+                if not voice_file:
+                    logger.error("音频转换失败")
+                    return Reply(ReplyType.ERROR, "音频转换失败")
+            elif file_ext != ".wav":
+                logger.error("不支持的音频格式，仅支持 AMR、MP3 和 WAV")
+                return Reply(ReplyType.ERROR, "不支持的音频格式，仅支持 AMR、MP3 和 WAV")
+
+            with open(voice_file, "rb") as audio_file:
+                audio_content = audio_file.read()
+
+            # Configure the audio and recognition settings (Mandarin Chinese)
+            audio = speech.RecognitionAudio(content=audio_content)
+            config = speech.RecognitionConfig(
+                encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
+                sample_rate_hertz=16000,
+                language_code="cmn-CN", 
+            )
+
+            # Run speech recognition
+            response = self.speech_client.recognize(config=config, audio=audio)
+
+            # Extract the transcription result
+            transcript = ""
+            for result in response.results:
+                transcript += result.alternatives[0].transcript + " "
+
+            transcript = transcript.strip()
+            if not transcript:
+                logger.error("语音识别失败：无法理解音频内容")
+                return Reply(ReplyType.ERROR, "抱歉，我听不懂")
+
+            logger.info(f"[Google] voiceToText text={transcript} voice file name={voice_file}")
+            reply = Reply(ReplyType.TEXT, transcript)
+
+            # Clean up the temporary WAV file
+            if file_ext in [".amr", ".mp3"] and os.path.exists(voice_file):
+                os.remove(voice_file)
+
+            return reply
+


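Warning
⚠️ Temporary WAV cleanup is unreliable: exception paths and early returns can leak temporary files

Currently the temporary wav is cleaned up only at the end of the success path, based on file_ext. If recognize raises or the method returns early (for example when transcript is empty), the temporary file is left behind. In addition, voice_file is overwritten with the temporary wav path in the amr/mp3 case, and the cleanup logic then relies on file_ext, so other conversion scenarios are easy to miss.

Suggestion: use try/finally to clean up temporary files in one place, and keep the temporary file path in a separate variable, temp_wav_path, so the original voice_file is not overwritten.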
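
The try/finally pattern suggested above could look roughly like the following sketch. It is not the PR's code: transcribe_with_cleanup is a hypothetical name, and convert(src, dst) and recognize(path) are hypothetical callables standing in for convert_audio_to_wav and the Speech-to-Text call, so the cleanup logic can be shown in isolation.

```python
import os
import tempfile


def transcribe_with_cleanup(voice_file, convert, recognize):
    """Run recognition and always remove the temporary WAV, if one was made.

    `convert(src, dst)` and `recognize(path)` are hypothetical stand-ins for
    the PR's conversion helper and the speech recognition call.
    """
    temp_wav_path = None  # separate variable; voice_file is never overwritten
    try:
        source = voice_file
        if os.path.splitext(voice_file)[1].lower() in (".amr", ".mp3"):
            fd, temp_wav_path = tempfile.mkstemp(suffix=".wav")
            os.close(fd)  # only the path is needed; convert() writes the file
            convert(voice_file, temp_wav_path)
            source = temp_wav_path
        return recognize(source)
    finally:
        # Runs on success, on early return, and when an exception propagates.
        if temp_wav_path and os.path.exists(temp_wav_path):
            os.remove(temp_wav_path)
```

Because the cleanup depends on temp_wav_path rather than on file_ext, any future conversion branch that sets this variable is covered without further changes.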

MonkeyCode-AI · 2026-02-06T09:46:51Z

voice/google/note.txt

@@ -0,0 +1 @@
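+You need to create an authorized project in the Google Cloud console, assign IAM roles and permissions, download your own key file, name the key file google-credentials.json, and place it in this directory.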


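Tip
💡 Missing security guidance: placing the key file in the code directory risks leaking it

note.txt instructs users to place the key file in this directory under a fixed file name, which makes it easy to commit the file to the repository or bake it into an image by mistake, leaking the credentials.

Suggestion: add security guidance: provide credentials through an environment variable, Secret Manager, or a mounted file, and make sure the file is ignored by .gitignore; ADC or workload identity is recommended.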
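
To illustrate supplying the key through the environment or a mounted path instead of a fixed file in the source tree, here is a minimal sketch. The helper name resolve_key_file is hypothetical and not part of the PR; GOOGLE_APPLICATION_CREDENTIALS is the environment variable already used in the diff. Returning None is meant to signal that the caller should construct the clients without explicit credentials and rely on ADC.

```python
import os


def resolve_key_file(explicit_path=None, environ=os.environ):
    """Return a service-account key file path, or None to fall back to ADC.

    Hypothetical helper: an explicit path wins over the environment variable,
    and a configured but missing file fails loudly instead of silently.
    """
    path = explicit_path or environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return None  # nothing configured: let the client libraries use ADC
    if not os.path.isfile(path):
        raise FileNotFoundError(f"credentials file not found: {path}")
    return path
```

Failing early on a missing file is a design choice in this sketch; it surfaces a bad mount or typo at startup rather than at the first API call.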

thzjy added 2 commits May 28, 2025 20:30

add method to access Google Cloud voice API with credential path

fa8acec

add method to access Google Cloud voice API with credential path

bba6f97

MonkeyCode-AI reviewed Feb 6, 2026

thzjy wants to merge 2 commits into zhayujie:master from thzjy:add_method/google-cloud-voice
