Problem
When using the Custom post-processing provider with DeepSeek's OpenAI-compatible endpoint:
- Base URL: https://api.deepseek.com
- Model: deepseek-v4-flash
post-processing fails with HTTP 400 because Handy sends:
"reasoning_effort": "none"
DeepSeek rejects this field/value:
reasoning_effort: unknown variant `none`, expected one of `high`, `low`, `medium`, `max`, `xhigh`
Context
PR #1221 added reasoning_effort: "none" for Custom providers to avoid thinking-mode latency in local models such as Ollama/Gemma/Qwen. That behavior is useful and should remain for those endpoints.
The issue is that not all OpenAI-compatible APIs accept the same reasoning-control parameters. DeepSeek uses a different thinking object for V4 thinking mode, and rejects reasoning_effort: "none".
DeepSeek docs: https://api-docs.deepseek.com/guides/thinking_mode
Expected behavior
Custom provider should still work with DeepSeek-compatible endpoints.
Ideally:
- Preserve reasoning_effort: "none" for local Custom endpoints where it works.
- Avoid sending reasoning_effort: "none" to DeepSeek.
- Optionally send thinking: { "type": "disabled" } to DeepSeek to avoid extra latency during transcript cleanup.
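One possible shape for the fix, sketched in Python. This is not Handy's actual code: the helper name, the hostname allow-list, and detection-by-hostname are illustrative assumptions only.

```python
from urllib.parse import urlparse

# Hypothetical allow-list; a real fix might use a user-facing setting instead.
DEEPSEEK_HOSTS = {"api.deepseek.com"}

def build_request_body(base_url: str, model: str, messages: list) -> dict:
    """Shape the reasoning-control field per endpoint (illustrative sketch)."""
    host = urlparse(base_url).hostname or ""
    body = {"model": model, "messages": messages}
    if host in DEEPSEEK_HOSTS:
        # DeepSeek V4 controls thinking mode via a `thinking` object and
        # rejects `reasoning_effort: "none"` (see DeepSeek docs linked above).
        body["thinking"] = {"type": "disabled"}
    else:
        # Existing behavior from PR #1221: skip thinking-mode latency on
        # local OpenAI-compatible endpoints (Ollama/Gemma/Qwen).
        body["reasoning_effort"] = "none"
    return body
```

Matching on hostname is fragile for self-hosted DeepSeek-compatible endpoints, so an explicit per-provider toggle in the Custom provider settings may be the safer design.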
Actual behavior
Post-processing fails and Handy falls back to the original transcription.
Logs
LLM post-processing failed for provider 'custom': API request failed with status 400 Bad Request:
{"error":{"message":"Failed to deserialize the JSON body into the target type: reasoning_effort: unknown variant `none` ..."}}
Notes
This is related to #1221, but not a request to remove the Custom reasoning optimization globally. The local-model latency improvement should be preserved.