Description
Problem
When an OpenAI-compatible inference server (e.g., Exo, vLLM, SGLang) sends a mid-stream error event during SSE streaming, Goose crashes with:
Stream decode error: error decoding response body
This happens because response_to_streaming_message in providers/formats/openai.rs deserializes every data: line as a StreamingChunk (serde_json::from_str). When the server sends an error object like:
data: {"error": {"message": "Internal server error", "type": "InternalServerError", "code": 500}}
...deserialization fails because there's no choices field, and the error propagates as ProviderError::RequestFailed("Stream decode error: ..."), killing the session.
Expected behavior
Goose should check for an error key in parsed SSE data before attempting to deserialize as a StreamingChunk. This is the same pattern used by the official OpenAI Python client (openai-python/_streaming.py), which checks data.get("error") and raises APIError.
Context
Sending {"error": {...}} as an SSE data: event is the de facto standard for mid-stream errors across OpenAI-compatible servers:
- OpenAI: Their Python client explicitly handles this format
- vLLM: Sends {"object":"error","message":"...","code":N} as a data event
- SGLang: Sends {"error":{"message":"...","type":"...","code":N}} as a data event
- Exo: Sends {"error":{"message":"...","type":"InternalServerError","code":500}} as a data event
Suggested fix
In response_to_streaming_message (around line 651 of providers/formats/openai.rs), before deserializing as StreamingChunk, parse the JSON and check for an error key. If found, return a ProviderError with the error message instead of a deserialization error.