
fix(community): fix real-time streaming buffer bug for DeepInfra LLM Inference Provider#559

Open
Sumit Kumar (Sumitk99) wants to merge 1 commit into langchain-ai:main from Sumitk99:fix/deepinfra-streaming-buffer-bug

Conversation

@Sumitk99

Description:
In DeepInfra (langchain_community/llms/deepinfra.py), the _stream() and
_astream() methods called response.text / await response.text() before
iterating over the SSE stream. This consumed the entire HTTP response body
upfront, so all tokens were buffered in memory and delivered at once. It looked
like streaming on the surface, but every chunk arrived with Δ0.000s after the
previous one (i.e. a bulk response in disguise).
Root cause:

# Buggy — reads the entire body, stream is now exhausted
response_text = response.text
self._handle_body_errors(response_text)
self._handle_status(response.status_code, response.text)
for line in _parse_stream(response.iter_lines()):  # nothing left to iterate
    ...
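The exhaustion is plain iterator behavior, not anything DeepInfra-specific. A stdlib-only sketch, with a generator standing in for the HTTP response (fake_sse_stream is a hypothetical stand-in, not the real client), reproduces it:

```python
def fake_sse_stream():
    # Stands in for response.iter_lines(): yields SSE lines one at a time.
    yield b'data: {"token": "Hello"}'
    yield b'data: {"token": " world"}'
    yield b"data: [DONE]"

stream = fake_sse_stream()

# Analogue of reading response.text first: drains the whole stream up front.
body = b"\n".join(stream)

# The subsequent "streaming" loop now sees nothing at all.
remaining = list(stream)
assert remaining == []  # exhausted, just like the buggy _stream()
```

The full payload ends up in body, and the for-loop over the (now empty) iterator yields zero chunks, which is why the buggy code still "worked" but delivered everything in one burst.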

Fix:

- Pass stream=True to request.post() to enable HTTP-level chunked transfer
- Remove the pre-iteration response.text call; check the HTTP status without reading the body
- Move server-side error detection inside the per-chunk loop

# Fixed — status checked without consuming body; chunks arrive in real-time
response = request.post(url=..., data=..., stream=True)
self._handle_status(response.status_code, "")
for line in _parse_stream(response.iter_lines()):
    if line and "error" in line:
        self._handle_body_errors(line)
    ...
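Moving error detection inside the loop means each SSE line is inspected as it arrives instead of after the body has been drained. A minimal sketch of that per-chunk check (parse_sse_lines is a hypothetical helper, not langchain's _parse_stream):

```python
import json

def parse_sse_lines(lines):
    """Yield decoded SSE payloads, raising as soon as an error chunk appears."""
    for raw in lines:
        line = raw.decode() if isinstance(raw, bytes) else raw
        if not line.startswith("data:"):
            continue  # skip SSE comments / keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        data = json.loads(payload)
        if "error" in data:
            # Per-chunk error detection, mirroring the fix's inner check
            raise RuntimeError(data["error"])
        yield data

tokens = [d["token"] for d in parse_sse_lines([
    b'data: {"token": "Hello"}',
    b'data: {"token": " world"}',
    b"data: [DONE]",
])]
# tokens == ["Hello", " world"]
```

Because errors are raised from inside the generator, a mid-stream failure surfaces immediately rather than being masked until the whole response has been consumed.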

Issue: N/A
Dependencies: None — no new dependencies introduced.

…nd ChatDeepInfra, the code in the main branch starts streaming only after it has consumed all tokens; it looks like streaming but it isn't. Fixed now.
@github-actions github-actions bot added the fix label Feb 26, 2026
