Add configurable LLM timeout via environment variables #1281

tomaszzmuda wants to merge 2 commits into khoj-ai:master
Conversation
```diff
-await websocket.send_text(chunks)
-await websocket.send_text(ChatEvent.END_EVENT.value)
+try:
+    await websocket.send_text(chunks)
+    await websocket.send_text(ChatEvent.END_EVENT.value)
+except RuntimeError:
+    pass  # WebSocket already closed
```
Is this try/catch in delayed_flush func really required? What happens without it?
My app froze and became unresponsive. I've been testing this the whole day, and this change fixes it. The frontend wasn't working, nor was GET on /, so my Kubernetes killed the app. It made a really bad first impression of the app, because I'm using a rather slow offline model.
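The failure mode can be sketched with a stand-in for Starlette's WebSocket, which raises `RuntimeError` if `send_text` is called after the close handshake. The class and function names here are illustrative, not Khoj's actual code:

```python
import asyncio


class StubWebSocket:
    """Illustrative stand-in for Starlette's WebSocket."""

    def __init__(self):
        self.closed = False
        self.sent = []

    async def send_text(self, text: str) -> None:
        # Starlette raises RuntimeError when sending after close.
        if self.closed:
            raise RuntimeError('Cannot call "send" once a close message has been sent.')
        self.sent.append(text)


async def flush(ws, chunks: str, end_event: str) -> None:
    # Without the try/except, the RuntimeError propagates out of the
    # background flush task instead of being swallowed.
    try:
        await ws.send_text(chunks)
        await ws.send_text(end_event)
    except RuntimeError:
        pass  # WebSocket already closed; nothing left to flush


ws = StubWebSocket()
ws.closed = True  # simulate the client disconnecting mid-response
asyncio.run(flush(ws, "partial response", "end"))  # completes without raising
print(ws.sent)  # → []
```

When the socket is still open, both messages go through; when it has closed, the flush becomes a no-op instead of an unhandled exception.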
```python
        httpx.Timeout configured with appropriate values
    """
    connect_timeout = float(os.getenv("KHOJ_LLM_TIMEOUT_CONNECT", "30"))
    read_timeout = float(os.getenv("KHOJ_LLM_TIMEOUT_READ", "60"))
```
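Filled out into a runnable sketch (with a `SimpleNamespace` standing in for `httpx.Timeout` so the snippet has no dependencies; the write/pool values are assumptions, not taken from the PR):

```python
import os
from types import SimpleNamespace


def Timeout(connect, read, write, pool):
    # Stand-in for httpx.Timeout; the real helper returns an httpx.Timeout.
    return SimpleNamespace(connect=connect, read=read, write=write, pool=pool)


def get_llm_timeout():
    """Read LLM timeouts from the environment, falling back to the defaults."""
    connect_timeout = float(os.getenv("KHOJ_LLM_TIMEOUT_CONNECT", "30"))
    read_timeout = float(os.getenv("KHOJ_LLM_TIMEOUT_READ", "60"))
    # The write/pool values below are illustrative assumptions.
    return Timeout(connect=connect_timeout, read=read_timeout, write=60.0, pool=5.0)


os.environ["KHOJ_LLM_TIMEOUT_READ"] = "300"  # e.g. for a slow offline model
timeout = get_llm_timeout()
print(timeout.connect, timeout.read)  # → 30.0 300.0
```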
Should we still set a longer default llm read timeout when using local ai models?
```diff
-    read_timeout = float(os.getenv("KHOJ_LLM_TIMEOUT_READ", "60"))
+    default_read_timeout = 300 if is_local_api(api_base_url) else 60
+    read_timeout = float(os.getenv("KHOJ_LLM_TIMEOUT_READ", default_read_timeout))
```
Note: The function description comment above and the setup.mdx docs will need updating to reflect the new defaults (i.e. read timeout will be 300 for local AI, 60 otherwise).
IMO having a method like "is_local_api" is not a good approach. It's really hard to define what is "local" and what isn't. In my case I run the LLM in a separate Docker container, and that code only checks for localhost or 127.0.0.1. On the other end, I have some models on Azure AI Foundry with really small rate limits that constantly finish their responses after that 60-second mark.
This should be defined as a global variable (and I know that is kind of a breaking change), but it would be much more convenient for new users. A possible next step would be to add a timeout setting in the UI when defining a model.
Of course it's up to you, and I'll be fine whether the timeout is set to 300 or not. It's just really confusing that the code has to check for a specific domain in order for the application to behave differently.
Problem
The LLM timeout is currently hardcoded with an arbitrary distinction between "local" (300s) and "remote" (60s) APIs based on whether the API base URL is localhost/127.0.0.1. This causes issues in real-world deployments.
Solution
Add two environment variables to configure LLM timeouts, removing the localhost distinction entirely:
| Variable | Default |
| --- | --- |
| `KHOJ_LLM_TIMEOUT_READ` | 60 |
| `KHOJ_LLM_TIMEOUT_CONNECT` | 30 |

Implementation
Added a `get_llm_timeout()` helper function in `src/khoj/processor/conversation/openai/utils.py` that reads these environment variables and returns an `httpx.Timeout` configuration. All 5 timeout usages in the file now call this helper.

Backward Compatibility
To keep the previous 300s behavior for local models, set `KHOJ_LLM_TIMEOUT_READ=300`.

---EDIT---
After a timeout happened, the application tried to inform the frontend, but if the websocket was already closed it hung, so I added a small fix for that.