Add user-configurable MCP startup timeouts#6399
Add user-configurable MCP startup timeouts#6399calvinmclean wants to merge 26 commits intoobot-platform:mainfrom
Conversation
905b5db to
8f140d2
Compare
| } | ||
|
|
||
| func toolsForServer(ctx context.Context, mcpSessionManager *mcp.SessionManager, server v1.MCPServer, serverConfig mcp.ServerConfig, allowedTools []string) ([]types.MCPServerTool, error) { | ||
| ctx, cancel := context.WithTimeout(ctx, time.Minute) |
There was a problem hiding this comment.
timeout is removed because this calls mcpSessionManager.ListTools -> sm.clientForMCPServer -> sm.clientForMCPServerWithClientScope -> sm.clientForServerWithScope -> sm.clientForServerWithOptions -> sm.ensureDeployment -> sm.backend.ensureServerDeployment which is finally the Docker/K8s implementation which handles timeouts
|
|
||
| func ensureServerReady(ctx context.Context, url string, server ServerConfig) error { | ||
| // Ensure we can actually hit the service URL. | ||
| ctx, cancel := context.WithTimeout(ctx, time.Minute) |
There was a problem hiding this comment.
timeout removed because this is called by Docker/K8s implementation with a context timeout already
00c83c0 to
8ed2ce6
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
8ed2ce6 to
d6b052f
Compare
db8fe05 to
0a31e71
Compare
There was a problem hiding this comment.
This PR fixes bugs and improves code correctness across MCP configuration, pod name resolution, and wait logic.
pkg/mcp/kubernetes.go: Removed redundantctx.Done()select block; context cancellation handled byclient.Listpkg/mcp/types.go: Fixedelse iflogic to preventMaxMCPServerStartupTimeoutfrom validating against default (zero-value) startup timeoutspkg/wait/waitfor.go: Replaced nestedfor/selectwithfor rangeloop; moved unreachable error return to correct post-loop positionpkg/mcp/kubernetes_test.go: IntroducedfakeWithWatchto satisfy Watch interface in tests; updated goroutine-based event emission and aligned error message assertion with new retry timeout format
- This simplifies rather than checking and converting to Duration every time it is used
- This was originally removed to allow `ensureDeployment` to control its own timeout - Now I added timeouts to paths from toolsForServer that do not share the timeout with ensureDeployment
1556684 to
953674e
Compare
There was a problem hiding this comment.
Introduces a configurable startupTimeoutSeconds field (default 60s, max 600s) for MCP server startup across the full stack, from API types to UI forms, with proper validation and context propagation.
- Adds
StartupTimeoutSecondstoMCPServerManifest,MCPServerCatalogEntryManifest, andSystemMCPServerManifestAPI types and OpenAPI schemas - Introduces
StartupTimeout(time.Duration) inServerConfigwith default/max enforcement intypes.go - Propagates configurable timeout into Docker and Kubernetes backends, replacing hardcoded 1-minute values
- Fixes context propagation bugs in
backend.go(HTTP requests, SSE loop) and removes redundant fixed timeouts inmcp.go,client.go, andtools.go - Adds
validateStartupTimeoutinmcpvalidators.gowith integration into all three manifest validators - Adds UI input fields for
startupTimeoutSecondsin NPX, UVX, and containerized runtime forms with validation - Adds unit tests for pod status analysis, deployment watch timeout, and
startupTimeoutSecondsvalidation boundaries
Summary
Addresses #6185
Testing
In order to test this, I created a simple npx MCP server with a configurable sleep before serving:
This screenshot shows how the configuration can be set. In this case, the server will have a slow startup which would have previously timed-out. Now it will still succeed due to the 90s timeout.
