Add user-configurable MCP startup timeouts #6399

Open
calvinmclean wants to merge 26 commits into obot-platform:main from calvinmclean:feature/mcp-server-timeouts
Contributor

@calvinmclean calvinmclean commented Apr 22, 2026

Summary

Addresses #6185

  • Add a configurable MCP server startup timeout to catalog entry, MCP server, and system MCP server manifests
  • Surface the timeout in the NPX, UVX, and containerized runtime configuration UI
  • Apply the configured timeout in Docker and Kubernetes MCP backend startup/readiness paths
  • Separate image-pull waiting from container startup waiting so slow image pulls do not consume the user-configured startup timeout
    • The Kubernetes backend also exits early when it detects unrecoverable failures or states

Testing

In order to test this, I created a simple npx MCP server with a configurable sleep before serving:

# sleep 30s before starting (default is 60)
npx github:calvinmclean/slow-npx-mcp 30

This screenshot shows how the timeout can be configured. In this case, the server starts slowly enough that it would previously have timed out. With the 90s timeout configured, it now starts successfully.
[Image: Screenshot 2026-04-24 at 14 13 36]

@calvinmclean calvinmclean force-pushed the feature/mcp-server-timeouts branch from 905b5db to 8f140d2 Compare April 22, 2026 22:16
Comment thread pkg/api/handlers/mcp.go
}

func toolsForServer(ctx context.Context, mcpSessionManager *mcp.SessionManager, server v1.MCPServer, serverConfig mcp.ServerConfig, allowedTools []string) ([]types.MCPServerTool, error) {
ctx, cancel := context.WithTimeout(ctx, time.Minute)
Contributor Author

The timeout is removed because this calls mcpSessionManager.ListTools -> sm.clientForMCPServer -> sm.clientForMCPServerWithClientScope -> sm.clientForServerWithScope -> sm.clientForServerWithOptions -> sm.ensureDeployment -> sm.backend.ensureServerDeployment, which is ultimately the Docker/K8s implementation that handles timeouts.

Comment thread pkg/mcp/backend.go

func ensureServerReady(ctx context.Context, url string, server ServerConfig) error {
// Ensure we can actually hit the service URL.
ctx, cancel := context.WithTimeout(ctx, time.Minute)
Contributor Author

The timeout is removed because this function is called by the Docker/K8s implementations, which already pass in a context with a timeout.

@calvinmclean calvinmclean force-pushed the feature/mcp-server-timeouts branch 6 times, most recently from 00c83c0 to 8ed2ce6 Compare April 27, 2026 23:44
@calvinmclean calvinmclean requested a review from Copilot April 27, 2026 23:45
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.


Comment thread pkg/mcp/kubernetes.go Outdated
Comment thread pkg/mcp/docker.go
Comment thread pkg/mcp/backend.go
Comment thread pkg/mcp/kubernetes.go
@calvinmclean calvinmclean force-pushed the feature/mcp-server-timeouts branch from db8fe05 to 0a31e71 Compare April 28, 2026 21:06
@calvinmclean calvinmclean requested a review from Copilot April 28, 2026 21:09

@entelligence-ai-pr-reviews entelligence-ai-pr-reviews Bot left a comment

This PR fixes bugs and improves code correctness across MCP configuration, pod name resolution, and wait logic.

  • pkg/mcp/kubernetes.go: Removed redundant ctx.Done() select block; context cancellation handled by client.List
  • pkg/mcp/types.go: Fixed else if logic to prevent MaxMCPServerStartupTimeout from validating against default (zero-value) startup timeouts
  • pkg/wait/waitfor.go: Replaced nested for/select with for range loop; moved unreachable error return to correct post-loop position
  • pkg/mcp/kubernetes_test.go: Introduced fakeWithWatch to satisfy Watch interface in tests; updated goroutine-based event emission and aligned error message assertion with new retry timeout format
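
The `pkg/wait/waitfor.go` item describes a common Go cleanup. The sketch below shows the general shape only and is not the code from that file: `waitFor` and `errClosed` are hypothetical names. Ranging over the channel replaces a nested `for`/`select`, and the error return sits after the loop, where it is reached once the channel is closed.

```go
package main

import (
	"errors"
	"fmt"
)

// errClosed is returned when the stream ends without a matching event.
var errClosed = errors.New("event stream closed before the condition was met")

// waitFor consumes events until done reports true. The loop ends on its
// own when the channel is closed, so the error return after it is reachable.
func waitFor[T any](events <-chan T, done func(T) bool) (T, error) {
	for ev := range events {
		if done(ev) {
			return ev, nil
		}
	}
	var zero T
	return zero, errClosed
}

func main() {
	events := make(chan string, 2)
	events <- "Pending"
	events <- "Running"
	close(events)

	state, err := waitFor(events, func(s string) bool { return s == "Running" })
	fmt.Println(state, err)
}
```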

Comment thread pkg/mcp/kubernetes_test.go
@calvinmclean calvinmclean force-pushed the feature/mcp-server-timeouts branch from 1556684 to 953674e Compare May 1, 2026 20:28

@entelligence-ai-pr-reviews entelligence-ai-pr-reviews Bot left a comment

Introduces a configurable startupTimeoutSeconds field (default 60s, max 600s) for MCP server startup across the full stack, from API types to UI forms, with proper validation and context propagation.

  • Adds StartupTimeoutSeconds to MCPServerManifest, MCPServerCatalogEntryManifest, and SystemMCPServerManifest API types and OpenAPI schemas
  • Introduces StartupTimeout (time.Duration) in ServerConfig with default/max enforcement in types.go
  • Propagates configurable timeout into Docker and Kubernetes backends, replacing hardcoded 1-minute values
  • Fixes context propagation bugs in backend.go (HTTP requests, SSE loop) and removes redundant fixed timeouts in mcp.go, client.go, and tools.go
  • Adds validateStartupTimeout in mcpvalidators.go with integration into all three manifest validators
  • Adds UI input fields for startupTimeoutSeconds in NPX, UVX, and containerized runtime forms with validation
  • Adds unit tests for pod status analysis, deployment watch timeout, and startupTimeoutSeconds validation boundaries

Comment thread pkg/mcp/loader.go
5 participants