feat: add OpenShift single-namespace deployment for separate pods by szedan-rh · Pull Request #490 · vllm-project/semantic-router

szedan-rh · 2025-10-20T15:21:41Z

Summary

This PR adds a complete single-namespace OpenShift deployment configuration for vLLM Semantic Router with a separate pods architecture.

Key Features

✅ Single namespace deployment (vllm-semantic-router)
✅ Separate pods architecture: router+envoy, model-a, model-b
✅ External HTTPS access via OpenShift Route
✅ Automated deployment script
✅ Complete documentation

Components Added

BuildConfigs: for router and llm-katan model
ConfigMaps: Envoy and Router configuration
Deployments: All components (router, model-a, model-b)
PersistentVolumeClaims: Model storage
Services: Inter-pod communication
Route: External HTTPS access with TLS edge termination
Scripts: Automated deployment
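
For readers unfamiliar with edge termination, a Route of this kind might look roughly like the sketch below. It is not the `route.yaml` from this PR: the Route name `semantic-router-route` is hypothetical, the `envoy-proxy` Service name is taken from the Copilot file summary in this thread, and a real manifest would likely also set `spec.port.targetPort` if the Service exposes more than one port.

```shell
#!/usr/bin/env bash
# Print a minimal edge-terminated Route manifest for a given namespace/Service.
# The Route name is a hypothetical placeholder; the PR's route.yaml may differ.
route_manifest() {
  local namespace="$1" service="$2"
  cat <<EOF
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: semantic-router-route   # hypothetical name
  namespace: ${namespace}
spec:
  to:
    kind: Service
    name: ${service}
  tls:
    termination: edge                        # TLS ends at the OpenShift router
    insecureEdgeTerminationPolicy: Redirect  # redirect plain HTTP to HTTPS
EOF
}
```

With a logged-in `oc` session, the output could be applied with `route_manifest vllm-semantic-router envoy-proxy | oc apply -f -`.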

Testing

Tested and verified working deployment with:

✅ Successful model routing (math queries → Model-A, business queries → Model-B)
✅ External HTTPS access via Route
✅ All pods running and healthy (3/3)
✅ Classification and routing working correctly
✅ Full test results documented in README

Files Changed

16 files total
14 new files in deploy/openshift/single-namespace/
1 new file: .dockerignore
1 modified file: .github/workflows/docker-publish.yml

Documentation

Complete deployment guide included in deploy/openshift/single-namespace/README.md

Signed-off-by: szedan <szedan@redhat.com>

This PR adds a complete single-namespace OpenShift deployment configuration for vLLM Semantic Router with separate pods architecture.

Key Features:
- Single namespace deployment (vllm-semantic-router)
- Separate pods: router+envoy, model-a, model-b
- External HTTPS access via OpenShift Route
- Automated deployment script
- Complete documentation

Components:
- BuildConfigs for router and llm-katan model
- ConfigMaps for Envoy and Router configuration
- Deployments for all components
- PersistentVolumeClaims for model storage
- Services for inter-pod communication
- Route for external HTTPS access with TLS termination
- Automated deployment script

Tested and verified working deployment with:
- Successful model routing (math → Model-A, business → Model-B)
- External HTTPS access via Route
- All pods running and healthy
- Classification and routing working correctly

Files Added:
- deploy/openshift/single-namespace/ (14 files)
- .dockerignore
- Updated .github/workflows/docker-publish.yml

Signed-off-by: szedan <senan.zedan@gmail.com>
Signed-off-by: szedan <szedan@redhat.com>

netlify · 2025-10-20T15:21:47Z

✅ Deploy Preview for vllm-semantic-router ready!

Name	Link
🔨 Latest commit	`4cee676`
🔍 Latest deploy log	https://app.netlify.com/projects/vllm-semantic-router/deploys/68fe087459daba0008b12189
😎 Deploy Preview	https://deploy-preview-490--vllm-semantic-router.netlify.app

github-actions · 2025-10-20T15:21:53Z

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 `Root Directory`

Owners: @rootfs, @Xunzhuo
Files changed:

.dockerignore
DEPLOYMENT_GUIDE.md

📁 `deploy`

Owners: @rootfs, @Xunzhuo
Files changed:

deploy/openshift/single-namespace/ADD-MODEL-C-GUIDE.md
deploy/openshift/single-namespace/buildconfig-llm-katan.yaml
deploy/openshift/single-namespace/buildconfig-router.yaml
deploy/openshift/single-namespace/configmap-envoy.yaml
deploy/openshift/single-namespace/configmap-router.yaml
deploy/openshift/single-namespace/deployment-model-a.yaml
deploy/openshift/single-namespace/deployment-model-b.yaml
deploy/openshift/single-namespace/deployment-router.yaml
deploy/openshift/single-namespace/imagestreams.yaml
deploy/openshift/single-namespace/namespace.yaml
deploy/openshift/single-namespace/pvcs.yaml
deploy/openshift/single-namespace/route.yaml
deploy/openshift/single-namespace/scripts/check-gpu-status.sh
deploy/openshift/single-namespace/scripts/deploy-all.sh
deploy/openshift/single-namespace/scripts/test-queries-with-trace.sh
deploy/openshift/single-namespace/scripts/uninstall.sh
deploy/openshift/single-namespace/scripts/validate-deployment.sh
deploy/openshift/single-namespace/services.yaml

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

rootfs · 2025-10-20T15:27:57Z

/assign @yossiovadia

Copilot

Pull Request Overview

This PR introduces a comprehensive single-namespace OpenShift deployment configuration for vLLM Semantic Router with a separate pods architecture. The deployment uses three pods within one namespace: a router+envoy pod for classification and routing, and two separate model pods (model-a and model-b) running on GPUs. The configuration includes automated build processes, persistent storage, HTTPS external access, and complete documentation.

Key changes:

Complete OpenShift deployment manifests for single-namespace architecture with separate pods
Automated deployment script with build orchestration and monitoring
Comprehensive documentation with troubleshooting guides and testing instructions

Reviewed Changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.


File	Description
deploy/openshift/single-namespace/services.yaml	Defines ClusterIP services for envoy-proxy, semantic-router, model-a, and model-b
deploy/openshift/single-namespace/scripts/deploy-all.sh	Automated deployment script handling namespace creation, builds, and monitoring
deploy/openshift/single-namespace/route.yaml	OpenShift Route for external HTTPS access with TLS edge termination
deploy/openshift/single-namespace/pvcs.yaml	Persistent volume claims for router models/cache and model caches
deploy/openshift/single-namespace/namespace.yaml	Namespace definition for vllm-semantic-router
deploy/openshift/single-namespace/imagestreams.yaml	ImageStreams for semantic-router and llm-katan builds
deploy/openshift/single-namespace/deployment-router.yaml	Deployment for router pod with init container for model downloads
deploy/openshift/single-namespace/deployment-model-b.yaml	Deployment for model-b pod with GPU requirements
deploy/openshift/single-namespace/deployment-model-a.yaml	Deployment for model-a pod with GPU requirements
deploy/openshift/single-namespace/configmap-router.yaml	Router configuration with hardcoded ClusterIP addresses
deploy/openshift/single-namespace/configmap-envoy.yaml	Envoy proxy configuration for routing and external processing
deploy/openshift/single-namespace/buildconfig-router.yaml	BuildConfig for semantic-router image from binary source
deploy/openshift/single-namespace/buildconfig-llm-katan.yaml	BuildConfig for llm-katan image from GitHub repository
deploy/openshift/single-namespace/README.md	Comprehensive deployment guide with troubleshooting and testing instructions
.dockerignore	Docker build exclusions for models, cache, and build artifacts


deploy/openshift/single-namespace/configmap-router.yaml

deploy/openshift/single-namespace/scripts/deploy-all.sh

deploy/openshift/single-namespace/deployment-model-a.yaml

deploy/openshift/single-namespace/deployment-model-b.yaml

deploy/openshift/single-namespace/deployment-router.yaml

deploy/openshift/single-namespace/README.md

rootfs · 2025-10-20T15:32:00Z

@szedan-rh thanks for the contribution. Can you run `make markdown-lint-fix`?

szedan-rh · 2025-10-20T15:34:07Z

@rootfs Indeed.
I didn't mean to send the PR for review yet; it was opened by mistake. Apologies.

…ft deployment

This commit enhances the OpenShift single-namespace deployment with two major improvements:

1. Automatic IP Configuration:
   - Modified deploy-all.sh to automatically detect and configure ClusterIP addresses
   - Services are created first, then ClusterIPs are retrieved dynamically
   - Router ConfigMap is automatically updated with correct model service IPs
   - Deployment now works on any OpenShift cluster without manual IP configuration
   - Tested and verified with different IP allocations

2. Comprehensive Uninstall Script:
   - Added uninstall.sh for complete or partial removal of deployment
   - Interactive confirmations for destructive operations (PVCs, namespace)
   - Proper deletion order: Route → Deployments → Services → ConfigMaps → Builds → ImageStreams → PVCs → Namespace
   - Option to preserve data (PVCs) or namespace
   - Cleanup of temporary files
   - Tested successfully with full uninstall

3. Build Configuration Fix:
   - Fixed buildconfig-llm-katan.yaml to use inline Dockerfile
   - Changed from Git source (private repo) to official vLLM package
   - Uses vllm==0.6.3.post1 with PyTorch 2.1.0 and CUDA 12.1 support
   - OpenShift compatible (non-root user 1001)

4. Documentation Updates:
   - Moved deployment guide from README.md to DEPLOYMENT_GUIDE.md
   - Added comprehensive automatic IP configuration documentation
   - Added uninstall documentation with examples
   - Documented manual and automated deployment options

Testing Results:
- ✅ Deployment tested with automatic IP detection (IPs: 172.30.128.210, 172.30.17.224)
- ✅ Uninstall script tested with complete removal
- ✅ All pre-commit checks passing (11/11)
- ✅ Build configurations verified working

This makes the deployment production-ready and portable across different OpenShift clusters.

Signed-off-by: szedan <szedan@redhat.com>
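
The automatic IP configuration described in this commit (create the Services, read back their ClusterIPs, then write them into the router configuration) could be sketched as below. This is an illustration under stated assumptions, not the code in `deploy-all.sh`: the `MODEL_A_IP` and `MODEL_B_IP` placeholder tokens and both function names are hypothetical, while `oc get service ... -o jsonpath='{.spec.clusterIP}'` is the standard way to read a Service's ClusterIP.

```shell
#!/usr/bin/env bash
# Look up the ClusterIP that the cluster assigned to a Service.
# Requires a logged-in oc session; not called when this file is sourced.
get_cluster_ip() {
  local service="$1" namespace="$2"
  oc get service "$service" -n "$namespace" -o jsonpath='{.spec.clusterIP}'
}

# Read a config template on stdin and substitute the two model addresses.
# MODEL_A_IP / MODEL_B_IP are hypothetical placeholder tokens.
render_router_config() {
  local model_a_ip="$1" model_b_ip="$2"
  sed -e "s/MODEL_A_IP/${model_a_ip}/g" \
      -e "s/MODEL_B_IP/${model_b_ip}/g"
}

# Example flow (assumes services.yaml was applied first):
#   a="$(get_cluster_ip model-a vllm-semantic-router)"
#   b="$(get_cluster_ip model-b vllm-semantic-router)"
#   render_router_config "$a" "$b" < configmap-router.yaml | oc apply -f -
```

Keeping the substitution separate from the lookup makes it possible to check the rendered ConfigMap before anything is applied to the cluster.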

…ion tools

This commit improves the OpenShift single-namespace deployment with several key enhancements for better performance, usability, and maintainability.

Changes:
- Switch from heavy vLLM image to lightweight llm-katan image (~10GB → ~2-3GB)
- Remove semantic-router build (now uses pre-built ghcr.io image)
- Add GPU node status check and automated restart guidance
- Consolidate uninstall script confirmation prompts (3 → 1)
- Add route creation to deployment script for immediate external access

New validation tools:
- check-gpu-status.sh: Automated GPU node availability checking
- validate-deployment.sh: Comprehensive endpoint and flow validation
- test-queries-with-trace.sh: Detailed query routing visualization

Documentation:
- ADD-MODEL-C-GUIDE.md: Complete guide for adding additional models

These changes reduce deployment time from 20-30 minutes to 5-10 minutes and provide comprehensive testing and troubleshooting capabilities.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: szedan <szedan@redhat.com>
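
A GPU node availability check of the kind mentioned in this commit might work along the following lines. This is only a sketch and does not reproduce `check-gpu-status.sh`: it assumes GPUs are advertised as the `nvidia.com/gpu` allocatable resource (as with the NVIDIA device plugin), and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# List each node with its allocatable NVIDIA GPU count ("<none>" if absent).
# Requires a logged-in oc session; not called when this file is sourced.
list_node_gpus() {
  oc get nodes --no-headers \
    -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Count stdin rows whose second column is a positive integer,
# i.e. nodes that currently advertise at least one allocatable GPU.
count_gpu_nodes() {
  awk '$2 ~ /^[0-9]+$/ && $2 > 0 { n++ } END { print n + 0 }'
}

# Example flow:
#   if [ "$(list_node_gpus | count_gpu_nodes)" -lt 2 ]; then
#     echo "fewer than two GPU nodes available for model-a and model-b" >&2
#   fi
```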


szedan-rh · 2025-10-26T11:40:44Z

I will create a single script that does the same things for OpenShift, based on Yossi O.'s script.

szedan-rh requested review from Xunzhuo and rootfs as code owners October 20, 2025 15:21

github-actions bot assigned rootfs and Xunzhuo Oct 20, 2025

github-actions bot assigned yossiovadia Oct 20, 2025

rootfs requested review from Copilot and yossiovadia October 20, 2025 15:28

Copilot AI reviewed Oct 20, 2025


szedan-rh and others added 2 commits October 20, 2025 19:35

szedan-rh force-pushed the feature/openshift-single-namespace-deployment branch from 775429f to 3cf639b Compare October 21, 2025 05:24

szedan-rh added 2 commits October 21, 2025 08:29

Merge branch 'main' into feature/openshift-single-namespace-deployment

83ffb3e

Merge branch 'vllm-project:main' into feature/openshift-single-namesp…

4cee676

…ace-deployment

szedan-rh closed this Oct 26, 2025

szedan-rh wants to merge 5 commits into vllm-project:main from szedan-rh:feature/openshift-single-namespace-deployment


5 participants
