Skip to content

feat: add OpenShift single-namespace deployment for separate pods#490

Closed
szedan-rh wants to merge 5 commits intovllm-project:mainfrom
szedan-rh:feature/openshift-single-namespace-deployment
Closed

feat: add OpenShift single-namespace deployment for separate pods#490
szedan-rh wants to merge 5 commits intovllm-project:mainfrom
szedan-rh:feature/openshift-single-namespace-deployment

Conversation

@szedan-rh
Copy link
Copy Markdown
Collaborator

Summary

This PR adds a complete single-namespace OpenShift deployment configuration for vLLM Semantic Router with a separate pods architecture.

Key Features

  • ✅ Single namespace deployment (vllm-semantic-router)
  • ✅ Separate pods architecture: router+envoy, model-a, model-b
  • ✅ External HTTPS access via OpenShift Route
  • ✅ Automated deployment script
  • ✅ Complete documentation

Components Added

  • BuildConfigs: for router and llm-katan model
  • ConfigMaps: Envoy and Router configuration
  • Deployments: All components (router, model-a, model-b)
  • PersistentVolumeClaims: Model storage
  • Services: Inter-pod communication
  • Route: External HTTPS access with TLS edge termination
  • Scripts: Automated deployment

Testing

Tested and verified working deployment with:

  • ✅ Successful model routing (math queries → Model-A, business queries → Model-B)
  • ✅ External HTTPS access via Route
  • ✅ All pods running and healthy (3/3)
  • ✅ Classification and routing working correctly
  • ✅ Full test results documented in README

Files Changed

  • 16 files total
  • 14 new files in deploy/openshift/single-namespace/
  • 1 new file: .dockerignore
  • 1 modified file: .github/workflows/docker-publish.yml

Documentation

Complete deployment guide included in deploy/openshift/single-namespace/README.md

Signed-off-by: szedan szedan@redhat.com

This PR adds a complete single-namespace OpenShift deployment configuration
for vLLM Semantic Router with separate pods architecture.

Key Features:
- Single namespace deployment (vllm-semantic-router)
- Separate pods: router+envoy, model-a, model-b
- External HTTPS access via OpenShift Route
- Automated deployment script
- Complete documentation

Components:
- BuildConfigs for router and llm-katan model
- ConfigMaps for Envoy and Router configuration
- Deployments for all components
- PersistentVolumeClaims for model storage
- Services for inter-pod communication
- Route for external HTTPS access with TLS termination
- Automated deployment script

Tested and verified working deployment with:
- Successful model routing (math → Model-A, business → Model-B)
- External HTTPS access via Route
- All pods running and healthy
- Classification and routing working correctly

Files Added:
- deploy/openshift/single-namespace/ (14 files)
- .dockerignore
- Updated .github/workflows/docker-publish.yml

Signed-off-by: szedan <senan.zedan@gmail.com>
Signed-off-by: szedan <szedan@redhat.com>
@netlify
Copy link
Copy Markdown

netlify bot commented Oct 20, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 4cee676
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/68fe087459daba0008b12189
😎 Deploy Preview https://deploy-preview-490--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Oct 20, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • .dockerignore
  • DEPLOYMENT_GUIDE.md

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/openshift/single-namespace/ADD-MODEL-C-GUIDE.md
  • deploy/openshift/single-namespace/buildconfig-llm-katan.yaml
  • deploy/openshift/single-namespace/buildconfig-router.yaml
  • deploy/openshift/single-namespace/configmap-envoy.yaml
  • deploy/openshift/single-namespace/configmap-router.yaml
  • deploy/openshift/single-namespace/deployment-model-a.yaml
  • deploy/openshift/single-namespace/deployment-model-b.yaml
  • deploy/openshift/single-namespace/deployment-router.yaml
  • deploy/openshift/single-namespace/imagestreams.yaml
  • deploy/openshift/single-namespace/namespace.yaml
  • deploy/openshift/single-namespace/pvcs.yaml
  • deploy/openshift/single-namespace/route.yaml
  • deploy/openshift/single-namespace/scripts/check-gpu-status.sh
  • deploy/openshift/single-namespace/scripts/deploy-all.sh
  • deploy/openshift/single-namespace/scripts/test-queries-with-trace.sh
  • deploy/openshift/single-namespace/scripts/uninstall.sh
  • deploy/openshift/single-namespace/scripts/validate-deployment.sh
  • deploy/openshift/single-namespace/services.yaml

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@rootfs
Copy link
Copy Markdown
Collaborator

rootfs commented Oct 20, 2025

/assign @yossiovadia

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a comprehensive single-namespace OpenShift deployment configuration for vLLM Semantic Router with a separate pods architecture. The deployment uses three pods within one namespace: a router+envoy pod for classification and routing, and two separate model pods (model-a and model-b) running on GPUs. The configuration includes automated build processes, persistent storage, HTTPS external access, and complete documentation.

Key changes:

  • Complete OpenShift deployment manifests for single-namespace architecture with separate pods
  • Automated deployment script with build orchestration and monitoring
  • Comprehensive documentation with troubleshooting guides and testing instructions

Reviewed Changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
deploy/openshift/single-namespace/services.yaml Defines ClusterIP services for envoy-proxy, semantic-router, model-a, and model-b
deploy/openshift/single-namespace/scripts/deploy-all.sh Automated deployment script handling namespace creation, builds, and monitoring
deploy/openshift/single-namespace/route.yaml OpenShift Route for external HTTPS access with TLS edge termination
deploy/openshift/single-namespace/pvcs.yaml Persistent volume claims for router models/cache and model caches
deploy/openshift/single-namespace/namespace.yaml Namespace definition for vllm-semantic-router
deploy/openshift/single-namespace/imagestreams.yaml ImageStreams for semantic-router and llm-katan builds
deploy/openshift/single-namespace/deployment-router.yaml Deployment for router pod with init container for model downloads
deploy/openshift/single-namespace/deployment-model-b.yaml Deployment for model-b pod with GPU requirements
deploy/openshift/single-namespace/deployment-model-a.yaml Deployment for model-a pod with GPU requirements
deploy/openshift/single-namespace/configmap-router.yaml Router configuration with hardcoded ClusterIP addresses
deploy/openshift/single-namespace/configmap-envoy.yaml Envoy proxy configuration for routing and external processing
deploy/openshift/single-namespace/buildconfig-router.yaml BuildConfig for semantic-router image from binary source
deploy/openshift/single-namespace/buildconfig-llm-katan.yaml BuildConfig for llm-katan image from GitHub repository
deploy/openshift/single-namespace/README.md Comprehensive deployment guide with troubleshooting and testing instructions
.dockerignore Docker build exclusions for models, cache, and build artifacts

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@rootfs
Copy link
Copy Markdown
Collaborator

rootfs commented Oct 20, 2025

@szedan-rh thanks for contribution, can you run make markdown-lint-fix?

@szedan-rh
Copy link
Copy Markdown
Collaborator Author

@rootfs - Indeed.
Didn't mean to send the PR to review yet, it was done by mistake, apologize.

szedan-rh and others added 2 commits October 20, 2025 19:35
…ft deployment

This commit enhances the OpenShift single-namespace deployment with two major improvements:

1. Automatic IP Configuration:
   - Modified deploy-all.sh to automatically detect and configure ClusterIP addresses
   - Services are created first, then ClusterIPs are retrieved dynamically
   - Router ConfigMap is automatically updated with correct model service IPs
   - Deployment now works on any OpenShift cluster without manual IP configuration
   - Tested and verified with different IP allocations

2. Comprehensive Uninstall Script:
   - Added uninstall.sh for complete or partial removal of deployment
   - Interactive confirmations for destructive operations (PVCs, namespace)
   - Proper deletion order: Route → Deployments → Services → ConfigMaps → Builds → ImageStreams → PVCs → Namespace
   - Option to preserve data (PVCs) or namespace
   - Cleanup of temporary files
   - Tested successfully with full uninstall

3. Build Configuration Fix:
   - Fixed buildconfig-llm-katan.yaml to use inline Dockerfile
   - Changed from Git source (private repo) to official vLLM package
   - Uses vllm==0.6.3.post1 with PyTorch 2.1.0 and CUDA 12.1 support
   - OpenShift compatible (non-root user 1001)

4. Documentation Updates:
   - Moved deployment guide from README.md to DEPLOYMENT_GUIDE.md
   - Added comprehensive automatic IP configuration documentation
   - Added uninstall documentation with examples
   - Documented manual and automated deployment options

Testing Results:
- ✅ Deployment tested with automatic IP detection (IPs: 172.30.128.210, 172.30.17.224)
- ✅ Uninstall script tested with complete removal
- ✅ All pre-commit checks passing (11/11)
- ✅ Build configurations verified working

This makes the deployment production-ready and portable across different OpenShift clusters.

Signed-off-by: szedan <szedan@redhat.com>
…ion tools

This commit improves the OpenShift single-namespace deployment with several
key enhancements for better performance, usability, and maintainability.

Changes:
- Switch from heavy vLLM image to lightweight llm-katan image (~10GB → ~2-3GB)
- Remove semantic-router build (now uses pre-built ghcr.io image)
- Add GPU node status check and automated restart guidance
- Consolidate uninstall script confirmation prompts (3 → 1)
- Add route creation to deployment script for immediate external access

New validation tools:
- check-gpu-status.sh: Automated GPU node availability checking
- validate-deployment.sh: Comprehensive endpoint and flow validation
- test-queries-with-trace.sh: Detailed query routing visualization

Documentation:
- ADD-MODEL-C-GUIDE.md: Complete guide for adding additional models

These changes reduce deployment time from 20-30 minutes to 5-10 minutes
and provide comprehensive testing and troubleshooting capabilities.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: szedan <szedan@redhat.com>
@szedan-rh szedan-rh force-pushed the feature/openshift-single-namespace-deployment branch from 775429f to 3cf639b Compare October 21, 2025 05:24
@szedan-rh
Copy link
Copy Markdown
Collaborator Author

Will create one script that do the same things for openshift based on Yossi O. script.

@szedan-rh szedan-rh closed this Oct 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants