- Build with `make vllm-sr-dev`
- Start with `vllm-sr serve --image-pull-policy never`
- Use this for the default local Docker workflow
- Default smoke config: config.agent-smoke.cpu.yaml
- If you need a non-default config, run `make agent-serve-local ENV=cpu AGENT_SERVE_CONFIG=<config>`
- For isolated parallel local stacks, add `AGENT_STACK_NAME=<name>` and `AGENT_PORT_OFFSET=<n>`, for example: `make agent-serve-local ENV=cpu AGENT_STACK_NAME=lane-a AGENT_PORT_OFFSET=0` and `make agent-serve-local ENV=cpu AGENT_STACK_NAME=lane-b AGENT_PORT_OFFSET=200`
- Use the same `AGENT_STACK_NAME` and `AGENT_PORT_OFFSET` values with `make agent-smoke-local` and `make agent-stop-local`
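
The parallel-stack workflow above can be sketched as a small dry-run helper that prints the serve, smoke, and stop commands for one lane, so the same name and offset are reused for all three. `lane_cmds` is a hypothetical helper, not part of the repository, and whether the smoke and stop targets also need `ENV=cpu` is not stated here, so it is only passed to the serve target.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) the make commands for one isolated stack.
# lane_cmds is a hypothetical helper name used only for illustration.
lane_cmds() {
    name="$1"      # value for AGENT_STACK_NAME, e.g. lane-a
    offset="$2"    # value for AGENT_PORT_OFFSET, e.g. 0 or 200
    vars="AGENT_STACK_NAME=$name AGENT_PORT_OFFSET=$offset"
    echo "make agent-serve-local ENV=cpu $vars"
    echo "make agent-smoke-local $vars"
    echo "make agent-stop-local $vars"
}

# Two lanes with the example names and offsets from the list above.
lane_cmds lane-a 0
lane_cmds lane-b 200
```

Printing the commands instead of running them makes it easy to check that each lane keeps one consistent name and offset before starting anything.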
- Build with `make vllm-sr-dev VLLM_SR_PLATFORM=amd`
- Start with `vllm-sr serve --image-pull-policy never --platform amd`
- Use this for ROCm/AMD validation and platform-default image checks
- Default smoke config: config.agent-smoke.amd.yaml
- If you need a non-default config, run `make agent-serve-local ENV=amd AGENT_SERVE_CONFIG=<config>`
- The same `AGENT_STACK_NAME=<name>` and `AGENT_PORT_OFFSET=<n>` overrides work for isolated AMD-local stacks
- For real AMD model deployment and backend container setup, read deploy/amd/README.md
- Use deploy/amd/config.yaml as the reference YAML-first AMD routing profile
- See amd-local.md
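
The CPU and AMD build and start commands differ only in their platform arguments. As a side-by-side illustration, the following dry-run sketch prints the pair for a given platform. `platform_cmds` is a hypothetical helper, not something the repository provides, and it echoes the commands rather than executing them.

```shell
#!/bin/sh
# Dry-run sketch: print the build and start commands for a platform.
# platform_cmds is a hypothetical helper name used only for illustration.
platform_cmds() {
    case "$1" in
        cpu)
            echo "make vllm-sr-dev"
            echo "vllm-sr serve --image-pull-policy never"
            ;;
        amd)
            echo "make vllm-sr-dev VLLM_SR_PLATFORM=amd"
            echo "vllm-sr serve --image-pull-policy never --platform amd"
            ;;
        *)
            # Only the two platforms described in this document are handled.
            echo "unknown platform: $1" >&2
            return 1
            ;;
    esac
}

platform_cmds cpu
platform_cmds amd
```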
- Run local profile checks with `make e2e-test E2E_PROFILE=<profile>`
- CI expands to the standard kind/Kubernetes matrix in integration-test-k8s.yml
- Default to `cpu-local`
- Use `amd-local` when platform behavior, ROCm image selection, or AMD defaults are affected
- Use `ci-k8s` for merge-gate coverage and all profile-sensitive routing/deploy behavior
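
The selection rules above amount to a small lookup with `cpu-local` as the fallback. A minimal sketch follows. The change-type labels (`platform`, `rocm-image`, and so on) and the `pick_env` function are hypothetical names invented for this example; only the three environment names come from the list.

```shell
#!/bin/sh
# Sketch: map a (hypothetical) change-type label to a validation environment.
# The labels on the left are illustrative; only the outputs come from the docs.
pick_env() {
    case "$1" in
        platform|rocm-image|amd-defaults)
            echo "amd-local" ;;
        merge-gate|profile-routing|profile-deploy)
            echo "ci-k8s" ;;
        *)
            # Anything else falls back to the default environment.
            echo "cpu-local" ;;
    esac
}

pick_env rocm-image
pick_env merge-gate
pick_env docs
```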