Add SGLang Support #1141

@Xunzhuo

What would you like to be added:

Add support for SGLang as a model server, following the model server protocol spec: https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/003-model-server-protocol

The relevant SGLang metrics are:

| Type | Metric | Description |
| --- | --- | --- |
| Gauge | sglang:num_running_reqs | The number of running requests |
| Gauge | sglang:num_queue_reqs | The number of requests in the waiting queue |
| Gauge | sglang:token_usage | The token usage |
| Counter | sglang:prompt_tokens_total | The number of prefill tokens processed |
| Gauge | sglang:gen_throughput | The generation throughput |
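
As a rough illustration of how the metrics listed above could be consumed, here is a minimal Python sketch that pulls them out of a Prometheus text-format scrape. It is not the EPP's implementation. The function name `parse_sglang_metrics`, the sample payload, and the `model_name` label are assumptions made up for this example; only the metric names come from the list above. The regex is also a simplification and does not handle label values that contain `}`.

```python
import re

# Metric names taken from the list in this issue.
SGLANG_METRICS = (
    "sglang:num_running_reqs",
    "sglang:num_queue_reqs",
    "sglang:token_usage",
    "sglang:prompt_tokens_total",
    "sglang:gen_throughput",
)

# One sample line: metric name, optional {labels}, then the value.
# Simplification: label values containing "}" are not handled.
_SAMPLE_LINE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{[^}]*\})?\s+(\S+)")


def parse_sglang_metrics(text: str) -> dict[str, float]:
    """Return the first sample of each listed metric found in the scrape text."""
    found: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, value = match.group(1), match.group(2)
        if name in SGLANG_METRICS and name not in found:
            try:
                found[name] = float(value)
            except ValueError:
                continue  # Ignore samples whose value is not numeric.
    return found


# Hypothetical scrape payload; the label name and values are illustrative only.
SAMPLE_SCRAPE = """\
# HELP sglang:num_running_reqs The number of running requests
# TYPE sglang:num_running_reqs gauge
sglang:num_running_reqs{model_name="example-model"} 3.0
sglang:num_queue_reqs{model_name="example-model"} 7.0
sglang:token_usage{model_name="example-model"} 0.42
"""

metrics = parse_sglang_metrics(SAMPLE_SCRAPE)
```

Metrics missing from a scrape are simply absent from the returned dictionary, so a caller can decide how to treat a server that does not report, for example, `sglang:gen_throughput`.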

Why is this needed:

SGLang is a widely adopted inference engine, like vLLM; supporting it will expand the scope of the EPP (endpoint picker).

Metadata

Labels

needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
