Open
Labels
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
Description
What would you like to be added:
Support for SGLang metrics in EPP, following the model server protocol spec: https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/docs/proposals/003-model-server-protocol
| Type | Metric | Description |
|---|---|---|
| Gauge | sglang:num_running_reqs | The number of running requests |
| Gauge | sglang:num_queue_reqs | The number of requests in the waiting queue |
| Gauge | sglang:token_usage | The token usage |
| Counter | sglang:prompt_tokens_total | The number of prefill tokens processed |
| Gauge | sglang:gen_throughput | The generation throughput |
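To illustrate what consuming these metrics could look like, here is a minimal sketch that pulls the five metrics from the table out of a Prometheus text-format scrape. It is not the EPP implementation: the function name `parse_sglang_metrics`, the `model_name` label, and all sample values are assumptions made up for the example, and the parser handles only simple single-sample lines.

```python
import re
from typing import Dict

# Metric names taken from the table above.
SGLANG_METRICS = (
    "sglang:num_running_reqs",
    "sglang:num_queue_reqs",
    "sglang:token_usage",
    "sglang:prompt_tokens_total",
    "sglang:gen_throughput",
)

# Matches "<name>{<labels>} <value>" or "<name> <value>".
# Prometheus metric names may contain colons, so the name pattern allows them.
_SAMPLE_LINE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)")


def parse_sglang_metrics(text: str) -> Dict[str, float]:
    """Return the latest value seen for each SGLang metric of interest.

    Hypothetical helper: labels are ignored, so if a metric appears with
    several label sets, the last line wins.
    """
    values: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, _labels, raw_value = match.groups()
        if name in SGLANG_METRICS:
            values[name] = float(raw_value)
    return values


# Made-up scrape output, for illustration only.
SAMPLE_SCRAPE = """\
# HELP sglang:num_running_reqs The number of running requests
# TYPE sglang:num_running_reqs gauge
sglang:num_running_reqs{model_name="demo"} 3.0
sglang:num_queue_reqs{model_name="demo"} 7.0
sglang:token_usage{model_name="demo"} 0.42
sglang:prompt_tokens_total{model_name="demo"} 1024.0
sglang:gen_throughput{model_name="demo"} 88.5
unrelated_metric 1
"""

if __name__ == "__main__":
    for metric, value in parse_sglang_metrics(SAMPLE_SCRAPE).items():
        print(metric, value)
```

A real integration would presumably use a proper Prometheus parser and decide how to aggregate across label sets; the regex here is only meant to show which series are involved.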
Why is this needed:
SGLang is a widely adopted inference engine, like vLLM; supporting it will expand the scope of EPP.