SageSched: Intelligent LLM Request Scheduler with Workload Prediction — QoS-aware dual-queue scheduling for black-box LLM APIs (OpenAI/Azure/Doubao/Gemini)
Updated May 18, 2026 - Python
Research paper and technical notes on the microservice ecosystem
Self-adaptive data layout for distributed joins