Motivation
Currently, the fair-share policy only supports CPU and memory. When integrating Flame with SGLang/vLLM/PyTorch, GPU resources are required. This feature adds GPU resource support to Flame for inference and training workloads.
Function Specification
- The executor manager will report GPU resources accordingly
- The session manager will support both fair-share by slots and a DRF (Dominant Resource Fairness) policy
- The SDK can request GPU resources
Solutions
- Support GPU in slot parsing, so that the fair-share policy also accounts for GPU
- Introduce a new DRF policy for GPU
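
To make the two solutions concrete, here is a minimal Python sketch of how a GPU-aware slot and a DRF decision could be computed. It is illustrative only and is not Flame's implementation: the slot string format (`cpu=8,mem=32,gpu=1`), the resource names, and the helpers `parse_slot`, `slots_on_node`, `dominant_share`, and `next_session` are all assumptions made for this example.

```python
# Illustrative sketch only; names and the slot string format are hypothetical.
RESOURCES = ("cpu", "mem", "gpu")


def parse_slot(spec: str) -> dict[str, float]:
    """Parse an assumed slot string such as 'cpu=8,mem=32,gpu=1'."""
    slot = {name: 0.0 for name in RESOURCES}
    for part in spec.split(","):
        key, _, value = part.strip().partition("=")
        if key not in slot:
            raise ValueError(f"unknown resource: {key!r}")
        slot[key] = float(value)
    return slot


def slots_on_node(capacity: dict[str, float], slot: dict[str, float]) -> int:
    """Whole slots that fit on a node, limited by the scarcest resource."""
    return min(int(capacity[r] // slot[r]) for r in RESOURCES if slot[r] > 0)


def dominant_share(allocated: dict[str, float], total: dict[str, float]) -> float:
    """Largest fraction of any single cluster resource held by a session."""
    return max(allocated[r] / total[r] for r in RESOURCES if total[r] > 0)


def next_session(sessions: dict[str, dict[str, float]], total: dict[str, float]) -> str:
    """DRF serves the session whose dominant share is currently smallest."""
    return min(sessions, key=lambda name: dominant_share(sessions[name], total))


total = {"cpu": 64.0, "mem": 256.0, "gpu": 8.0}
slot = parse_slot("cpu=8,mem=32,gpu=1")
sessions = {
    "train": {"cpu": 8.0, "mem": 32.0, "gpu": 4.0},   # dominant resource: gpu
    "infer": {"cpu": 16.0, "mem": 64.0, "gpu": 1.0},  # dominant resource: cpu/mem
}

print(slots_on_node(total, slot))     # slots available under fair-share by slots
print(next_session(sessions, total))  # session DRF would serve next
```

With slots, GPU simply becomes one more dimension of the slot, so the existing fair-share logic keeps counting slots. DRF instead compares sessions by their dominant share, which is useful when sessions request resources in very different ratios, for example a GPU-heavy training job next to a CPU-heavy preprocessing job.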
Additional context
N/A