Instance segmentation model framework for Formula Student cone detection.
All models are fine-tuned on the FSOCO segmentation dataset (CVAT XML format, converted to COCO JSON).
| Model | Architecture | mAP@50 | mAP@50:95 | Precision | Recall | Status |
|---|---|---|---|---|---|---|
| RF-DETR Small | DINOv2 backbone | 80.4% | 55.7% | 89.9% | 75.8% | Trained |
Download pre-trained weights and place them in the corresponding model directory.
| Model | Weights | Size | Location |
|---|---|---|---|
| RF-DETR | checkpoint_best_ema.pth | 388 MB | models/rfdetr/weights/segmentation/ |
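A quick way to confirm the downloaded weights landed in the expected place before running inference (a minimal sketch; the path comes from the table above, and the helper name is illustrative):

```python
from pathlib import Path

# Expected weight locations, keyed by model name (from the table above)
EXPECTED_WEIGHTS = {
    "rfdetr": Path("models/rfdetr/weights/segmentation/checkpoint_best_ema.pth"),
}

def missing_weights(root: Path) -> list[str]:
    """Return the names of models whose weight files are absent under root."""
    return [name for name, rel in EXPECTED_WEIGHTS.items()
            if not (root / rel).is_file()]
```

Running this from the repository root lists any models that still need their weights downloaded.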
```bash
# Default model (rfdetr)
python inference/realtime.py

# Specify model explicitly
python inference/realtime.py --model rfdetr

# List available models
python inference/realtime.py --list-models

# Custom settings
python inference/realtime.py --model rfdetr --camera 1 --resolution 1920x1080 --conf-threshold 0.7
```

Keyboard Controls:

- `q` - Quit
- `s` - Save screenshot
- `m` - Toggle mask overlay
- `b` - Toggle bounding boxes
- `c` - Toggle confidence scores
- `l` - Toggle legend
- `d` - Toggle detection counts
- `+` / `-` - Adjust confidence threshold
- `SPACE` - Pause/Resume
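The toggle keys above map naturally onto a small dispatch table. A minimal sketch of how such key handling could look (the `state` and `TOGGLES` names are illustrative, not the viewer's actual implementation):

```python
# Display state flipped by the toggle keys listed above
state = {"masks": True, "boxes": True, "scores": True,
         "legend": True, "counts": True, "paused": False}

# Key -> state flag for the single-key toggles
TOGGLES = {"m": "masks", "b": "boxes", "c": "scores", "l": "legend", "d": "counts"}

def handle_key(key: str, state: dict, conf: float) -> float:
    """Apply one keypress to the display state; return the (possibly adjusted)
    confidence threshold."""
    if key in TOGGLES:
        state[TOGGLES[key]] = not state[TOGGLES[key]]
    elif key == "+":
        conf = min(1.0, round(conf + 0.05, 2))
    elif key == "-":
        conf = max(0.0, round(conf - 0.05, 2))
    elif key == " ":  # SPACE pauses/resumes
        state["paused"] = not state["paused"]
    return conf
```

In a real loop the key would come from something like OpenCV's `cv2.waitKey(1)`, decoded to a character before dispatch.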
```bash
# Single image
python inference/image.py --model rfdetr --image path/to/image.jpg

# Folder of images
python inference/image.py --model rfdetr --image path/to/folder/ --output output/

# Options
python inference/image.py --model rfdetr --image test.jpg --no-masks   # Bounding boxes only
python inference/image.py --model rfdetr --image test.jpg --no-boxes   # Masks only
```

```
Segmentation_models/
├── inference/                       # Unified inference system
│   ├── base.py                      # Abstract model interface
│   ├── registry.py                  # Model loader
│   ├── realtime.py                  # Real-time webcam inference
│   └── image.py                     # Single/batch image inference
│
├── common/                          # Shared utilities
│   ├── visualization.py             # Drawing masks, boxes, legends
│   ├── data/
│   │   ├── convert_yolo_to_coco.py  # YOLO → COCO conversion
│   │   └── convert_cvat_to_coco.py  # CVAT XML (RLE) → COCO conversion
│   └── tools/
│       └── visualize_dataset.py     # Dataset visualization
│
├── models/                          # Model implementations
│   └── rfdetr/
│       ├── adapter.py               # RF-DETR adapter
│       ├── training/
│       │   ├── train_local.py
│       │   └── train_modal.py
│       └── weights/
│           └── segmentation/
│               └── checkpoint_best_ema.pth
│
├── results/                         # Model evaluation metrics (JSON)
│
├── output/                          # Inference output images
│
└── config/
    └── models.yaml                  # Model registry
```
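The `results/` directory holds per-model evaluation metrics as JSON. A sketch of writing one such file (the schema here is an assumption, mirroring the metrics table above, not the project's actual format):

```python
import json
from pathlib import Path

# Hypothetical metrics record mirroring the evaluation table above
metrics = {
    "model": "rfdetr",
    "mAP@50": 0.804,
    "mAP@50:95": 0.557,
    "precision": 0.899,
    "recall": 0.758,
}

def save_metrics(metrics: dict, results_dir: Path) -> Path:
    """Write one model's evaluation metrics to results/<model>_eval.json."""
    results_dir.mkdir(parents=True, exist_ok=True)
    path = results_dir / f"{metrics['model']}_eval.json"
    path.write_text(json.dumps(metrics, indent=2))
    return path
```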
Training is model-specific:

```bash
# RF-DETR local training
python models/rfdetr/training/train_local.py

# RF-DETR cloud training (Modal)
modal run models/rfdetr/training/train_modal.py
```

```bash
# Convert CVAT XML segmentation dataset (RLE masks) to COCO format
python common/data/convert_cvat_to_coco.py

# Convert YOLO segmentation dataset to COCO format
python common/data/convert_yolo_to_coco.py
```
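At its core, the YOLO → COCO step denormalizes polygon coordinates and derives a bounding box. A minimal sketch of that transform (the conversion script's actual interface may differ):

```python
def yolo_polygon_to_coco(poly_norm, img_w, img_h):
    """Convert a YOLO-normalized polygon [x1, y1, x2, y2, ...] (values in 0..1)
    into a COCO segmentation (absolute pixels) and an [x, y, w, h] bbox."""
    # Even indices are x (scale by width), odd indices are y (scale by height)
    seg = [round(v * (img_w if i % 2 == 0 else img_h), 2)
           for i, v in enumerate(poly_norm)]
    xs, ys = seg[0::2], seg[1::2]
    x0, y0 = min(xs), min(ys)
    bbox = [x0, y0, round(max(xs) - x0, 2), round(max(ys) - y0, 2)]
    return seg, bbox
```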
```bash
# Visualize dataset annotations
python common/tools/visualize_dataset.py --dataset-dir /path/to/dataset --split train
```

To add a new model:

1. Create the model directory: `models/<model_name>/`

2. Implement the adapter in `models/<model_name>/adapter.py`:

   ```python
   from inference.base import BaseSegmentationModel

   class MyModelAdapter(BaseSegmentationModel):
       @property
       def name(self) -> str:
           return "mymodel"

       def load(self, weights_path, device="cuda", num_classes=4):
           # Load your model
           pass

       def predict(self, image, conf_threshold=0.5):
           # Return dict with: boxes, masks, class_ids, scores, inference_time_ms
           pass
   ```

3. Register it in `config/models.yaml`:

   ```yaml
   models:
     mymodel:
       module: models.mymodel.adapter
       class: MyModelAdapter
       default_weights: models/mymodel/weights/best.pth
       num_classes: 4
       description: My custom segmentation model
   ```

4. Add training scripts in `models/<model_name>/training/`
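A registry entry like the one above can be resolved with a dynamic import. A sketch of the idea (the actual `inference/registry.py` code may differ):

```python
import importlib

def load_adapter_class(entry: dict):
    """Resolve a models.yaml entry ({'module': ..., 'class': ...}) to a class object."""
    module = importlib.import_module(entry["module"])
    return getattr(module, entry["class"])
```

For example, `load_adapter_class({"module": "models.mymodel.adapter", "class": "MyModelAdapter"})` would return the adapter class, which the registry can then instantiate and `load()` with the configured default weights.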
```bash
pip install torch torchvision opencv-python numpy pyyaml

# Model-specific
pip install rfdetr  # For RF-DETR
```

The models detect four cone types:
- Blue Cone (class 0) - Track boundary
- Yellow Cone (class 1) - Track boundary
- Large Orange Cone (class 2) - Special marker
- Orange Cone (class 3) - Track boundary
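For visualization, the class ids above can be mapped to overlay colors. A small sketch (the BGR values are illustrative, not the project's actual palette in `common/visualization.py`):

```python
# Illustrative BGR colors for the four cone classes (class id -> color)
CONE_COLORS = {
    0: (255, 0, 0),    # Blue Cone
    1: (0, 255, 255),  # Yellow Cone
    2: (0, 100, 255),  # Large Orange Cone
    3: (0, 165, 255),  # Orange Cone
}

def color_for(class_id: int) -> tuple:
    """Return the overlay color for a class id; grey for unknown ids."""
    return CONE_COLORS.get(class_id, (128, 128, 128))
```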