
ANIMA GREYHOUND — Wave 10 WARDOG

Paper: Adaptive Image Zoom-in with Bounding Box Transformation for UAV Object Detection
ArXiv: https://arxiv.org/abs/2602.07512
GitHub: https://github.com/twangnh/zoomdet_code
Defense Score: 41/50 | Tier: T2
Wave: 10 — WARDOG (War Dog Breeds)
Focus: UAV/Drone Defense for Shenzhen Robot Fair
Backbone Contract: YOLO26 only

Overview

GREYHOUND brings the paper's adaptive zoom front-end into the ANIMA stack and rebases the downstream detector onto the Wave-10 YOLO26 contract. The module is built end-to-end — model, training driver, evaluator, FastAPI service, Docker, and a ROS2 node — and is ready to train the moment the NIGHTHAWK UAV mega dataset finishes rendering on the shared GPU server.

Architecture

  1. Input image → OffsetNet (a ResNet18 truncated after block 2) predicts low-resolution 2D offsets.
  2. The offsets are upsampled to the detector input size and converted into a normalized sampling grid.
  3. A bilinear warp produces the zoomed image; during training, ground-truth boxes are corner-aligned into the zoomed frame.
  4. YOLO26 runs inference on the zoomed image.
  5. Predictions are mapped back to the original frame for downstream consumers (ROS2 detections, FastAPI /predict).
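The upsample-and-warp in steps 2 and 3 can be sketched with torch.nn.functional.grid_sample. This is a minimal stand-in for the module's actual zoom transform, not its implementation; in particular, the offset-channel convention (channel 0 = x, channel 1 = y, in normalized [-1, 1] units) is an assumption.

```python
import torch
import torch.nn.functional as F

def offsets_to_grid(offsets: torch.Tensor, out_hw) -> torch.Tensor:
    """Upsample a low-res 2D offset field (N, 2, h0, w0) to the detector
    input size and turn it into a normalized sampling grid (step 2)."""
    n = offsets.shape[0]
    h, w = out_hw
    # Upsample the offset field to the detector input resolution.
    up = F.interpolate(offsets, size=(h, w), mode="bilinear", align_corners=True)
    # Identity grid in normalized [-1, 1] coordinates (grid_sample convention).
    ys = torch.linspace(-1.0, 1.0, h)
    xs = torch.linspace(-1.0, 1.0, w)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    identity = torch.stack((gx, gy), dim=-1).unsqueeze(0).expand(n, -1, -1, -1)
    # Assumed convention: offset channel 0 displaces x, channel 1 displaces y.
    return identity + up.permute(0, 2, 3, 1)

def zoom_warp(image: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
    """Bilinear warp of the input image by the predicted grid (step 3)."""
    grid = offsets_to_grid(offsets, image.shape[-2:])
    return F.grid_sample(image, grid, mode="bilinear", align_corners=True)

# Sanity check: a zero offset field reduces the warp to the identity.
img = torch.rand(1, 3, 64, 64)
zero = torch.zeros(1, 2, 8, 8)
out = zoom_warp(img, zero)
```

With align_corners=True the identity grid samples exactly at pixel centers, so zero offsets reproduce the input image unchanged, which makes a convenient unit test for any real zoom front-end.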

Quick Start

# Install dependencies (CUDA 12.8 wheels via project pyproject.toml)
uv pip install -e ".[dev]"

# Synthetic dry run (no weights required)
python -m anima_greyhound --dry-run

# Serve the FastAPI app
python -m anima_greyhound.serve --host 0.0.0.0 --port 8000
curl http://localhost:8000/health

# Container
docker compose -f docker/docker-compose.yml up --build

Training

Training is gated behind --confirm-gpu-free while NIGHTHAWK is generating the UAV mega dataset. Until the DONE.flag appears, the training script runs only the preflight checks:

python scripts/train.py --config configs/paper.toml
# Once NIGHTHAWK is done and /gpu-batch-finder confirms memory:
python scripts/train.py --config configs/paper.toml --confirm-gpu-free \
  --data /mnt/train-data/datasets/nighthawk_mega_highres/data.yaml
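For orientation, a paper.toml along these lines would drive such a run. The key names below are illustrative assumptions, not the module's actual schema; the real schema is whatever configs/paper.toml defines.

```toml
# Illustrative shape only -- consult configs/paper.toml for the real keys.
[model]
backbone = "yolo26"        # Wave-10 backbone contract
offsetnet_blocks = 2       # truncated ResNet18 depth

[train]
epochs = 100
imgsz = 640
device = "cuda"

[data]
yaml = "/mnt/train-data/datasets/nighthawk_mega_highres/data.yaml"
```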

All training artifacts land under /mnt/artifacts-datai/{checkpoints,logs,exports}/project_greyhound/.
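During training, ground-truth boxes must follow the image into the zoomed frame, and at inference predictions are mapped back out. For the special case of a uniform axis-aligned crop-zoom, the two directions are a linear map and its inverse; the module's actual transform goes through the general sampling grid, but the corner-alignment idea is the same.

```python
def box_to_zoomed(box, crop, out_size):
    """Map an (x1, y1, x2, y2) box from the original frame into the zoomed
    frame, assuming a uniform crop-zoom. `crop` is the zoom window in
    original-frame coordinates; `out_size` is the zoomed (W, H)."""
    cx1, cy1, cx2, cy2 = crop
    ow, oh = out_size
    sx, sy = ow / (cx2 - cx1), oh / (cy2 - cy1)
    x1, y1, x2, y2 = box
    return ((x1 - cx1) * sx, (y1 - cy1) * sy,
            (x2 - cx1) * sx, (y2 - cy1) * sy)

def box_to_original(box, crop, out_size):
    """Inverse map: a zoomed-frame box back to original coordinates."""
    cx1, cy1, cx2, cy2 = crop
    ow, oh = out_size
    sx, sy = (cx2 - cx1) / ow, (cy2 - cy1) / oh
    x1, y1, x2, y2 = box
    return (x1 * sx + cx1, y1 * sy + cy1,
            x2 * sx + cx1, y2 * sy + cy1)

# Round trip: a box survives zoom-in and back unchanged.
crop = (100.0, 50.0, 500.0, 350.0)   # zoom window in the original frame
box = (150.0, 100.0, 300.0, 200.0)
zoomed = box_to_zoomed(box, crop, (640, 640))
restored = box_to_original(zoomed, crop, (640, 640))
```

The round trip is a useful invariant to test against whatever transform the module actually ships: mapping a box in and back out should recover it to within floating-point error.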

Project Structure

project_greyhound/
├── src/anima_greyhound/    # Zoom transform, datasets, eval, API, ROS2, monitoring
├── scripts/                # train.py, export.py, download_data.sh
├── launch/                 # ROS2 launch descriptions
├── prds/                   # 7-PRD ANIMA build suite
├── tasks/                  # Granular implementation breakdown
├── tests/                  # Unit tests (geometry, zoom, eval, api, monitoring)
├── configs/                # default.toml, paper.toml, debug.toml
├── docker/                 # Dockerfile + docker-compose.yml
├── papers/                 # Paper PDF
├── anima_module.yaml       # ANIMA registry manifest
├── ASSETS.md               # Data + weights manifest
├── PRD.md                  # Module-level contract
├── TRAINING_REPORT.md      # Pre-training snapshot
└── NEXT_STEPS.md           # Execution ledger

Dual Compute

All first-party code accepts mlx, cuda, or cpu via device.py. CUDA is the default training target on the shared L4 GPU server.
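The backend selection described above might look like the following sketch. This is a hypothetical stand-in for device.py, assuming try-import probing with CPU fallback; the real resolution logic may differ.

```python
def resolve_device(requested: str = "auto") -> str:
    """Map a requested backend ("mlx", "cuda", "cpu", or "auto") to one
    that is actually available, falling back to CPU when the requested
    accelerator is absent. Hypothetical sketch of device.py."""
    if requested in ("cuda", "auto"):
        try:
            import torch
            if torch.cuda.is_available():
                return "cuda"
        except ImportError:
            pass
        if requested == "cuda":
            return "cpu"  # CUDA asked for but unavailable
    if requested in ("mlx", "auto"):
        try:
            import mlx.core  # noqa: F401  (Apple silicon only)
            return "mlx"
        except ImportError:
            pass
        if requested == "mlx":
            return "cpu"  # MLX asked for but unavailable
    return "cpu"

device = resolve_device("auto")
```

Probing with try-import keeps the module importable on machines that have neither CUDA nor MLX, which matters for the CPU-only dry-run path.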

Verification Summary

  • Paper PDF read from papers/2602.07512.pdf
  • Reference repo cloned into references/zoomdet_code
  • Public repo verified as Faster R-CNN/MMDetection branch
  • YOLO branch referenced but not locally verified (ANIMA owns the YOLO26 rebase)
  • Shared dataset volume exists; VisDrone / UAVDT / SeaDroneSee / DroneVehicle provisioning pending

License

Research use only. See paper for original license terms.
