AIM-Harvard/TumorImagingBench

TumorImagingBench

TumorImagingBench is a framework for extracting foundation model embeddings from medical images and benchmarking them across radiomics datasets.

Overview

  • Unified interface for multiple foundation model extractors.
  • Dataset-specific feature extraction pipelines.
  • Analysis workflows in notebooks for performance, robustness, and stability.

Repository Structure

TumorImagingBench/
├── src/tumorimagingbench/     # Core package (models, evaluation)
├── scripts/                   # Utility scripts
├── tutorials/                 # Tutorials and guides
├── notebooks/                 # Analysis notebooks
├── data/                      # Datasets (ignored by git)
├── dist/                      # Large weights (ignored by git)
├── metrics/                   # Evaluation outputs
└── plots/                     # Figures and plots

Installation

uv sync
uv run python -m pip install -e .

Python requirement: >=3.10,<3.12.

Quickstart

List available extractors:

from tumorimagingbench.models import get_available_extractors

print(get_available_extractors())

Load a model and initialize weights:

from tumorimagingbench.models import get_extractor

Model = get_extractor("VISTA3DExtractor")
model = Model()
model.load()
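
To try several extractors in one pass, the two calls above can be wrapped in a loop. The following is a minimal sketch, not part of the package: load_extractors is a hypothetical helper, and it assumes only what the snippet above shows, namely that get_extractor takes a name and returns a class whose instances expose load(). The factory is passed in as an argument so the helper can be exercised without the package or its weights installed.

```python
from typing import Callable, Dict, Iterable, Tuple


def load_extractors(
    names: Iterable[str],
    get_extractor: Callable[[str], type],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Instantiate and load each named extractor, collecting failures.

    Hypothetical helper (not part of tumorimagingbench). Returns a pair
    (loaded, failed): loaded maps name -> model instance, failed maps
    name -> error message.
    """
    loaded: Dict[str, object] = {}
    failed: Dict[str, str] = {}
    for name in names:
        try:
            model = get_extractor(name)()  # look up the class, then instantiate it
            model.load()                   # initialize weights
            loaded[name] = model
        except Exception as exc:  # e.g. missing weights or an unknown name
            failed[name] = f"{type(exc).__name__}: {exc}"
    return loaded, failed
```

With the package installed, a call might look like load_extractors(["VISTA3DExtractor", "CTFMExtractor"], get_extractor), with get_extractor imported from tumorimagingbench.models. Collecting errors instead of raising on the first one keeps a single missing checkpoint from aborting the whole run.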

Feature Extraction

Example using the LUNA16 extractor:

uv run python src/tumorimagingbench/evaluation/luna_feature_extractor.py \
  --output features/luna.pkl \
  --train-csv /path/to/train.csv \
  --val-csv /path/to/val.csv \
  --test-csv /path/to/test.csv

Notes:

  • Dataset CSVs should include image_path, coordX, coordY, coordZ (and optional labels).
  • Many extractors ship with absolute default paths; override them via flags.
  • Feature extraction expects a CUDA-capable GPU.
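
Because extraction needs a GPU and can run for a long time, it may help to check the dataset CSVs before starting. The sketch below uses only the standard library. check_dataset_csv is a hypothetical helper, not something the repository provides, and the rule that coordX, coordY, and coordZ must parse as numbers is an assumption; the column names themselves come from the note above.

```python
import csv
from typing import List

# Column names taken from the note above; label columns are optional.
REQUIRED_COLUMNS = ("image_path", "coordX", "coordY", "coordZ")


def check_dataset_csv(path: str) -> List[str]:
    """Return a list of problems found in a dataset CSV (empty if none).

    Hypothetical pre-flight check; assumes one record per physical line
    and that coordinates are numeric.
    """
    problems: List[str] = []
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [c for c in REQUIRED_COLUMNS if c not in header]
        if missing:
            return [f"missing columns: {', '.join(missing)}"]
        # Data rows start on line 2 because line 1 is the header.
        for line_no, row in enumerate(reader, start=2):
            if not (row["image_path"] or "").strip():
                problems.append(f"line {line_no}: empty image_path")
            for col in ("coordX", "coordY", "coordZ"):
                try:
                    float(row[col])
                except (TypeError, ValueError):
                    problems.append(
                        f"line {line_no}: {col}={row[col]!r} is not numeric"
                    )
    return problems
```

Running it on each of the train, validation, and test CSVs and stopping if any list is non-empty catches header typos and blank rows early.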

Supported Models

  • CTClipVitExtractor
  • CTFMExtractor
  • FMCIBExtractor
  • MedImageInsightExtractor
  • MerlinExtractor
  • ModelsGenExtractor
  • PASTAExtractor
  • SUPREMExtractor
  • VISTA3DExtractor
  • VocoExtractor

Supported Datasets

  • LUNA16
  • DLCS (Duke Lung Cancer Screening)
  • NSCLC Radiomics
  • NSCLC Radiogenomics
  • C4C-KiTS
  • Colorectal Liver Metastases
  • LNDb
  • RIDER (test-retest stability)

Tutorials

  • See tutorials/README.md for guided notebooks and dataset/model integration walkthroughs.

Evaluation

  • Example modelling workflow: notebooks/modelling/luna_modelling.ipynb (LUNA16 evaluation notebook).
  • The notebook loads extracted features from data/features/luna.pkl and evaluates per-model performance. Note that the Feature Extraction example writes to features/luna.pkl, so move the file or adjust the path if the two do not match in your setup.
  • Baselines: k-NN probing with AUC and 95% CI; linear probing (logistic regression); few-shot (1/5/10-shot).
  • Visual outputs are saved to plots/ (e.g., luna_auc.png, luna_knn_overlap.png, luna_few_shot.png, luna_evaluation_protocols.png).
  • Ensemble methods: alignment-weighted k-NN and a stacking meta-learner, with weight and comparison plots (e.g., luna_alignment_weights.png, luna_stacking_weights.png, luna_ensemble_comparison.png, luna_ensemble_vs_individual.png).
  • The notebook aggregates results into overall_results.csv.
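
To make the k-NN probing baseline concrete, here is a small pure-Python sketch of the idea: score each test embedding by the fraction of positive labels among its k nearest training embeddings, compute AUC, and attach a percentile-bootstrap 95% CI. This is an illustration under assumptions, not the notebook's code. The function names, the Euclidean distance, and the percentile bootstrap are all choices made here; the notebook may use a different distance, CI method, or a library such as scikit-learn.

```python
import random
from typing import List, Sequence, Tuple


def knn_scores(train_x, train_y, test_x, k: int = 5) -> List[float]:
    """Score each test vector by the positive fraction among its k nearest
    training vectors (squared Euclidean distance; an assumed choice)."""
    scores = []
    for q in test_x:
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, x)), y)
            for x, y in zip(train_x, train_y)
        )
        neighbours = dists[:k]
        scores.append(sum(y for _, y in neighbours) / len(neighbours))
    return scores


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (AUC, lower, upper) using a percentile bootstrap."""
    point = roc_auc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample held a single class; AUC is undefined
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch but slow on large test sets; a rank-based implementation or a library routine would be the practical choice there.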

Contributing

  • Follow the existing code style and update docs with changes.
  • Add targeted tests for new functionality.

Citation

@article{TumorImagingBench,
  title={Foundation model embeddings for quantitative tumor imaging biomarkers},
  author={},
  journal={},
  year={},
  volume={},
  pages={},
  publisher={}
}

License

MIT. See LICENSE.
