A modular, multi-level deep learning framework for evidence-driven differential diagnosis of malignant melanoma, integrating information at lesion, patient, and population levels using CNNs and masked transformers.
📄 Paper: MICCAI 2023 Proceedings | PDF
🎤 Presentation: Oral Presentation Slides
🌐 Project Page: https://cvit.iiit.ac.in/mip/projects/meldd/
This repository implements the MelDD (Melanoma Differential Diagnosis) framework presented at MICCAI ISIC 2023. Our approach addresses the clinical challenge of melanoma diagnosis by modeling the complete patient context rather than individual lesions in isolation.
- Multi-level Evidence Integration: Combines lesion-level features, patient context, and clinical metadata for holistic diagnosis
- Variable Lesion Count Handling: Uses masked self-attention to process patients with different numbers of lesions
- Clinical Decision Process Modeling: Mimics dermatologists' diagnostic reasoning, which considers a patient's complete skin ecosystem
- Context-Aware Diagnosis: Identifies "ugly duckling" lesions that appear suspicious within patient context but may be benign in isolation
Traditional melanoma detection systems analyze lesions independently, missing crucial contextual information that dermatologists use:
- Patient Context: A morphologically typical lesion may still warrant suspicion if it differs from the patient's other lesions
- Ugly Duckling Criteria: Lesions that stand out from a patient's typical pattern warrant attention
- Clinical Metadata: Patient demographics (age, sex) and anatomical site influence melanoma risk
The MelDD framework architecture showing the two-stage pipeline from individual lesion images to patient-level diagnosis through CNN feature extraction and transformer-based context modeling.
The MelDD framework consists of two complementary stages:
```
Raw Images → Stage 1: CNN Feature Extraction → Stage 2: Transformer Context Analysis → Diagnosis
             (ResNet101/DenseNet161)            (Masked Self-Attention)

Individual Lesions → Lesion Features → Patient Context Modeling → Lesion + Patient Predictions
```
- Purpose: Extract rich feature representations from individual dermoscopic images
- Architecture: ResNet101/DenseNet161 pre-trained on the SIIM-ISIC 2020 dataset
- Output: Dense feature vectors (2048D for ResNet101, 2208D for DenseNet161) per lesion
- Purpose: Model relationships between lesions within a patient's context
- Architecture: Multi-layer transformer encoder with masked self-attention
- Key Features:
- Masked Attention: Handles a variable number of lesions per patient (1 to 50+ lesions)
- Positional Encoding: Optional anatomical site information integration
- Clinical Metadata: Age and sex embedding for enhanced context
- Dual Output: Both lesion-level and patient-level predictions
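The variable-lesion-count mechanism can be sketched with PyTorch's key-padding mask. This is not repo code: dimensions are examples, and in practice a linear layer would first project the 2048-D CNN features down to the transformer width.

```python
import torch
import torch.nn as nn

# Toy Stage 2: each patient's lesion features are padded to a common
# length, and a boolean key-padding mask tells the encoder which slots
# are padding and must be ignored by self-attention.
model_dim, num_heads, num_layers = 256, 8, 4
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=model_dim, nhead=num_heads, batch_first=True),
    num_layers=num_layers,
)

# Two patients: one with 3 lesions, one with 5, padded to length 5.
lesion_counts = torch.tensor([3, 5])
max_len = int(lesion_counts.max())
padded = torch.zeros(2, max_len, model_dim)
for i, n in enumerate(lesion_counts):
    padded[i, :n] = torch.randn(int(n), model_dim)

# True marks positions attention must ignore (the padding).
positions = torch.arange(max_len).unsqueeze(0)               # (1, 5)
key_padding_mask = positions >= lesion_counts.unsqueeze(1)   # (2, 5)

out = encoder(padded, src_key_padding_mask=key_padding_mask)
print(out.shape)  # torch.Size([2, 5, 256])
```

Each output row can then feed a lesion-level head, while a pooled (or class-token) representation feeds the patient-level head.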
We present three variants demonstrating the impact of different information sources:
| Variant | Features | Performance Focus |
|---|---|---|
| MelDD-V1 | Lesion features only | Baseline context modeling |
| MelDD-V2 | Lesion features + anatomical site | Enhanced specificity |
| MelDD-V3 | Lesion features + patient metadata | Improved sensitivity |
- MelDD-V2: Higher specificity → Reduces false positives, fewer unnecessary biopsies
- MelDD-V3: Higher sensitivity → Better screening capability, fewer missed melanomas
- Variant Selection: Choose based on clinical priorities (screening vs. diagnostic setting)
- Python 3.8+
- PyTorch 1.12+
- CUDA-capable GPU (recommended)
- Clone the repository:

```bash
git clone https://github.com/your-username/melanoma-diagnosis.git
cd melanoma-diagnosis
```

- Create a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Install the package:

```bash
pip install -e .
```

Organize your data as follows:
```
data/
├── images/
│   ├── train/
│   │   ├── ISIC_0000001.jpg
│   │   └── ...
│   └── test/
│       ├── ISIC_1000001.jpg
│       └── ...
├── lesion_data.csv
└── patient_metadata.csv (optional)
```
Lesion-level CSV (`lesion_data.csv`):

```
patient_id,image_name,target,anatom_site_general,age,sex
IP_0000001,ISIC_0000001,0,torso,45,male
IP_0000001,ISIC_0000002,1,torso,45,male
IP_0000002,ISIC_0000003,0,head/neck,67,female
```

- `patient_id`: Unique patient identifier
- `image_name`: Image filename (without extension)
- `target`: Binary label (0 = benign, 1 = melanoma)
- `anatom_site_general`: Anatomical location (optional)
- `age`: Patient age (optional)
- `sex`: Patient sex (optional)
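A minimal sketch of loading the lesion CSV and regrouping lesions by patient, the unit Stage 2 operates on. The sample rows mirror the format above; the validation checks are illustrative, not the repo's preprocessing.

```python
import pandas as pd
from io import StringIO

# Inline copy of the sample lesion_data.csv rows shown above.
csv_text = """patient_id,image_name,target,anatom_site_general,age,sex
IP_0000001,ISIC_0000001,0,torso,45,male
IP_0000001,ISIC_0000002,1,torso,45,male
IP_0000002,ISIC_0000003,0,head/neck,67,female
"""
df = pd.read_csv(StringIO(csv_text))

# Basic sanity checks before training.
assert df["target"].isin([0, 1]).all()
assert df["patient_id"].notna().all()

# A patient is melanoma-positive if any of their lesions is labelled 1.
patients = df.groupby("patient_id").agg(
    n_lesions=("image_name", "count"),
    patient_target=("target", "max"),
)
print(patients)
```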
- Configure the framework:

```bash
# Edit configs/config.yaml to match your data paths and model parameters
vim configs/config.yaml
```

- Train Stage 1 CNN models:

```bash
python scripts/train_stage1.py \
    --data-csv data/lesion_data.csv \
    --image-dir data/images/train \
    --create-folds
```

- Extract features:

```bash
python scripts/extract_features.py \
    --models-dir weights/stage1 \
    --image-dirs data/images/train data/images/test \
    --output-dir data/features
```

- Train Stage 2 Transformer models:

```bash
python scripts/train_stage2.py \
    --lesion-csv data/lesion_data.csv \
    --features-dir data/features \
    --create-patient-data \
    --create-folds
```

- Run inference:

```bash
python scripts/run_inference.py \
    --test-csv data/patient_test.csv \
    --models-dir weights/stage2 \
    --features-base-dir data/features \
    --ensemble \
    --output predictions.csv
```

Train CNN models for lesion-level feature extraction:
```bash
# Train all folds
python scripts/train_stage1.py \
    --config configs/config.yaml \
    --data-csv data/lesion_data.csv \
    --image-dir data/images/train \
    --create-folds

# Train a specific fold
python scripts/train_stage1.py \
    --fold 1 \
    --data-csv data/lesion_data.csv \
    --image-dir data/images/train
```

Extract features using trained CNN models:
```bash
# Extract features for all folds
python scripts/extract_features.py \
    --models-dir weights/stage1 \
    --image-dirs data/images/train data/images/test \
    --output-dir data/features

# Extract features for a specific fold
python scripts/extract_features.py \
    --fold 1 \
    --model-path weights/stage1/fold_1/best_model.pth \
    --image-dirs data/images/train \
    --output-dir data/features
```

Train transformer models for patient-level classification:
```bash
# Train all folds with patient data creation
python scripts/train_stage2.py \
    --lesion-csv data/lesion_data.csv \
    --features-dir data/features \
    --create-patient-data \
    --create-folds

# Train a specific fold
python scripts/train_stage2.py \
    --fold 1 \
    --patient-csv data/patient_data.csv \
    --features-dir data/features
```

Run inference on test data:
```bash
# Ensemble inference (recommended)
python scripts/run_inference.py \
    --test-csv data/patient_test.csv \
    --models-dir weights/stage2 \
    --features-base-dir data/features \
    --ensemble \
    --output predictions.csv

# Single-model inference
python scripts/run_inference.py \
    --model-path weights/stage2/fold_1/best_model.ckpt \
    --test-csv data/patient_test.csv \
    --features-dir data/features/fold_1 \
    --output predictions.csv
```

The framework is configured via `configs/config.yaml`. Key parameters:
Stage 1 (CNN):

- `backbone`: CNN architecture (`resnet101`, `densenet161`, `efficientnet_b3`)
- `batch_size`: Training batch size
- `learning_rate`: Learning rate
- `image_size`: Input image size

Stage 2 (Transformer):

- `model_dim`: Transformer hidden dimension
- `num_heads`: Number of attention heads
- `num_layers`: Number of transformer layers
- `max_sequence_length`: Maximum lesions per patient
The framework supports three model variants:
- MelDD-V1: Lesion features only
- MelDD-V2: Lesion features + anatomical site information
- MelDD-V3: Lesion features + patient metadata (age, sex)
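The difference between the variants amounts to what extra context accompanies each lesion's feature vector. A hypothetical concatenation sketch (the function name, the 6-way site vocabulary, and the age normalization are assumptions, not the repo's exact fusion scheme):

```python
import torch

def build_lesion_input(feat, site_onehot=None, age=None, sex=None,
                       location_encoding=False, use_metadata=False):
    """Concatenate optional context onto a lesion feature vector."""
    parts = [feat]
    if location_encoding and site_onehot is not None:
        parts.append(site_onehot)                              # V2: anatomical site
    if use_metadata:
        parts.append(torch.tensor([age / 100.0, float(sex)]))  # V3: age, sex
    return torch.cat(parts)

feat = torch.randn(2048)              # ResNet101 lesion feature
site = torch.zeros(6); site[2] = 1.0  # e.g. "torso" among 6 coarse sites
x_v1 = build_lesion_input(feat)
x_v2 = build_lesion_input(feat, site_onehot=site, location_encoding=True)
x_v3 = build_lesion_input(feat, age=45, sex=1, use_metadata=True)
print(x_v1.shape[0], x_v2.shape[0], x_v3.shape[0])  # 2048 2054 2050
```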
Configure variants by modifying:
```yaml
stage2:
  features:
    location_encoding: true/false
    use_metadata: true/false
```

The framework optimizes and reports:
- AUC: Area under ROC curve
- Balanced Accuracy: Accounts for class imbalance
- Sensitivity/Specificity: Clinical relevance
- Youden's J: Optimal threshold selection
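These metrics can be computed with scikit-learn on toy predictions (illustrative data, not repo results). Youden's J = max(TPR − FPR) picks the operating threshold:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve, balanced_accuracy_score

# Toy probabilities for 10 patients (5 benign, 5 melanoma).
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])
y_prob = np.array([0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.65, 0.7, 0.8, 0.9])

auc = roc_auc_score(y_true, y_prob)
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
j = tpr - fpr
best = np.argmax(j)  # Youden's J selects the optimal threshold
y_pred = (y_prob >= thresholds[best]).astype(int)

sensitivity = tpr[best]
specificity = 1 - fpr[best]
bal_acc = balanced_accuracy_score(y_true, y_pred)
print(f"AUC={auc:.3f}  J={j[best]:.3f}  "
      f"sens={sensitivity:.3f}  spec={specificity:.3f}  bal_acc={bal_acc:.3f}")
```

The sensitivity/specificity pair reported this way is what distinguishes the screening-oriented V3 from the specificity-oriented V2.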
```
weights/
├── stage1/
│   ├── fold_1/
│   │   └── best_model_fold1_epoch15.pth
│   └── ...
└── stage2/
    ├── fold_1/
    │   └── best_model_epoch34_auc0.8542.ckpt
    └── ...
logs/
├── stage1_cnn/
└── stage2_transformer/
predictions.csv
```
- GPU: NVIDIA GPU with 8GB+ VRAM (recommended)
- RAM: 16GB+ system RAM
- Storage: 50GB+ for features and model weights
- CUDA out of memory:
  - Reduce batch size in config
  - Use gradient checkpointing
  - Enable mixed precision training
- Missing features:
  - Ensure feature extraction completed successfully
  - Check file paths and permissions
- Poor convergence:
  - Adjust learning rate
  - Check data preprocessing
  - Verify class balance
If you use this code in your research, please cite:
```bibtex
@inproceedings{akash2023evidence,
  title={Evidence-Driven Differential Diagnosis of Malignant Melanoma},
  author={Akash, Naren and Kaushik, Anirudh and Sivaswamy, Jayanthi},
  booktitle={Medical Image Computing and Computer-Assisted Intervention (MICCAI)},
  year={2023},
  organization={Springer}
}
```