Skip to content

arco-group/INSPECT-CS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

62 Commits
 
 
 
 
 
 
 
 

Repository files navigation

INSPECT-CS

This is a code implemention of the framework proposed in the paper "Multimodal Clinical Data Integration for Prognosis of Pulmonary Embolism: A Comparative Study".

License: MIT
Paper

📌 Overview

This repository contains the official implementation of the paper:

"Multimodal Clinical Data Integration for Prognosis of Pulmonary Embolism: A Comparative Study"
Authors: Domenico Paolo, Paolo Soda, Matteo Tortora, Alessandro Bria, Rosa Sicilia.

We combine structured EHR data, clinical notes, and imaging features to improve risk prediction performance.


⚙️ Installation

git clone https://github.com/arco-group/INSPECT-CS.git
cd INSPECT-CS
pip install -r requirements.txt

🚀 Usage

The project is modular: you can train unimodal models (EHR-only, Report-only) or multimodal fusion models. We use Hydra, so you can override any parameter directly from the command line.

  • Unimodal Reports:

Extract Clinical Long-former features from reports:

python src/reports/run_featurize.py

Run mortality prediction task:

python src/reports/run_classify.py task=1_month_mortality exp_name=reports_run_0 seed=42
  • Unimodal Image:

Extract ResNetV2 features from images:

python src/image/run_featurize.py

Run mortality prediction task:

python src/image/run_classify.py model=model_1d dataset=stanford_featurized \
    dataset.csv_path=/mimer/NOBACKUP/groups/naiss2023-6-336/multimodal_os/PE-Insight/data/folds/unimodal_image/1_month_mortality.csv \
    dataset.target=1_month_mortality \
    dataset.pretrain_args.model_type=resnetv2_101_ct \
    dataset.pretrain_args.channel_type=window \
    dataset.feature_size=768 \
    dataset.num_slices=250 \
    model.aggregation=attention+max \
    model.seq_encoder.rnn_type=GRU \
    model.seq_encoder.bidirectional=true \
    model.seq_encoder.num_layers=1 \
    model.seq_encoder.hidden_size=128 \
    model.seq_encoder.dropout_prob=0.25 \
    dataset.weighted_sample=true \
    trainer.max_epochs=50 \
    lr=0.001 \
    trainer.seed=$seed \
    n_gpus=$n_gpus \
    trainer.strategy=ddp \
    dataset.batch_size=128 \
    trainer.num_workers=1 \
    dataset.num_slices=250 
  • Unimodal EHR-GBM:

Extract labels and features:

python src/ehr/1_csv_to_database.py --path_to_input /data/ehr/omop --path_to_target /data/ehr/output/inspect_femr_extract --athena_download /data/ehr/athena/ontology.pkl --num_threads 4

python src/ehr/2_generate_labels_and_features.py --path_to_cohort /data/cohort_0.2.0_master_file_anon.csv --path_to_database /data/ehr/output/inspect_femr_extract --path_to_output_dir /data/ehr/output/labels_and_features/1_month_mortality --labeling_function 1_month_mortality --num_threads 4

python src/ehr/filter_labeled_patients.py

Run mortality prediction task:

python 3_train_gbm.py --path_to_cohort /data/ehr/output/labels_and_features/1_month_mortality/filtered_cohort.csv --path_to_database /data/ehr/output/inspect_femr_extract --path_to_output_dir /data/ehr/output/labels_and_features/gbm_models --path_to_label_features /data/ehr/output/labels_and_features/1_month_mortality --num_threads 20

Do TruncatedSVD:

python TruncatedSVD.py
  • Unimodal EHR-AE:

Run mortality prediction task:

python src/ehr/run_classify.py
  • Late Fusion:

Run mortality prediction task:

python src/late/average_probs.py
  • Early Fusion:

Run mortality prediction task:

python src/multi/run_classify.py \
    task=1_month_mortality \
    exp_name=1_month_mortality_early_ehr1_image_report_0 \
    dataset.target=1_month_mortality \
    data.weighted_sample=true \
    trainer.epochs=50 \
    trainer.learning_rate=0.001 \
    trainer.alpha=0.0 \
    seed=0 \
    trainer.n_gpus=1 \
    trainer.strategy=ddp \
    trainer.batch_size=128 \
    model.fusion.add_contrast=false \
    model.name=early \
    dataset.num_slices=250 \
    model.fusion.fusion_method=concat \
    modalities="['image', 'report', 'ehr']"\
    model.ehr_size=128 \
  • Cross Fusion:

Run mortality prediction task:

python src/multi/run_classify.py \
    task=1_month_mortality \
    exp_name=1_month_mortality_cross_ehr1_image_report_0 \
    dataset.target=1_month_mortality \
    data.weighted_sample=true \
    trainer.epochs=50 \
    trainer.learning_rate=0.001 \
    trainer.alpha=0.5 \
    seed=0 \
    trainer.n_gpus=1 \
    trainer.strategy=ddp \
    trainer.batch_size=128 \
    model.fusion.add_contrast=true \
    model.name=cross \
    dataset.num_slices=250 \
    model.fusion.fusion_method=concat \
    modalities="['report', 'image', 'ehr']" \
    model.ehr_size=128 \
  • Armour Fusion:

Run mortality prediction task:

python src/multi/run_classify.py \
    task=1_month_mortality \
    exp_name=1_month_mortality_armour_ehr1_image_report_0 \
    dataset.target=1_month_mortality \
    data.weighted_sample=true \
    trainer.epochs=50 \
    trainer.learning_rate=0.001 \
    trainer.alpha=0.5 \
    seed=0 \
    trainer.n_gpus=1 \
    trainer.strategy=ddp \
    trainer.batch_size=128 \
    model.fusion.add_contrast=true \
    model.name=armour \
    dataset.num_slices=250 \
    model.fusion.fusion_method=concat \
    modalities="['report', 'image', 'ehr']" \
    model.ehr_size=128 \

🏗 Model Architecture

The framework integrates three distinct clinical data modalities using specialized encoders and various fusion strategies to optimize prognostic accuracy.

Modality Encoders

  • CT Imaging: Slices are processed using a ResNetV2-101 backbone (pretrained with BigTransfer). Slice-level features are aggregated via bidirectional GRU and a Hybrid Attention-and-Max Pooling mechanism.
  • Radiology Reports: Encoded using Clinical-Longformer to handle long-form clinical text. It employs a two-level hierarchical attention mechanism (token-level and sentence-level) to generate a 768-dimensional report embedding.
  • Structured EHR: Processed through a Supervised Autoencoder (EHR-AE) with two layers to learn task-adaptive representations. For tree-based baselines, a LightGBM model is also supported. Proposed Method

Fusion Strategies

  1. Late Fusion (MEAN): A robust strategy that averages the predicted probabilities from independent unimodal models. This approach demonstrated the most stable and highest performance (MCC) across different time horizons. Proposed Method
  2. Early Fusion: Features from all three modalities are concatenated into a single vector before being passed to a Multi-Layer Perceptron (MLP) classifier. Proposed Method
  3. Intermediate Fusion:
    • ARMOUR: Employs cross-attention and contrastive alignment to ensure robustness against missing modalities. Proposed Method
    • CROSS: Uses a hierarchy of Multi-Head Cross-Attention (MHCA) blocks to model complex inter-modality interactions.ù Proposed Method

📊 Dataset & Preprocessing

The study utilizes the INSPECT dataset, the first large-scale, public multimodal cohort for PE.

Dataset Statistics

  • Scope: 23,248 CTPA studies from 19,402 unique patients.
  • Targets: All-cause mortality at 1-month, 6-month, and 12-month intervals.
  • Splitting: Strict patient-level splits (train/val/test) are implemented to prevent data leakage.

Preprocessing Pipelines

  • CT Imaging:
    • Intensity values converted to Hounsfield Units (HU).
    • Three standard clinical windows (Lung, PE, and Mediastinum) are applied and stacked into 3-channel images.
    • Slices are resized to 256×256 and center-cropped to 224×224.
  • Radiology Reports:
    • Text is segmented into sentences using a custom clinical-aware algorithm.
    • Tokenization is performed using the Clinical-Longformer tokenizer.
  • Structured EHR:
    • Data represented as a sparse count matrix of clinical codes (ICD-10, Labs, Medications) prior to the scan date.
    • Dimensionality is reduced using Truncated SVD (TSVD) to 128 dimensions or via the task-specific Supervised Autoencoder.

🔢 Results

Our experiments evaluate the prognostic performance across three time horizons: 1-month, 6-month, and 12-month mortality. The project compares unimodal baselines against various fusion strategies.

1. Performance Summary

The table below illustrates the predictive performance (Mean ± SD) across 5-fold cross-validation.

The following table compares the performance of our best unimodal baselines against different multimodal fusion architectures.

Categoria Modello / Configurazione 1-Mese 6-Mesi 12-Mesi
Unimodal Best Unimodal 0.269 ± .013 0.367 ± .072 0.454 ± .023
Early Fusion EHR-AE, Report 0.296 ± .014 0.371 ± .018 0.407 ± .009
EHR-AE, Image 0.302 ± .011 0.380 ± .016 0.405 ± .010
EHR-TSVD, Report 0.280 ± .027 0.392 ± .006 0.419 ± .010
EHR-TSVD, Image 0.274 ± .013 0.376 ± .011 0.390 ± .017
Report, Image 0.244 ± .024 0.350 ± .016 0.366 ± .012
EHR-AE, Report, Image 0.302 ± .008 0.378 ± .014 0.398 ± .007
EHR-TSVD, Report, Image 0.293 ± .009 0.393 ± .008 0.426 ± .008
Late Fusion Reports, EHR-AE 0.286 ± .015 0.388 ± .019 0.424 ± .005
Reports, EHR-GBM 0.399 ± .050 0.472 ± .011 0.494 ± .008
Image, EHR-AE 0.297 ± .017 0.395 ± .016 0.426 ± .004
Image, EHR-GBM 0.374 ± .025 0.467 ± .031 0.497 ± .011
Reports, Image 0.275 ± .015 0.375 ± .011 0.398 ± .012
Reports, Image, EHR-AE 0.331 ± .014 0.409 ± .014 0.444 ± .008
Reports, Image, EHR-GBM 0.362 ± .016 0.479 ± .011 0.488 ± .002
Cross Fusion EHR-TSVD → Image → Report 0.240 ± .023 0.321 ± .017 0.351 ± .028
EHR-AE → Image → Report 0.227 ± .011 0.327 ± .019 0.404 ± .022
EHR-TSVD → Report 0.247 ± .021 0.350 ± .028 0.377 ± .035
Image → Report → EHR-AE 0.305 ± .007 0.372 ± .023 0.397 ± .010
Armour Fusion EHR-TSVD, Image, Report 0.239 ± .027 0.360 ± .030 0.391 ± .024
EHR-AE, Image, Report 0.284 ± .023 0.372 ± .015 0.420 ± .012

2. Key Insights

  • Multimodal Advantage: Integrating radiology reports with structured EHR data consistently improves the Matthews Correlation Coefficient (MCC), especially in long-term prognosis (12 months).
  • Fusion Impact: Late fusion strategies (averaging predictions) often yield more stable results compared to early concatenation in high-dimensional sparse EHR settings.

🎓 Citation

If you use this code, please cite our work:

@article{paolomultimodal,
  title={Multimodal Clinical Data Integration for Prognosis of Pulmonary Embolism: A Comparative Study},
  author={Paolo, Domenico and Soda, Paolo and Tortora, Matteo and Bria, Alessandro and Sicilia, Rosa}
}

📜 License

This project is licensed. Please review the LICENSE file for more information.

About

This is a code implemention of the framework proposed in the paper "Multimodal Clinical Data Integration for Prognosis of Pulmonary Embolism: A Comparative Study".

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors