Paper: Development and Validation of the Intensive Documentation Index for ICU Mortality Prediction
Journal: Journal of the American Medical Informatics Association (JAMIA), 2026
Authors: Alexis M. Collier, DHA, MHA¹ · Sophia Z. Shalhout, PhD²³
Affiliations:
¹ College of Health & Wellness, University of North Georgia, Dahlonega, GA, USA
² Department of Otolaryngology–Head and Neck Surgery, Harvard Medical School, Boston, MA, USA
³ Mass Eye and Ear, Mass General Brigham, Boston, MA, USA
Companion paper (multinational validation): IDI-Multinational-Validation (npj Digital Medicine, 2026)
This repository contains the full analysis code for the Intensive Documentation Index (IDI) — a zero-burden prognostic framework that extracts temporal documentation rhythm features from nursing chartevents timestamps in the first 24 hours of ICU admission to predict in-hospital mortality.
Applied to 26,153 heart failure ICU admissions from MIMIC-IV (2008–2019), the IDI yields a modest but statistically significant improvement in mortality prediction over a baseline model of age, sex, and ICU length of stay.
| Model | AUROC (95% CI) | Calibration Slope | Brier Score |
|---|---|---|---|
| Baseline (age, sex, ICU LOS) | 0.658 (0.609–0.710) | 0.92 | 0.1091 |
| IDI-Enhanced | 0.683 (0.631–0.732) | 0.96 | 0.1080 |
ΔAUROC = +0.025 (p = 0.015, DeLong test)
Strongest predictor: idi_cv_interevent OR = 1.53 per SD (95% CI 1.35–1.74, p < 0.001)
Temporal stability: mean AUC 0.654 (SD 0.016) across leave-one-year-out cross-validation (2008–2019)
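
The leave-one-year-out procedure behind the temporal stability figure can be sketched as follows. This is a minimal, dependency-free illustration and not the code in `src/model.py`: the row keys (`year`, `x`, `died`), the `fit_predict` callable, and the pairwise AUROC helper are all assumptions made for the example.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Pairwise (Mann-Whitney) AUROC; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_year_out(rows, fit_predict):
    """Hold out each admission year in turn and score the held-out fold.

    rows: list of dicts with hypothetical keys "year", "x", "died".
    fit_predict: callable(train_rows, test_rows) -> list of risk scores,
                 one per test row (any model can be plugged in here).
    """
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row)
    results = {}
    for year in sorted(by_year):
        test = by_year[year]
        # Train on every year except the held-out one.
        train = [r for y, rs in by_year.items() if y != year for r in rs]
        scores = fit_predict(train, test)
        results[year] = auroc([r["died"] for r in test], scores)
    return results
```

The per-year values returned by `leave_one_year_out` can then be summarized with `statistics.mean` and `statistics.stdev` to give a mean and SD across held-out years. The pairwise AUROC is quadratic in fold size, so a rank-based implementation would be preferable on the full cohort.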
IDI-MIMIC-IV-Mortality/
│
├── README.md
├── LICENSE
├── requirements.txt
├── .gitignore
│
├── src/
│ ├── idi_features.py # IDI feature extraction from chartevents timestamps
│ ├── cohort_selection.py # Inclusion/exclusion criteria applied to MIMIC-IV
│ ├── model.py # Logistic regression training & temporal validation
│ ├── metrics.py # AUROC, calibration slope, Brier, DeLong test
│ └── equity_analysis.py # Subgroup AUC analysis by race/ethnicity
│
├── data/
│ └── README.md # Data access instructions (PhysioNet DUA required)
│
└── results/
    ├── figures/ # ROC curves, calibration plots, forest plots
    └── tables/ # CSV versions of manuscript tables
Raw data are NOT included in this repository due to PhysioNet Data Use Agreement restrictions.
To reproduce this analysis:
- Apply for credentialed access at PhysioNet
- Download MIMIC-IV version 2.2: https://physionet.org/content/mimiciv/2.2/
- Place the following files in data/mimic-iv/:
  - hosp/admissions.csv
  - hosp/patients.csv
  - hosp/diagnoses_icd.csv
  - icu/icustays.csv
  - icu/chartevents.csv
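
Before running the pipeline, it may help to confirm that the five files are where the scripts expect them. The helper below is a hypothetical convenience check, not part of the repository; it assumes the files are uncompressed `.csv` files under `data/mimic-iv/`, exactly as listed above.

```python
from pathlib import Path

# Relative paths taken from the file list above.
REQUIRED_FILES = [
    "hosp/admissions.csv",
    "hosp/patients.csv",
    "hosp/diagnoses_icd.csv",
    "icu/icustays.csv",
    "icu/chartevents.csv",
]


def missing_mimic_files(root="data/mimic-iv"):
    """Return the required relative paths that are absent under root."""
    base = Path(root)
    return [rel for rel in REQUIRED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    missing = missing_mimic_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required MIMIC-IV files found.")
```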
git clone https://github.com/colla00/IDI-MIMIC-IV-Mortality.git
cd IDI-MIMIC-IV-Mortality
pip install -r requirements.txt
Python version: 3.8+
Run scripts in order:
# 1. Select cohort (outputs data/cohort.csv)
python src/cohort_selection.py
# 2. Extract IDI features (outputs data/idi_features.csv)
python src/idi_features.py
# 3. Train models and run temporal validation (outputs results/)
python src/model.py
# 4. Compute performance metrics
python src/metrics.py
# 5. Run equity analysis (outputs results/figures/equity_forest.png)
python src/equity_analysis.py
Nine temporal features extracted from nursing chartevents in the first 24 ICU hours:
| Feature | Domain | Description |
|---|---|---|
| `idi_events_24h` | Volume | Total documentation events |
| `idi_events_per_hour` | Volume | Event rate per hour |
| `idi_max_gap_min` | Surveillance Gap | Maximum inter-event interval (min) |
| `idi_gap_count_60m` | Surveillance Gap | Intervals > 60 minutes |
| `idi_gap_count_120m` | Surveillance Gap | Intervals > 120 minutes |
| `idi_mean_interevent_min` | Rhythm | Mean inter-event interval (min) |
| `idi_std_interevent_min` | Rhythm | SD of inter-event intervals |
| `idi_cv_interevent` | Rhythm | Coefficient of variation (SD/mean) |
| `idi_burstiness` | Rhythm | Burstiness index B = (σ−μ)/(σ+μ) |
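
As a rough illustration of how these nine features follow from a list of timestamps, the sketch below computes them for a single ICU stay using only the standard library. It is not the implementation in `src/idi_features.py`. The function name, the use of the population SD, the fixed 24-hour denominator for the event rate, and the minimum of three events are assumptions made for the example.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def idi_features(timestamps, window_hours=24):
    """Compute the nine IDI features from one stay's event timestamps.

    Assumes timestamps are datetime objects already restricted to the
    first `window_hours` of the ICU stay.
    """
    ts = sorted(timestamps)
    if len(ts) < 3:
        raise ValueError("need at least three events to describe rhythm")
    # Inter-event intervals in minutes between consecutive events.
    gaps = [(b - a).total_seconds() / 60.0 for a, b in zip(ts, ts[1:])]
    mu = mean(gaps)
    sigma = pstdev(gaps)  # population SD: an assumption, not from the paper
    nan = float("nan")
    return {
        "idi_events_24h": len(ts),
        "idi_events_per_hour": len(ts) / window_hours,
        "idi_max_gap_min": max(gaps),
        "idi_gap_count_60m": sum(g > 60 for g in gaps),
        "idi_gap_count_120m": sum(g > 120 for g in gaps),
        "idi_mean_interevent_min": mu,
        "idi_std_interevent_min": sigma,
        "idi_cv_interevent": sigma / mu if mu > 0 else nan,
        "idi_burstiness": (sigma - mu) / (sigma + mu) if sigma + mu > 0 else nan,
    }


# Toy example: five events at minutes 0, 10, 20, 100 and 250.
start = datetime(2020, 1, 1, 8, 0)
events = [start + timedelta(minutes=m) for m in (0, 10, 20, 100, 250)]
features = idi_features(events)
print(features["idi_max_gap_min"], features["idi_gap_count_60m"])  # 150.0 2
```

By construction the burstiness index lies in [−1, 1]: values near −1 indicate evenly spaced charting, values near 0 resemble a random (Poisson-like) process, and values approaching 1 indicate clustered bursts separated by long gaps.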
Features with absolute Pearson correlation > 0.30 with ICU length of stay were removed prior to modeling to prevent reverse-causal leakage (longer stay → more documentation → spurious mortality prediction). See Methods section of the manuscript for full details.
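
The correlation screen described above could look roughly like this. It is a sketch under stated assumptions rather than the repository's code: the function names and the dict-of-lists input layout are hypothetical, and treating constant columns as uncorrelated is a choice made here for simplicity.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant column: treated as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)


def filter_by_los_correlation(features, icu_los, threshold=0.30):
    """Split features by |r| with ICU length of stay.

    features: dict mapping feature name -> list of values (one per stay).
    Returns (kept, dropped), each mapping feature name -> correlation.
    """
    kept, dropped = {}, {}
    for name, values in features.items():
        r = pearson_r(values, icu_los)
        (dropped if abs(r) > threshold else kept)[name] = r
    return kept, dropped
```

Returning the correlations alongside the names makes it easy to report which features were excluded and by how much they exceeded the threshold.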
If you use this code, please cite:
@article{collier2026idi,
title = {Development and Validation of the Intensive Documentation Index for ICU Mortality Prediction},
author = {Collier, Alexis M. and Shalhout, Sophia Z.},
journal = {Journal of the American Medical Informatics Association},
year = {2026},
doi = {[to be assigned]}
}
The IDI framework is the subject of U.S. provisional patent applications (Patent Pending):
- USPTO Application No. 63/976,293 — System and Method for Predicting ICU Mortality from Electronic Health Record Documentation Rhythm Patterns (filed February 2026)
- USPTO Application No. 63/946,187 — Clinical Decision Support System with Trust-Based Alert Prioritization and Equity Monitoring (filed December 2025)
VitaSignal LLC is the intended assignee. Licensing inquiries: info@vitasignal.ai
This research was, in part, funded by the National Institutes of Health (NIH) Agreement No. 1OT2OD032581 through the AIM-AHEAD program.
MIT License — see LICENSE file for details.