Summary
Create the calibration_controls database table and CalibrationControl SQLAlchemy ORM model, add disease and controls_not_phi fields to score_calibrations, and write the Alembic migration covering all these changes.
Background
Calibrations are derived from empirical controls — variants in the score set with independently known clinical significance. Currently this relationship is not stored in MaveDB, preventing clinicians from auditing the basis of a calibration. This issue creates the storage layer for controls.
Two new fields on ScoreCalibration are part of this schema change:
- disease: optional free-text disease/disorder label providing clinical context at the calibration level
- controls_not_phi: boolean storing the submitter's affirmative assertion that control data is not PHI
Proposed Behavior
New table: calibration_controls
| Column | Type | Notes |
| --- | --- | --- |
| id | INTEGER | PRIMARY KEY |
| calibration_id | INTEGER | FK → score_calibrations.id ON DELETE CASCADE, NOT NULL |
| variant_id | INTEGER | FK → variants.id, NOT NULL |
| clinical_status | ENUM | CalibrationControlStatus (pathogenic or benign), NOT NULL |
| created_by_id | INTEGER | FK → users.id, NOT NULL |
| modified_by_id | INTEGER | FK → users.id, NOT NULL |
| creation_date | DATE | NOT NULL, default today |
| modification_date | DATE | NOT NULL, default today, onupdate today |
UNIQUE constraint on (calibration_id, variant_id) — no duplicate controls within a calibration.
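The constraints in the table can be tried out with a small, hedged sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (the real schema is created by the Alembic migration against the project's database), the parent tables are reduced to a bare id column, and a CHECK constraint substitutes for the ENUM type, which SQLite lacks. The onupdate behaviour of modification_date is an ORM-level default and is not modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE variants (id INTEGER PRIMARY KEY);
    CREATE TABLE score_calibrations (id INTEGER PRIMARY KEY);
    CREATE TABLE calibration_controls (
        id INTEGER PRIMARY KEY,
        calibration_id INTEGER NOT NULL
            REFERENCES score_calibrations(id) ON DELETE CASCADE,
        variant_id INTEGER NOT NULL REFERENCES variants(id),
        -- Assumption: CHECK stands in for the CalibrationControlStatus ENUM.
        clinical_status TEXT NOT NULL
            CHECK (clinical_status IN ('pathogenic', 'benign')),
        created_by_id INTEGER NOT NULL REFERENCES users(id),
        modified_by_id INTEGER NOT NULL REFERENCES users(id),
        creation_date DATE NOT NULL DEFAULT CURRENT_DATE,
        modification_date DATE NOT NULL DEFAULT CURRENT_DATE,
        UNIQUE (calibration_id, variant_id)
    );
    INSERT INTO users (id) VALUES (1);
    INSERT INTO variants (id) VALUES (10), (11);
    INSERT INTO score_calibrations (id) VALUES (100);
    """
)

insert = (
    "INSERT INTO calibration_controls "
    "(calibration_id, variant_id, clinical_status, created_by_id, modified_by_id) "
    "VALUES (?, ?, ?, 1, 1)"
)
conn.execute(insert, (100, 10, "pathogenic"))
conn.execute(insert, (100, 11, "benign"))

# A second control for the same (calibration, variant) pair is rejected.
try:
    conn.execute(insert, (100, 10, "benign"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the calibration removes its controls via ON DELETE CASCADE.
conn.execute("DELETE FROM score_calibrations WHERE id = 100")
remaining = conn.execute("SELECT COUNT(*) FROM calibration_controls").fetchone()[0]
```

Here duplicate_rejected ends up True and remaining ends up 0, which are the two behaviours the UNIQUE constraint and the cascading foreign key are meant to guarantee.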
New CalibrationControl ORM model
- Located at src/mavedb/models/calibration_control.py
- Relationships:
  - calibration: Mapped[ScoreCalibration], back-populates ScoreCalibration.controls
  - variant: Mapped[Variant]
  - created_by: Mapped[User]
  - modified_by: Mapped[User]
Changes to ScoreCalibration
- Add disease = Column(String, nullable=True), an optional disease/disorder label
- Add controls_not_phi = Column(Boolean, nullable=True), where None = not yet acknowledged and True = confirmed not PHI
- Add a controls: Mapped[list[CalibrationControl]] relationship with cascade="all, delete-orphan"
Alembic migration
Single migration file covering all three changes above.
Implementation Notes
- Follow the pattern of ScoreCalibrationFunctionalClassification for the ORM model structure
- controls_not_phi uses nullable=True (three-state: None = not set, False = explicitly declined, True = confirmed); the API gates publishing on True when controls are present (see PHI gate issue)
- disease is intentionally free text with no controlled vocabulary; it is optional and provides human-readable context only
- Do not add a back-reference from Variant to CalibrationControl, consistent with the intentional omission noted in models/variant.py
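The three-state semantics of controls_not_phi can be illustrated with a short sketch. The function name and signature are hypothetical, and the real check belongs to the separate PHI gate issue. The sketch also assumes that a calibration without controls is not blocked by this particular gate.

```python
from typing import Optional


def publishing_allowed(control_count: int, controls_not_phi: Optional[bool]) -> bool:
    """Hypothetical gate: publishing requires an explicit True when controls exist."""
    if control_count == 0:
        # Assumption: with no controls there is nothing for the submitter to assert.
        return True
    # None (not set) and False (explicitly declined) both block publishing.
    return controls_not_phi is True


no_controls = publishing_allowed(0, None)
not_acknowledged = publishing_allowed(3, None)
declined = publishing_allowed(3, False)
confirmed = publishing_allowed(3, True)
```

Keeping None distinct from False lets the API tell "the submitter has not been asked yet" apart from "the submitter declined", even though both block publishing.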
Clarification: Distinction from ScoreCalibrationFunctionalClassification.variants
ScoreCalibrationFunctionalClassification already has a many-to-many with Variant (via score_calibration_functional_classification_variants). That represents bin membership — "this variant's score places it in this functional range."
CalibrationControl is a distinct concept: ground truth controls — "this variant has independently known clinical significance and was used as empirical evidence when deriving the calibration thresholds."
A control variant will often also appear in a bin (its score should land in the corresponding range), but the two relationships serve different purposes and should remain separate. Add docstrings to both CalibrationControl and ScoreCalibrationFunctionalClassification clarifying this distinction for future contributors.
Acceptance Criteria
- calibration_controls table is created with the described columns and constraints
- CalibrationControl ORM model is importable from mavedb.models.calibration_control
- ScoreCalibration has disease, controls_not_phi, and controls fields
- UNIQUE constraint on (calibration_id, variant_id) is present in both ORM and migration
- calibration_id FK is implemented
- CalibrationControlStatus enum (see enum issue) is used for clinical_status