Add calibration controls to schema (ORM model, migration, disease/controls_not_phi on ScoreCalibration) #749

@bencap

Summary

Create the calibration_controls database table and CalibrationControl SQLAlchemy ORM model, add disease and controls_not_phi fields to score_calibrations, and write the Alembic migration covering all these changes.

Background

Calibrations are derived from empirical controls — variants in the score set with independently known clinical significance. Currently this relationship is not stored in MaveDB, preventing clinicians from auditing the basis of a calibration. This issue creates the storage layer for controls.

Two new fields on ScoreCalibration are part of this schema change:

  • disease — optional free-text disease/disorder label providing clinical context at the calibration level
  • controls_not_phi — boolean storing the submitter's affirmative assertion that control data is not PHI

Proposed Behavior

New table: calibration_controls

| Column | Type | Notes |
| --- | --- | --- |
| id | INTEGER | PRIMARY KEY |
| calibration_id | INTEGER | FK → score_calibrations.id ON DELETE CASCADE, NOT NULL |
| variant_id | INTEGER | FK → variants.id, NOT NULL |
| clinical_status | ENUM | CalibrationControlStatus (pathogenic or benign), NOT NULL |
| created_by_id | INTEGER | FK → users.id, NOT NULL |
| modified_by_id | INTEGER | FK → users.id, NOT NULL |
| creation_date | DATE | NOT NULL, default today |
| modification_date | DATE | NOT NULL, default today, onupdate today |

UNIQUE constraint on (calibration_id, variant_id) — no duplicate controls within a calibration.
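
To illustrate how the UNIQUE constraint and ON DELETE CASCADE are expected to behave, here is a minimal sketch against an in-memory SQLite database. SQLite is used only because it ships with Python; the stub parent tables, the trimmed column list, the CHECK constraint standing in for the enum, and the helper names `build_demo_db` and `add_control` are all assumptions made for illustration, not part of MaveDB.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create stub parent tables plus a trimmed calibration_controls table."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign-key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE score_calibrations (id INTEGER PRIMARY KEY);
        CREATE TABLE variants (id INTEGER PRIMARY KEY);
        CREATE TABLE calibration_controls (
            id INTEGER PRIMARY KEY,
            calibration_id INTEGER NOT NULL
                REFERENCES score_calibrations (id) ON DELETE CASCADE,
            variant_id INTEGER NOT NULL REFERENCES variants (id),
            -- CHECK stands in for the CalibrationControlStatus enum
            clinical_status TEXT NOT NULL
                CHECK (clinical_status IN ('pathogenic', 'benign')),
            UNIQUE (calibration_id, variant_id)
        );
        INSERT INTO score_calibrations (id) VALUES (1);
        INSERT INTO variants (id) VALUES (10), (11);
        """
    )
    return conn


def add_control(conn, calibration_id, variant_id, status):
    """Insert one control row (hypothetical helper)."""
    conn.execute(
        "INSERT INTO calibration_controls "
        "(calibration_id, variant_id, clinical_status) VALUES (?, ?, ?)",
        (calibration_id, variant_id, status),
    )


conn = build_demo_db()
add_control(conn, 1, 10, "pathogenic")
add_control(conn, 1, 11, "benign")

# A second control for the same (calibration, variant) pair must be rejected.
try:
    add_control(conn, 1, 10, "benign")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the calibration should remove its controls via ON DELETE CASCADE.
conn.execute("DELETE FROM score_calibrations WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM calibration_controls").fetchone()[0]
```

Note that the same variant may still serve as a control in a different calibration, since the constraint covers the pair rather than variant_id alone.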

New CalibrationControl ORM model

  • Located at src/mavedb/models/calibration_control.py
  • Relationships:
    • calibration: Mapped[ScoreCalibration] — back-populates ScoreCalibration.controls
    • variant: Mapped[Variant]
    • created_by: Mapped[User]
    • modified_by: Mapped[User]

Changes to ScoreCalibration

  • Add disease = Column(String, nullable=True) — optional disease/disorder label
  • Add controls_not_phi = Column(Boolean, nullable=True), where None = not yet acknowledged, False = explicitly declined, and True = confirmed not PHI
  • Add controls: Mapped[list[CalibrationControl]] relationship with cascade="all, delete-orphan"

Alembic migration

A single migration file covers all of the schema changes above: the new calibration_controls table and the two new columns on score_calibrations.

Acceptance Criteria

  • calibration_controls table is created with the described columns and constraints
  • CalibrationControl ORM model is importable from mavedb.models.calibration_control
  • ScoreCalibration has disease, controls_not_phi, and controls fields
  • UNIQUE constraint on (calibration_id, variant_id) is present in both ORM and migration
  • ON DELETE CASCADE for calibration_id FK is implemented
  • Alembic migration applies cleanly on a fresh DB and on an existing DB (both upgrade and downgrade tested)
  • CalibrationControlStatus enum (see enum issue) is used for clinical_status

Implementation Notes

  • Follow the pattern of ScoreCalibrationFunctionalClassification for the ORM model structure
  • controls_not_phi uses nullable=True (three-state: None = not set, False = explicitly declined, True = confirmed) — the API gates publishing on True when controls are present (see PHI gate issue)
  • disease is intentionally free text with no controlled vocabulary — optional, provides human-readable context only
  • Do not add a back-reference from Variant to CalibrationControl, consistent with the intentional omission noted in models/variant.py
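
The three-state semantics of controls_not_phi can be made concrete with a small sketch of the publish gate. The function name and signature are hypothetical, and the rule that a calibration without controls passes the gate is an assumption inferred from "when controls are present"; the actual behavior is defined in the PHI gate issue.

```python
from typing import Optional


def phi_gate_allows_publish(controls_not_phi: Optional[bool], control_count: int) -> bool:
    """Return True when the PHI gate would not block publishing."""
    if control_count == 0:
        # Assumption: with no controls there is nothing to attest to.
        return True
    # Only an explicit True passes; None (not set) and False (declined) both block.
    return controls_not_phi is True


gate_results = {
    "no controls, unset": phi_gate_allows_publish(None, 0),
    "controls, unset": phi_gate_allows_publish(None, 3),
    "controls, declined": phi_gate_allows_publish(False, 3),
    "controls, confirmed": phi_gate_allows_publish(True, 3),
}
```

Using `is True` rather than a truthiness check keeps None and False distinct in storage while treating both as blocking at publish time.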

Clarification: Distinction from ScoreCalibrationFunctionalClassification.variants

ScoreCalibrationFunctionalClassification already has a many-to-many with Variant (via score_calibration_functional_classification_variants). That represents bin membership — "this variant's score places it in this functional range."

CalibrationControl is a distinct concept: ground truth controls — "this variant has independently known clinical significance and was used as empirical evidence when deriving the calibration thresholds."

A control variant will often also appear in a bin (its score should land in the corresponding range), but the two relationships serve different purposes and should remain separate. Add docstrings to both CalibrationControl and ScoreCalibrationFunctionalClassification clarifying this distinction for future contributors.

Metadata

Labels

  • app: backend (Task implementation touches the backend)
  • app: database (Task implementation requires database changes)
  • type: feature (New feature)
  • workstream: clinical (Task relates to clinical features)
