I asked claude.ai to review the PyRenew codebase and the elements of PyRenew currently used by pyrenew-hew, with an eye towards adding the observation process, latent infections process, and model builder components developed in the repo https://github.com/cdcent/cfa-pyrenew-hierarchical (which is really just a staging repo for things to be added to PyRenew). This is the response to my query.
PyRenew Redesign Proposal
Executive Summary
Based on analysis of PyRenew, cfa-pyrenew-hierarchical, and pyrenew-hew, the core value of PyRenew lies in:
- RandomVariable metaclass - foundational abstraction
- convolve.py utilities - essential for renewal math
- Transformation system - clean parameter constraints
- Time utilities - date/MMWR week handling
Much of the current API is either unused, superseded by newer patterns, or overly specialized.
Part 1: Modules & Classes to Simplify or Remove
Tier 1: DEPRECATE (superseded by new observation/latent patterns)
| Module | Class | Reason |
|---|---|---|
| pyrenew/model/ | RtInfectionsRenewalModel | Monolithic; replaced by composable latent + observation pattern |
| pyrenew/model/ | HospitalAdmissionsModel | Specialized model that should be a composition of generic components |
| pyrenew/latent/ | HospitalAdmissions | Should be a Counts observation process, not a latent component |
| pyrenew/observation/ | PoissonObservation | Replaced by Counts + PoissonNoise |
| pyrenew/observation/ | NegativeBinomialObservation | Replaced by Counts + NegativeBinomialNoise |
Tier 2: SIMPLIFY (overly complex or redundant)
| Module | Class | Issue |
|---|---|---|
| pyrenew/process/ | RtPeriodicDiffARProcess | Too specialized; should be composed from simpler pieces |
| pyrenew/process/ | RtWeeklyDiffARProcess | Convenience wrapper that hides composition pattern |
| pyrenew/latent/ | InfectionInitializationProcess + 3 methods | Complex abstraction; cfa-pyrenew-hierarchical uses simpler prevalence-based init |
| pyrenew/deterministic/ | DeterministicPMF | Keep but consider merging with DeterministicVariable |
Tier 3: KEEP (core value)
| Module | Class/Function | Why Essential |
|---|---|---|
| pyrenew/metaclass.py | RandomVariable | Foundation of entire system |
| pyrenew/metaclass.py | Model | MCMC integration layer |
| pyrenew/convolve.py | new_convolve_scanner, compute_delay_ascertained_incidence | Core renewal math |
| pyrenew/randomvariable/ | DistributionalVariable, TransformedVariable | Heavily used wrappers |
| pyrenew/randomvariable/ | Hierarchical priors (HierarchicalNormalPrior, etc.) | New, well-designed |
| pyrenew/process/ | ARProcess, DifferencedProcess, IIDRandomSequence | Building blocks |
| pyrenew/process/ | RandomWalk, PeriodicEffect | Common patterns |
| pyrenew/latent/ | Infections, InfectionsWithFeedback | Core renewal equation |
| pyrenew/observation/ | BaseObservationProcess, Counts, CountsBySite, Measurements | New generic framework |
| pyrenew/observation/ | Noise models (PoissonNoise, NegativeBinomialNoise, HierarchicalNormalNoise) | Composable noise |
| pyrenew/transformation/ | All | Clean, useful |
| pyrenew/time.py | All | Essential utilities |
| pyrenew/distutil.py | PMF validation | Essential utilities |
| pyrenew/datasets/ | Reference distributions | Useful defaults |
Part 2: Staged Redesign Plan
Stage 1: Consolidate Observation Framework
Goal: Make the new generic observation pattern the primary API
- Mark PoissonObservation and NegativeBinomialObservation as deprecated
- Update all examples to use Counts + noise model pattern
- Move HospitalAdmissions out of latent/ (it's an observation, not latent)
- Ensure BaseObservationProcess has clear documentation
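To make the "Counts + noise model" pattern concrete, here is a minimal pure-Python sketch of the idea. It does not use PyRenew, JAX, or NumPyro, and every name in it (`Noise`, `PoissonNoiseSketch`, `NegBinNoiseSketch`, `CountsSketch`, `ascertainment_rate`, `delay_pmf`) is a hypothetical stand-in rather than the actual API in PyRenew or cfa-pyrenew-hierarchical. It only illustrates how the expected-count calculation can stay fixed while the noise model is swapped.

```python
import math
from dataclasses import dataclass
from typing import Protocol, Sequence


class Noise(Protocol):
    """Hypothetical noise interface: log-probability of counts given means."""

    def log_prob(self, observed: Sequence[int], expected: Sequence[float]) -> float: ...


@dataclass(frozen=True)
class PoissonNoiseSketch:
    def log_prob(self, observed, expected):
        # Poisson log-pmf summed over time points (each mean must be > 0).
        return sum(
            k * math.log(mu) - mu - math.lgamma(k + 1)
            for k, mu in zip(observed, expected)
        )


@dataclass(frozen=True)
class NegBinNoiseSketch:
    concentration: float  # larger values approach Poisson

    def log_prob(self, observed, expected):
        # Negative binomial log-pmf in the mean/concentration parameterization.
        r = self.concentration
        return sum(
            math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu))
            for k, mu in zip(observed, expected)
        )


@dataclass(frozen=True)
class CountsSketch:
    """Hypothetical count observation: delay-convolved, ascertained infections."""

    ascertainment_rate: float
    delay_pmf: Sequence[float]
    noise: Noise

    def expected(self, infections):
        out = []
        for t in range(len(infections)):
            # Sum infections from earlier days, weighted by the delay PMF.
            delayed = sum(
                p * infections[t - d]
                for d, p in enumerate(self.delay_pmf)
                if t - d >= 0
            )
            out.append(self.ascertainment_rate * delayed)
        return out

    def log_prob(self, observed, infections):
        return self.noise.log_prob(observed, self.expected(infections))


infections = [100.0, 200.0, 400.0]
poisson_obs = CountsSketch(0.1, [0.5, 0.5], PoissonNoiseSketch())
negbin_obs = CountsSketch(0.1, [0.5, 0.5], NegBinNoiseSketch(concentration=10.0))
```

Both objects share the same `expected()` values for a given infection series; only the likelihood differs, which is the separation the deprecation of the combined observation classes is meant to achieve.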
Stage 2: Integrate Latent Infection Processes from cfa-pyrenew-hierarchical
Goal: Add hierarchical and partitioned infection patterns
- Add HierarchicalInfections - multi-subpopulation renewal with Rt deviations
- Add PartitionedInfections - single renewal with allocation to subpopulations
- Add protocol-based TemporalProcess for Rt dynamics (RandomWalk, AR1, DifferencedAR1)
- Standardize four-tuple output: (infections_juris, infections_all, infections_obs, infections_unobs)
Stage 3: Add Model Composition Layer
Goal: Make multi-signal model building easy
- Add ModelBuilder that:
  - Automatically computes n_initialization_points from all components
  - Routes infections to observations based on infection_resolution()
  - Validates component compatibility at build time
- Add MultiSignalModel that orchestrates latent + multiple observations
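The three builder responsibilities listed above could look roughly like the following sketch. It is pure Python and entirely hypothetical: `ObservationSpec`, `ModelBuilderSketch`, `lookback_days`, the resolution labels `"jurisdiction"` and `"subpopulation"`, and the rule that the initialization length is the maximum over components are all assumptions made for illustration, not the design in cfa-pyrenew-hierarchical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObservationSpec:
    """Hypothetical stand-in for the metadata an observation process exposes."""

    name: str
    lookback_days: int  # history needed before the first observation (assumed)
    resolution: str     # which infection series this observation consumes

    def infection_resolution(self) -> str:
        return self.resolution


@dataclass
class ModelBuilderSketch:
    generation_interval_length: int
    supported_resolutions: tuple = ("jurisdiction", "subpopulation")
    observations: list = field(default_factory=list)

    def add_observation(self, obs: ObservationSpec) -> "ModelBuilderSketch":
        # Validate at build time, so mistakes surface before any sampling.
        if obs.infection_resolution() not in self.supported_resolutions:
            raise ValueError(
                f"Observation '{obs.name}' requests resolution "
                f"'{obs.infection_resolution()}', but the latent process only "
                f"provides {self.supported_resolutions}."
            )
        if any(existing.name == obs.name for existing in self.observations):
            raise ValueError(f"Duplicate observation name '{obs.name}'.")
        self.observations.append(obs)
        return self

    def n_initialization_points(self) -> int:
        # Assumed rule: cover the longest history any component needs.
        lookbacks = [o.lookback_days for o in self.observations]
        return max([self.generation_interval_length] + lookbacks)

    def route(self, infections_by_resolution: dict) -> dict:
        # Hand each observation the infection series at its own resolution.
        return {
            o.name: infections_by_resolution[o.infection_resolution()]
            for o in self.observations
        }


builder = (
    ModelBuilderSketch(generation_interval_length=7)
    .add_observation(ObservationSpec("hospital", 14, "jurisdiction"))
    .add_observation(ObservationSpec("wastewater", 3, "subpopulation"))
)
n_init = builder.n_initialization_points()
routed = builder.route({"jurisdiction": [1.0, 2.0], "subpopulation": [[1.0], [2.0]]})
```

Keeping the checks inside `add_observation` means an incompatible component fails when it is added, with a message naming the offending observation, rather than deep inside a sampling trace.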
Stage 4: Deprecate Monolithic Models
Goal: Remove pre-composed models that hide composition
- Deprecate RtInfectionsRenewalModel with migration guide
- Deprecate HospitalAdmissionsModel with migration guide
- Keep for 1-2 versions with deprecation warnings
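One conventional way to keep a class available while steering users away from it is Python's standard `warnings` module. The sketch below uses hypothetical class names (`ComposedModelSketch`, `LegacyModelSketch`) in place of the real PyRenew models, and the warning text is only a placeholder for whatever the migration guide ends up recommending.

```python
import warnings


class ComposedModelSketch:
    """Hypothetical replacement built from latent + observation components."""

    def __init__(self, latent, observation):
        self.latent = latent
        self.observation = observation


class LegacyModelSketch(ComposedModelSketch):
    """Hypothetical stand-in for a monolithic model kept for 1-2 versions."""

    def __init__(self, latent, observation):
        warnings.warn(
            "LegacyModelSketch is deprecated and will be removed in a future "
            "release; compose latent and observation components instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        super().__init__(latent, observation)
```

Because the legacy class delegates to the composed one, existing code keeps working during the deprecation window, and the warning points at the user's own call site.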
Stage 5: Simplify Process Classes
Goal: Remove over-specialized Rt processes
- Deprecate RtPeriodicDiffARProcess and RtWeeklyDiffARProcess
- Show composition pattern: DifferencedProcess(ARProcess(...)) + PeriodicEffect(...)
- Simplify infection initialization to prevalence-based approach
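The composition pattern named above can be illustrated without any PyRenew classes. The sketch below uses plain functions with hypothetical names (`ar1`, `integrate`, `periodic`, `log_rt_sketch`) and takes the noise as an explicit input so it stays deterministic. It reads the "+" in the pattern as an additive periodic effect on the log scale, which is an assumption; the existing Rt classes may combine the pieces differently.

```python
from itertools import accumulate


def ar1(noise, autoreg, init=0.0):
    """AR(1) recursion: x_t = autoreg * x_{t-1} + noise_t."""
    values, prev = [], init
    for eps in noise:
        prev = autoreg * prev + eps
        values.append(prev)
    return values


def integrate(diffs, start):
    """Undo first differencing: running sum of the differences from `start`."""
    return list(accumulate(diffs, initial=start))


def periodic(effect, n):
    """Repeat a fixed-length effect vector out to n time points."""
    return [effect[t % len(effect)] for t in range(n)]


def log_rt_sketch(noise, autoreg, log_rt0, effect):
    # The differences of log Rt follow an AR(1); integrating recovers the trend.
    trend = integrate(ar1(noise, autoreg), log_rt0)
    # Add the periodic effect on the log scale (assumed additive).
    return [a + b for a, b in zip(trend, periodic(effect, len(trend)))]


trajectory = log_rt_sketch(
    noise=[0.1, 0.0, 0.0], autoreg=0.5, log_rt0=0.0, effect=[0.2, -0.2]
)
```

Each function does one job, so a user who wants a random walk instead of an AR(1) on the differences, or no periodic effect at all, swaps or drops a single piece rather than needing a new specialized class.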
Part 3: Tutorial Recommendations
DROP (no longer relevant or duplicated)
| Tutorial | Reason |
|---|---|
| extending_pyrenew.md | Duplicates custom_randomvariables.md |
| extending_pyrenew-gfm.md | Same content, different format |
KEEP & UPDATE
| Tutorial | Updates Needed |
|---|---|
| getting_started.md | Good foundation; add forward reference to generic observation framework |
| basic_renewal_model.md | Update to show composition pattern instead of RtInfectionsRenewalModel |
| custom_randomvariables.md | Good; rename to "Extending PyRenew" after dropping duplicate |
| observation_processes_counts.md | Excellent; already aligned with new patterns |
| observation_processes_measurements.md | Excellent; shows Wastewater subclass pattern |
| day_of_the_week.md | Keep; practical feature |
| periodic_effects.md | Keep; useful for seasonality |
UPDATE SIGNIFICANTLY
| Tutorial | Changes |
|---|---|
| hospital_admissions_model.md | Rewrite to use Counts observation + generic latent, not HospitalAdmissionsModel |
ADD (missing critical content)
| New Tutorial | Content |
|---|---|
| Multi-Signal Models | How to combine hospital + wastewater + ED in one model; use ModelBuilder pattern |
| Hierarchical Infections | Multi-subpopulation modeling with HierarchicalInfections; partial pooling |
| Semi-Observed Models | Wastewater covers part of population; unobserved subpopulation inference |
Part 4: Missing Documentation
README Gaps
- No clear "when to use PyRenew" - needs positioning vs. EpiNow2, epidemia, etc.
- No quick-start code example - just links to tutorials
- No general architecture diagram - the existing mermaid chart is too specific (HospitalAdmissions only)
- No component catalog - hard to discover what's available
- No installation troubleshooting for JAX/NumPyro issues
API Documentation Gaps
- No docstrings on many classes - especially newer observation processes
- No type hints in many places
- No usage examples in docstrings
- Inconsistent validate() documentation - when is it called? what should it check?
Conceptual Documentation Gaps
- No "how PyRenew thinks" guide - the layer model (latent → observation → composition)
- No glossary - terms like "ascertainment", "generation interval", "infection resolution"
- No decision tree - "which component should I use for X?"
- No performance guide - when to use scan vs vectorized, JAX tracing tips
Part 5: API Clarity Improvements
Interface Simplification
- Standardize return types: All latent processes should return named tuples, not raw arrays
- Standardize sample() signatures: Consistent parameter names across similar components
- Add `__repr__` methods: Make debugging easier
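As a small illustration of the named-tuple recommendation, the sketch below wraps the four-tuple from Stage 2 in a `typing.NamedTuple`. The class name `LatentInfectionsOutput` and the function `sample_latent_sketch` are hypothetical, and the values are placeholders with no epidemiological meaning; only the field names come from the proposal.

```python
from typing import NamedTuple, Sequence


class LatentInfectionsOutput(NamedTuple):
    """Hypothetical return type; field names follow the Stage 2 four-tuple."""

    infections_juris: Sequence[float]
    infections_all: Sequence[float]
    infections_obs: Sequence[float]
    infections_unobs: Sequence[float]


def sample_latent_sketch() -> LatentInfectionsOutput:
    # Placeholder values only; a real latent process would sample these.
    return LatentInfectionsOutput(
        infections_juris=[10.0, 12.0],
        infections_all=[10.0, 12.0],
        infections_obs=[6.0, 7.0],
        infections_unobs=[4.0, 5.0],
    )


out = sample_latent_sketch()
obs_by_name = out.infections_obs    # access by field name
juris, everything, obs, unobs = out  # positional unpacking still works
summary = repr(out)                  # readable repr comes for free
```

A named tuple keeps positional unpacking working for existing callers while giving new code self-documenting field access and a readable `repr`, which also covers part of the debugging concern.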
Error Messages
- Add validation at construction time, not just sample time
- Clear error messages when components are incompatible
- Warnings when using deprecated patterns
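Construction-time validation with clear messages might look like the following sketch, which checks a delay PMF in `__post_init__`. The class `DelayPMFSketch` and its fields are hypothetical, and the tolerance is an arbitrary choice; PyRenew's own PMF validation in distutil.py may apply different rules.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DelayPMFSketch:
    """Hypothetical component that validates its PMF when it is constructed."""

    name: str
    probs: Tuple[float, ...]

    def __post_init__(self):
        # Fail here, at construction, rather than later at sample time.
        if len(self.probs) == 0:
            raise ValueError(f"{self.name}: PMF must have at least one entry.")
        if any(p < 0 for p in self.probs):
            raise ValueError(f"{self.name}: PMF entries must be non-negative.")
        total = sum(self.probs)
        if not math.isclose(total, 1.0, abs_tol=1e-6):
            raise ValueError(
                f"{self.name}: PMF must sum to 1, but sums to {total:.6f}."
            )


valid = DelayPMFSketch("hospital_delay", (0.2, 0.5, 0.3))
```

Including the component name and the offending value in the message lets a user with several delay distributions in one model see immediately which one is wrong.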
Part 6: Recommended Package Structure (Post-Refactor)
pyrenew/
├── core/
│ ├── metaclass.py # RandomVariable, Model
│ ├── convolve.py # Convolution utilities
│ └── math.py # Mathematical utilities
├── variables/
│ ├── distributional.py # DistributionalVariable
│ ├── transformed.py # TransformedVariable
│ ├── deterministic.py # DeterministicVariable, DeterministicPMF
│ └── hierarchical.py # HierarchicalNormalPrior, etc.
├── processes/
│ ├── temporal.py # TemporalProcess protocol, AR, RandomWalk
│ ├── periodic.py # PeriodicEffect, DayOfWeekEffect
│ └── differenced.py # DifferencedProcess
├── latent/
│ ├── infections.py # Infections, InfectionsWithFeedback
│ ├── hierarchical.py # HierarchicalInfections
│ └── partitioned.py # PartitionedInfections
├── observations/
│ ├── base.py # BaseObservationProcess
│ ├── counts.py # Counts, CountsBySite
│ ├── measurements.py # Measurements (base for continuous)
│ ├── wastewater.py # Wastewater-specific
│ └── noise.py # All noise models
├── model/
│ ├── builder.py # ModelBuilder
│ └── multisignal.py # MultiSignalModel
├── utils/
│ ├── time.py # Date/time utilities
│ ├── arrays.py # Array utilities
│ └── validation.py # PMF validation, etc.
├── transforms/ # Transformations
├── datasets/ # Reference distributions
└── deprecated/ # Old models with deprecation warnings
    ├── rtinfections.py # RtInfectionsRenewalModel
    └── admissions.py # HospitalAdmissionsModel
Summary: Priority Actions
- Immediate: Consolidate duplicate tutorials, update README with architecture overview
- Short-term: Add deprecation warnings to old observation classes; document new Counts/Measurements as primary
- Medium-term: Integrate HierarchicalInfections and PartitionedInfections from cfa-pyrenew-hierarchical
- Medium-term: Add ModelBuilder for multi-signal model composition
- Long-term: Deprecate monolithic models; reorganize package structure