Fix OWL generation to include nested schema modules #52
The merge script used glob('*.yaml') which only found schemas one
level deep within domain directories, missing nested schemas like
energy/nuclear/*.yaml and medical/pediatric_oncology/*.yaml (18
schemas total). Changed to rglob('*.yaml') to recurse.
Regenerated OWL now includes all Genesis enums (AquiferType, etc.)
and other previously missing nested schemas.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude finished @cmungall's task
PR Review Complete ✅
I've thoroughly reviewed PR #52: "Fix OWL generation to include nested schema modules". Here's my comprehensive feedback:
Code Quality & Best Practices ✅
The Fix is Excellent:

    # Before: Only searched one level deep
    for yaml_file in domain_dir.glob('*.yaml'):
    # After: Recursively searches all subdirectories
    for yaml_file in domain_dir.rglob('*.yaml'):

Code Quality Assessment:
Impact Analysis 📊
Before vs After Results:
Verified Missing Schemas Found:
Performance Considerations ✅
Performance Impact: Minimal
Security Considerations ✅
No Security Concerns:
Test Coverage Gaps
Pull request overview
This PR fixes incomplete OWL/value set enum generation by ensuring the merge-hierarchy script discovers YAML schema modules nested under domain subdirectories (e.g., energy/nuclear/*, medical/pediatric_oncology/**), rather than only scanning one level deep.
Changes:
- Update schema discovery in `collect_all_schemas()` from `glob('*.yaml')` to recursive `rglob('*.yaml')` for domain directories.
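
To make the difference concrete, here is a minimal sketch comparing `Path.glob` and `Path.rglob` on a throwaway directory. The helper names `discover` and `build_demo_tree` and the file names are hypothetical and are not taken from `collect_all_schemas()`; only the standard-library `pathlib` behavior is being illustrated.

```python
import tempfile
from pathlib import Path


def discover(domain_dir: Path, recursive: bool) -> list[str]:
    """Return sorted YAML paths relative to domain_dir (hypothetical helper)."""
    # glob('*.yaml') matches only direct children; rglob('*.yaml') also
    # descends into every subdirectory.
    matcher = domain_dir.rglob if recursive else domain_dir.glob
    return sorted(p.relative_to(domain_dir).as_posix() for p in matcher("*.yaml"))


def build_demo_tree(root: Path) -> Path:
    """Create an illustrative layout: one top-level schema, one nested schema."""
    domain = root / "energy"
    (domain / "nuclear").mkdir(parents=True)
    (domain / "energy.yaml").write_text("name: energy\n")
    (domain / "nuclear" / "fusion.yaml").write_text("name: fusion\n")
    return domain


with tempfile.TemporaryDirectory() as tmp:
    domain = build_demo_tree(Path(tmp))
    shallow = discover(domain, recursive=False)
    deep = discover(domain, recursive=True)

print(shallow)  # ['energy.yaml']
print(deep)     # ['energy.yaml', 'nuclear/fusion.yaml']
```

The nested file only shows up in the recursive result, which matches the symptom described in this PR.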
Summary
The merge-hierarchy script (`scripts/merge_enums_hierarchy.py`) used `glob('*.yaml')` to find schemas within domain directories, which only searches one level deep. This missed 18 schemas in nested subdirectories:
- `energy/nuclear/*.yaml` (fusion, nuclear_cleanup, nuclear_forensics, etc.)
- `energy/renewable/*.yaml` (hydrogen, geothermal, bioenergy)
- `medical/oncology/*.yaml`, `medical/pediatric_oncology/**/*.yaml`

Changed to `rglob('*.yaml')` to recurse into subdirectories.

Before: 675 concrete enums in merged hierarchy
After: 693 concrete enums

The regenerated OWL now includes all Genesis enums (AquiferType, FusionConfinementType, etc.) that were previously missing from https://w3id.org/valuesets/valuesets.owl.ttl.

Test plan
- `just merge-hierarchy` now finds 693 concrete enums (was 675)
- `grep AquiferType project/owl/valuesets.owl.ttl` returns results
- `curl -L -s https://w3id.org/valuesets/valuesets.owl.ttl | grep AquiferType` returns content

🤖 Generated with Claude Code
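
The checks in the test plan are manual. As a rough sketch of how the same concern could be guarded automatically, the snippet below lists YAML files that are reachable only by recursing, i.e. the ones a one-level `glob` would skip. The function name `nested_only` and the demo file names are assumptions for illustration; nothing here is part of `scripts/merge_enums_hierarchy.py`.

```python
import tempfile
from pathlib import Path


def nested_only(domain_dir: Path, pattern: str = "*.yaml") -> list[str]:
    """List files matched by rglob but not by a one-level glob (hypothetical)."""
    top_level = set(domain_dir.glob(pattern))
    return sorted(
        p.relative_to(domain_dir).as_posix()
        for p in domain_dir.rglob(pattern)
        if p not in top_level
    )


with tempfile.TemporaryDirectory() as tmp:
    # Illustrative layout only: one top-level file and files nested one and
    # two levels deep.
    domain = Path(tmp) / "medical"
    (domain / "pediatric_oncology" / "sub").mkdir(parents=True)
    (domain / "top.yaml").write_text("name: top\n")
    (domain / "pediatric_oncology" / "a.yaml").write_text("name: a\n")
    (domain / "pediatric_oncology" / "sub" / "b.yaml").write_text("name: b\n")
    missed = nested_only(domain)

print(missed)  # ['pediatric_oncology/a.yaml', 'pediatric_oncology/sub/b.yaml']
```

A test could assert that every path returned by such a helper also appears in the merged hierarchy, so a regression to non-recursive discovery would fail quickly.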