Description
Feature request
Problem
Currently, linkml-term-validator accepts only LinkML YAML schemas as input. To validate ontology term identifiers referenced in documentation (Markdown, plain text, OBO files, etc.), users must write a shim script to:
- Extract CURIEs and labels from the text using regex
- Generate a synthetic LinkML schema YAML with enum permissible values
- Create an OAK config file mapping prefixes to adapters
- Run validate-schema on the generated YAML
This is clunky compared to the sister tool linkml-reference-validator, which already has a validate text-file subcommand with --regex support.
Proposed solution
Add a validate text-file subcommand analogous to the one in linkml-reference-validator:
uvx linkml-term-validator validate text-file document.md \
--regex '@term (\S+) "([^"]*)"' \
--curie-group 1 --label-group 2 \
--config oak_config.yaml --strict -v

This would:
- Read the text file
- Extract CURIE + label pairs using the regex
- Resolve each CURIE via OAK
- Check the label matches
- Report results
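The steps above could be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: `extract_terms`, `check_labels`, and the `lookup` callable are hypothetical names, and a plain dictionary stands in for label resolution via OAK.

```python
import re
from typing import Callable, Iterator, List, Optional, Tuple

# Same pattern as in the proposed command line: group 1 = CURIE, group 2 = label.
TERM_PATTERN = r'@term (\S+) "([^"]*)"'


def extract_terms(
    text: str,
    pattern: str = TERM_PATTERN,
    curie_group: int = 1,
    label_group: int = 2,
) -> Iterator[Tuple[str, str]]:
    """Yield (curie, expected_label) pairs for every regex match in the text."""
    for match in re.finditer(pattern, text):
        yield match.group(curie_group), match.group(label_group)


def check_labels(
    text: str,
    lookup: Callable[[str], Optional[str]],
    pattern: str = TERM_PATTERN,
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (curie, expected, actual) for each term.

    `lookup` is a hypothetical stand-in for an ontology label resolver;
    it returns None when the CURIE cannot be resolved.
    """
    return [
        (curie, expected, lookup(curie))
        for curie, expected in extract_terms(text, pattern)
    ]


# Illustrative only: a dictionary replaces the real ontology lookup.
stub_labels = {"MONDO:0009282": "multiple acyl-CoA dehydrogenase deficiency"}
document = '- @term MONDO:0009282 "multiple acyl-CoA dehydrogenase deficiency"\n'

for curie, expected, actual in check_labels(document, stub_labels.get):
    status = "ok" if actual == expected else "mismatch"
    print(f"{curie}: {status}")
```

Passing the resolver in as a callable keeps the extraction logic testable without network access or an OAK configuration.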
Use case
I'm writing analysis documents for ontology restructuring (e.g. MONDO disease term reviews) that reference many terms from multiple ontologies (MONDO, ORDO, etc.). I want to embed machine-checkable assertions directly in the markdown:
## Validated Identifiers
- @term MONDO:0009282 "multiple acyl-CoA dehydrogenase deficiency"
- @term ORDO:26791 "Multiple acyl-CoA dehydrogenase deficiency"

And validate them with a single command rather than generating intermediate YAML files.
Additional issue: --strict should error on unresolvable CURIEs
Currently, if a CURIE doesn't exist in the ontology, the tool silently passes (no label retrieved = no mismatch). With --strict, an unresolvable CURIE should be an error, not a silent pass. This is important for catching typos in identifiers.
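The requested semantics could look something like the sketch below. The function name `classify` and the status strings are hypothetical, chosen only to illustrate the behaviour; the one assumption is that the label lookup yields `None` when a CURIE cannot be resolved.

```python
from typing import Optional


def classify(expected: str, actual: Optional[str], strict: bool) -> str:
    """Classify one term check.

    `actual` is the label retrieved for the CURIE, or None if the CURIE
    could not be resolved (an assumption about the lookup's behaviour).
    """
    if actual is None:
        # Requested behaviour: under --strict an unresolvable CURIE is an
        # error; without it, the check is skipped, as happens today.
        return "unresolved" if strict else "skipped"
    return "ok" if actual == expected else "label_mismatch"


def is_error(status: str) -> bool:
    """Statuses that should cause a non-zero exit code."""
    return status in {"unresolved", "label_mismatch"}
```

Separating "unresolved" from "label_mismatch" would also let the report distinguish a typo in the identifier from a stale or misspelled label.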