Context
PR #2884 adds ribosomal sequence slots from JGI Isolate v19 to OrganismSample. JGI enforces validation rules that are not yet implemented in the schema:
- 16S sequence must be >1300 nt with <10% Ns for microbial drafts
- ITS sequence must be >450 nt with <2% Ns for fungal drafts
- Sequences must contain only A, C, G, T, or N characters with no header
Per @aclum's review comment, we need to sort out how to implement these validation rules in LinkML.
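To make the rules above concrete, here is a minimal Python sketch of the three checks. It is not JGI's implementation. The names `RibosomalRule`, `RULES`, and `check_sequence` are hypothetical, the thresholds are copied from the list above, and it assumes uppercase-only input and that "Ns" means the fraction of N characters over the full sequence length.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RibosomalRule:
    """Thresholds for one sequence type (hypothetical helper, not part of the schema)."""
    min_length_exclusive: int        # sequence must be strictly longer than this
    max_n_fraction_exclusive: float  # N fraction must be strictly below this


# Thresholds taken from the rules listed in this issue.
RULES = {
    "16S": RibosomalRule(min_length_exclusive=1300, max_n_fraction_exclusive=0.10),
    "ITS": RibosomalRule(min_length_exclusive=450, max_n_fraction_exclusive=0.02),
}

# Assumption: only uppercase A, C, G, T, N are allowed; a FASTA header or
# any whitespace makes the sequence invalid.
ALLOWED = re.compile(r"[ACGTN]+")


def check_sequence(seq: str, kind: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    rule = RULES[kind]
    problems = []
    if not ALLOWED.fullmatch(seq):
        problems.append("sequence must contain only A, C, G, T, or N, with no header")
    if len(seq) <= rule.min_length_exclusive:
        problems.append(f"{kind} sequence must be >{rule.min_length_exclusive} nt")
    if seq and seq.count("N") / len(seq) >= rule.max_n_fraction_exclusive:
        problems.append(
            f"{kind} sequence must have <{rule.max_n_fraction_exclusive:.0%} Ns"
        )
    return problems
```

For example, `check_sequence("ACGT" * 400, "16S")` returns an empty list, while a string that starts with a `>` header line is reported as containing disallowed characters.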
Options
- LinkML pattern constraints on the slot
- Custom validation in the submission portal
- Post-validation via a separate tool
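As a rough way to compare the options, the sketch below shows how much a regular expression alone can cover. It uses Python's `re` module rather than a LinkML validator. The constants `PATTERN_16S` and `PATTERN_ITS` and the helper `matches_pattern` are hypothetical, and whether expressions like these behave the same way in a LinkML `pattern` has not been verified here. A pattern can express the allowed characters and the minimum length, but it cannot express the N-percentage limit.

```python
import re

# Hypothetical patterns: character set plus minimum length (>1300 and >450 nt).
PATTERN_16S = r"^[ACGTN]{1301,}$"
PATTERN_ITS = r"^[ACGTN]{451,}$"


def matches_pattern(seq: str, pattern: str) -> bool:
    """True if the whole sequence matches the pattern."""
    return re.fullmatch(pattern, seq) is not None


# A sequence made entirely of Ns still matches, so the <10% / <2% N rule
# would need a separate check outside the pattern.
all_n = "N" * 1400
print(matches_pattern(all_n, PATTERN_16S))
```

Under these assumptions, the pattern option would cover the character and length rules, leaving the N-fraction rule to the portal or a post-validation tool.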
Related