Implement JGI-style validation rules for OrganismSample ribosomal sequences #2893

@turbomam

Context

PR #2884 adds ribosomal sequence slots from JGI Isolate v19 to OrganismSample. JGI enforces validation rules that are not yet implemented in the schema:

  • 16S sequence must be >1300 nt with <10% Ns for microbial drafts
  • ITS sequence must be >450 nt with <2% Ns for fungal drafts
  • Sequences must contain only A, C, G, T, or N characters with no header
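To make the three rules concrete, here is a minimal Python sketch of how they could be checked outside the schema. It is not JGI's or NMDC's implementation. The names `SequenceRule`, `RULES`, and `validate_sequence` are hypothetical, and the sketch assumes uppercase-only input and strict inequalities, exactly as the bullets are written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceRule:
    min_length_exclusive: int  # sequence must be strictly longer than this (nt)
    max_n_percent_exclusive: int  # percent of N must be strictly below this


# Thresholds copied from the rules listed above; the keys are hypothetical labels.
RULES = {
    "16S": SequenceRule(min_length_exclusive=1300, max_n_percent_exclusive=10),
    "ITS": SequenceRule(min_length_exclusive=450, max_n_percent_exclusive=2),
}

ALLOWED = set("ACGTN")  # assumption: lowercase letters are rejected


def validate_sequence(seq: str, kind: str) -> list[str]:
    """Return a list of human-readable rule violations (empty if valid)."""
    rule = RULES[kind]
    errors = []
    if seq.startswith(">"):
        errors.append("sequence must not include a FASTA header")
    bad = sorted(set(seq) - ALLOWED)
    if bad:
        errors.append(f"invalid characters: {bad}")
    if len(seq) <= rule.min_length_exclusive:
        errors.append(
            f"length {len(seq)} is not > {rule.min_length_exclusive} nt"
        )
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    if seq and seq.count("N") * 100 >= rule.max_n_percent_exclusive * len(seq):
        errors.append(
            f"N content is not < {rule.max_n_percent_exclusive}%"
        )
    return errors
```

For example, `validate_sequence("ACGT" * 400, "16S")` returns an empty list, while a 1300 nt sequence fails the length rule because the bound is strict.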

Per @aclum's review comment, we need to sort out how to implement these validation rules in LinkML.

Options

  • LinkML pattern constraints on the slot
  • Custom validation in the submission portal
  • Post-validation via a separate tool
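One consideration when weighing these options: a regular-expression pattern can plausibly cover the allowed alphabet, the no-header rule, and the minimum length, but it cannot practically express the N-percentage limits. The Python sketch below illustrates that gap. The pattern strings and the names `PATTERN_16S`, `PATTERN_ITS`, and `n_percent` are hypothetical, not taken from the schema or from PR #2884, and Python's `re` may differ in detail from the regex engine a given validator uses.

```python
import re

# Hypothetical patterns: {1301,} encodes ">1300 nt", {451,} encodes ">450 nt".
# The anchors force the whole string to match, which also rejects a ">" header.
PATTERN_16S = re.compile(r"^[ACGTN]{1301,}$")
PATTERN_ITS = re.compile(r"^[ACGTN]{451,}$")


def n_percent(seq: str) -> float:
    """Percentage of N characters in a non-empty sequence."""
    return 100 * seq.count("N") / len(seq)


# A sequence made only of Ns satisfies the pattern but is 100% N,
# so the "<10% Ns" rule would still need a check outside the pattern.
all_n = "N" * 1400
pattern_ok = PATTERN_16S.match(all_n) is not None
n_rule_ok = n_percent(all_n) < 10
```

Here `pattern_ok` is `True` and `n_rule_ok` is `False`, which suggests that a pattern-only approach would need to be paired with one of the other two options to enforce the N limits.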
