Describe how versioning works for schemas #121

@cmungall

There is often confusion about how to handle versioning in schemas. The copier should encourage the de facto standard mechanism exemplified by Biolink; see below.

The mechanism could be mentioned in both the parent README and also in the template CONTRIBUTING.md:

https://github.com/linkml/linkml-project-copier/blob/main/template/CONTRIBUTING.md.jinja

How versioning works for NMDC and Biolink

Short version: NMDC and Biolink both treat the git tag as the “real” version, keep pyproject.toml at 0.0.0, and let Poetry + poetry-dynamic-versioning plus a GitHub Action build & push to PyPI when a GitHub Release is created.

I’ll break it down by:

  1. What NMDC does
  2. What Biolink does
  3. What goes in pyproject.toml for a LinkML schema
  4. A minimal GitHub Actions workflow you can copy

1. NMDC (nmdc-schema) versioning

pyproject.toml

NMDC uses Poetry and configures poetry-dynamic-versioning, but with an extra twist: they also substitute the version into the schema YAML from git. In nmdc-schema’s pyproject.toml you see: ([GitHub]1)

[tool.poetry]
name = "nmdc_schema"
version = "0.0.0"
description = "Schema resources for the National Microbiome Data Collaborative (NMDC)"
license = "MIT"
repository = "https://github.com/microbiomedata/nmdc-schema"
documentation = "https://microbiomedata.github.io/nmdc-schema/"
# ...authors, classifiers, packages, dependencies...

[tool.poetry.dependencies]
python = "^3.10"
linkml = "^1.9.6"
linkml-runtime = "^1.9.5"
# ...etc...

[tool.poetry-dynamic-versioning]
# They currently have this disabled because they run it explicitly:
enable = false
vcs = "git"
style = "pep440"

[tool.poetry-dynamic-versioning.substitution]
files = ["src/schema/nmdc.yaml"]
patterns = [
  "(^\\s*__version__\\s*(?::.*?)?=\\s*['\"])[^'\"]*(['\"])",
  "(^version:\\s*['\"]?)[^'\"]*?(['\"]?)$"
]

Key points:

  • version = "0.0.0" in [tool.poetry] is a dummy; the real version comes from git tags like v11.13.0.

  • poetry-dynamic-versioning is configured to rewrite both:

    • any __version__ = "..." style strings, and
    • a version: field in the schema YAML
      so schema + package metadata share the same version. ([GitHub]1)
  • They call poetry dynamic-versioning before poetry build as part of their release pipeline, so all generated artifacts (JSON Schema, docs, etc.) have the correct version baked in.
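
To make the substitution step more concrete, here is a minimal Python sketch of what two-capture-group patterns of this shape do to a file. It is not the plugin's actual implementation: the `stamp_version` helper and the sample schema text are hypothetical, and the patterns are written without embedded spaces so that they match lines such as `version: 0.0.0`.

```python
import re

# Patterns of the same shape as the substitution config: group 1 is the text
# before the version, group 2 is the text after it.
PATTERNS = [
    r"(^\s*__version__\s*(?::.*?)?=\s*['\"])[^'\"]*(['\"])",
    r"(^version:\s*['\"]?)[^'\"]*?(['\"]?)$",
]


def stamp_version(text, version):
    """Hypothetical helper: replace the text between the two groups with `version`."""
    for pattern in PATTERNS:
        text = re.sub(
            pattern,
            # A function replacement avoids ambiguity between group
            # references and the digits of the version string.
            lambda m: m.group(1) + version + m.group(2),
            text,
            flags=re.MULTILINE,
        )
    return text


# Illustrative schema header; the id and name are made up.
schema_yaml = "id: https://example.org/my-schema\nname: my_schema\nversion: 0.0.0\n"
stamped = stamp_version(schema_yaml, "11.13.0")
print(stamped)
```

Only the `version:` line changes; every other line of the schema is left untouched, which is why a single top-level `version:` key is a convenient target for this kind of stamping.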

GitHub Action → PyPI

From their maintainer docs and Sigstore records: ([GitHub]2)

  • They have a .github/workflows/pypi-publish.yaml workflow.
  • It is triggered on GitHub Releases (on: release) with tags of the form vX.Y.Z.
  • The workflow builds the Poetry distribution and uses pypa/gh-action-pypi-publish with OIDC/Trusted Publisher (no API key, id-token: write permissions) to upload to PyPI.

So the technical flow is:

  1. Bump version by creating tag v11.13.0 (usually via GitHub Release UI).

  2. GitHub Action runs:

    • Checks out repo
    • Runs poetry dynamic-versioning to stamp version into nmdc.yaml and Python package
    • Runs poetry build
    • Runs pypa/gh-action-pypi-publish → publishes nmdc-schema==11.13.0 on PyPI (PyPI)

2. Biolink (biolink-model) versioning

pyproject.toml

Biolink also uses Poetry + dynamic versioning, but they let the plugin run automatically rather than manually: ([GitHub]4)

[tool.poetry]
name = "biolink-model"
version = "0.0.0"
description = "The Biolink Model is a high-level data model..."
readme = "README.md"
# ...

[tool.poetry.dependencies]
python = "^3.9"
linkml = "1.9.4"
linkml-runtime = "1.9.5"
# ...

[tool.poetry-dynamic-versioning]
enable = true
vcs = "git"
style = "pep440"

[build-system]
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning"]
build-backend = "poetry_dynamic_versioning.backend"

Again:

  • version = "0.0.0" is a placeholder;
  • the real version comes from git tags like v4.3.6, which matches their GitHub Releases and PyPI versions. ([GitHub]5)
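
As a rough illustration of the tag-to-version mapping, the sketch below strips the leading `v` from a plain release tag. The `version_from_tag` function is hypothetical and deliberately narrow: it only accepts tags of the form vX.Y.Z (or X.Y.Z), whereas the real plugin also handles pre-releases and commits that sit between tags.

```python
import re

# Plain release tags only: an optional "v" followed by dot-separated integers.
_TAG_RE = re.compile(r"v?(\d+(?:\.\d+)*)")


def version_from_tag(tag):
    """Hypothetical helper: map a git tag such as 'v4.3.6' to '4.3.6'."""
    match = _TAG_RE.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a plain release tag: {tag!r}")
    return match.group(1)


print(version_from_tag("v4.3.6"))
print(version_from_tag("v11.13.0"))
```

Rejecting anything that is not a plain release tag is a design choice for this sketch; it makes an accidental tag such as `release-1` fail loudly instead of producing an odd version.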

GitHub Action → PyPI

Their release notes show they also use pypa/gh-action-pypi-publish in .github/workflows and periodically bump that action. ([GitHub]5)

Pattern is basically:

  1. on: release: types: [published]
  2. Install Python + Poetry (+ dynamic versioning plugin)
  3. poetry build (plugin auto-derives version from tag)
  4. pypa/gh-action-pypi-publish uploads to PyPI using OIDC / Trusted Publisher.

Their “Maintaining the Biolink Model” docs emphasize that consumers must pin to a particular Biolink release, and list all releases on GitHub. ([Biolink]6)


3. What you usually put in pyproject.toml for a LinkML schema

If you’re modeling your project after NMDC / Biolink + the official LinkML cookiecutter, the core pieces in pyproject.toml are:

a. Build system & dynamic versioning

[build-system]
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning"]
build-backend = "poetry_dynamic_versioning.backend"

b. Project metadata (dummy version)

[tool.poetry]
name = "my-schema"
version = "0.0.0"  # leave this; version comes from git tag
description = "Schema for my awesome thing"
readme = "README.md"
license = "MIT"
repository = "https://github.com/me/my-schema"
documentation = "https://me.github.io/my-schema/"
authors = ["You <you@example.org>"]
keywords = ["linkml", "schema", "whatever"]
# If you have a src-layout, you can also declare packages:
packages = [
  { include = "my_schema", from = "src" }
]

c. Dependencies

At minimum:

[tool.poetry.dependencies]
python = "^3.10"
linkml = "^1.9.0"
linkml-runtime = "^1.9.0"
pyyaml = "^6.0"  # often used

[tool.poetry.group.dev.dependencies]
pytest = "^8.0.0"
mkdocs = "^1.4.2"
mkdocs-material = "^9.0.0"
schemasheets = "^0.2.0"
ruff = "^0.4.0"
black = ">=23.1,<25.0"
# etc., as you like

d. Dynamic versioning config

Simplest (Biolink-style) config:

[tool.poetry-dynamic-versioning]
enable = true
vcs = "git"
style = "pep440"

If you want the version to also appear inside your schema YAML (NMDC-style):

[tool.poetry-dynamic-versioning]
enable = true
vcs = "git"
style = "pep440"

[tool.poetry-dynamic-versioning.substitution]
files = ["src/schema/my_schema.yaml"]
patterns = [
  # match something like: __version__ = "0.0.0"
  "(^\\s*__version__\\s*(?::.*?)?=\\s*['\"])[^'\"]*(['\"])",
  # match a yaml line like: version: 0.0.0
  "(^version:\\s*['\"]?)[^'\"]*?(['\"]?)$"
]

So the life cycle becomes:

  • You tag v1.2.3.

  • At build time, the plugin sees v1.2.3, strips the v, and writes 1.2.3 into:

    • package version metadata
    • version: in your schema YAML
  • All artifacts + PyPI package align on 1.2.3.
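
One way to check that this alignment actually holds is sketched below. The helper names (`schema_version`, `misaligned`) are hypothetical, and the schema is parsed with a simple regular expression rather than a YAML library, on the assumption that it has a single top-level `version:` line.

```python
import re

# Assumes a single top-level "version:" line, optionally quoted.
_VERSION_LINE = re.compile(r"^version:\s*['\"]?([^'\"\s]+)['\"]?\s*$", re.MULTILINE)


def schema_version(schema_text):
    """Return the top-level version found in the schema text, or None."""
    match = _VERSION_LINE.search(schema_text)
    return match.group(1) if match else None


def misaligned(tag, package_version, schema_text):
    """Return the places whose version differs from the one implied by the tag."""
    expected = tag[1:] if tag.startswith("v") else tag
    found = {
        "package": package_version,
        "schema": schema_version(schema_text),
    }
    return {name: value for name, value in found.items() if value != expected}


print(misaligned("v1.2.3", "1.2.3", "name: my_schema\nversion: 1.2.3\n"))
print(misaligned("v1.2.3", "0.0.0", "name: my_schema\nversion: 1.2.3\n"))
```

In a real project the package version would typically come from `importlib.metadata.version("my-schema")` after installing the built distribution, and the schema text from the file listed under the substitution config.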

This is exactly the pattern described in the LinkML “Manage Releases” doc: keep 0.0.0 in pyproject.toml, use tags vX.Y.Z, let GitHub Actions + plugin handle the rest. (LinkML)


4. GitHub Action “magic” to stamp & push to PyPI

Here’s a minimal workflow that matches what NMDC/Biolink + LinkML docs are doing, using Trusted Publisher and pypa/gh-action-pypi-publish:

# .github/workflows/pypi-publish.yml
name: Publish to PyPI

on:
  release:
    types: [published]

permissions:
  contents: read
  id-token: write   # required for PyPI Trusted Publisher

jobs:
  build-and-publish:
    runs-on: ubuntu-latest

    steps:
      - name: Check out code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history and tags, so the plugin can resolve the release tag

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install Poetry + dynamic versioning plugin
        run: |
          pip install "poetry>=1.7" poetry-dynamic-versioning
          poetry self add "poetry-dynamic-versioning[plugin]"

      - name: Install dependencies
        run: poetry install --only main

      - name: Build distributions
        run: poetry build

      - name: Publish to PyPI (Trusted Publisher)
        uses: pypa/gh-action-pypi-publish@v1.13.0
        # No API token needed if you've set up Trusted Publisher on PyPI

How this ties into versioning:

  1. You create a GitHub Release named/tagged v1.2.3.
  2. Action runs; Poetry’s dynamic versioning plugin reads the tag and injects 1.2.3 into your package (and optionally your schema YAML via substitution patterns).
  3. pypa/gh-action-pypi-publish logs in to PyPI via OIDC (Trusted Publisher) and uploads my-schema 1.2.3. (GitHub)
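
A common failure mode with this setup is that the plugin does not run and the placeholder 0.0.0 ends up in the built files. The sketch below is one possible guard that could run between the build and publish steps. The `placeholder_artifacts` function is hypothetical, and it only detects the exact placeholder version in sdist (`.tar.gz`) and wheel file names.

```python
import re

# Matches "-0.0.0" when it is the whole version segment of an sdist or wheel
# file name, e.g. my_schema-0.0.0.tar.gz or my_schema-0.0.0-py3-none-any.whl.
_PLACEHOLDER = re.compile(r"-0\.0\.0(?:-|\.tar\.gz$)")


def placeholder_artifacts(filenames):
    """Return the file names that still carry the placeholder version."""
    return [name for name in filenames if _PLACEHOLDER.search(name)]


built = ["my_schema-1.2.3.tar.gz", "my_schema-1.2.3-py3-none-any.whl"]
stale = ["my_schema-0.0.0.tar.gz", "my_schema-0.0.0-py3-none-any.whl"]
print(placeholder_artifacts(built))
print(placeholder_artifacts(stale))
```

In a workflow, the input could be the names of the files in `dist/` after `poetry build`, with the job failing if the returned list is non-empty.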
