CMIP-LD

CMIP Linked Data Utilities Library

Overview

CMIP-LD is a Python library for working with CMIP (Coupled Model Intercomparison Project) Linked Data vocabularies. It provides tools to fetch, resolve, and validate the JSON-LD controlled vocabularies used across CMIP and related climate science projects, and to generate documentation for them.

Key Features

🔗 Prefix Resolution - Resolve short prefixes (e.g., universal:frequency) to full URLs
📥 Data Fetching - Retrieve and expand JSON-LD documents with automatic dereferencing
📝 Documentation Generation - Auto-generate README files for vocabulary directories
✅ Validation - Validate JSON files against schemas and contexts
🔄 CI/CD Actions - GitHub Actions for automated vocabulary processing

Supported Vocabularies

Prefix	Repository	Description
`universal`	WCRP-universe	Universal controlled vocabularies
`cmip7`	CMIP7-CVs	CMIP7 controlled vocabularies
`cmip6plus`	CMIP6Plus_CVs	CMIP6Plus controlled vocabularies
`cf`	CF	CF Conventions vocabularies
`vr`	Variable-Registry	Variable registry
`emd`	Essential-Model-Documentation	Essential model documentation

Installation

Using pip (editable mode for development)

git clone https://github.com/wcrp-cmip/CMIP-LD.git
cd CMIP-LD
pip install -e .

Dependencies

Python 3.8+
jsonld-recursive - JSON-LD processing
requests - HTTP requests
Optional: esgvoc - For Pydantic model integration

Quick Start

Fetching Data

import cmipld

# Fetch and resolve a vocabulary term
data = cmipld.get("universal:frequency/mon")
print(data)

# Expand a JSON-LD document
expanded = cmipld.expand("universal:frequency")
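
To make "automatic dereferencing" more concrete, here is a minimal sketch of the idea. It is not the cmipld implementation. The `dereference` function, the `Loader` type, the `depth` limit, and the `ex:` identifiers are all hypothetical, and an in-memory dictionary stands in for the network. The assumption is that a nested object containing only an `@id` key is a reference to another document.

```python
from typing import Any, Callable, Dict

# Hypothetical loader type: maps an identifier to the document it names.
Loader = Callable[[str], Dict[str, Any]]


def dereference(node: Any, load: Loader, depth: int = 2) -> Any:
    """Recursively replace bare {'@id': ...} references with loaded documents."""
    if isinstance(node, list):
        return [dereference(item, load, depth) for item in node]
    if isinstance(node, dict):
        # A bare reference is a dict whose only key is '@id'.
        if set(node) == {"@id"} and depth > 0:
            # The depth limit bounds recursion, so reference cycles terminate.
            return dereference(load(node["@id"]), load, depth - 1)
        return {key: dereference(value, load, depth) for key, value in node.items()}
    return node


# Illustrative in-memory store standing in for remote documents.
store = {"ex:unit/month": {"@id": "ex:unit/month", "label": "month"}}
doc = {"@id": "ex:frequency/mon", "unit": {"@id": "ex:unit/month"}}

resolved = dereference(doc, store.__getitem__)
unresolved = dereference(doc, store.__getitem__, depth=0)
print(resolved)
```

Passing the loader in as an argument keeps the traversal independent of how documents are retrieved, so the same function can be exercised offline.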

Resolving Prefixes

import cmipld

# Get the full URL for a prefix
url = cmipld.mapping['universal']
# → 'https://wcrp-cmip.github.io/WCRP-universe/'

# Resolve a prefixed URI
full_url = cmipld.resolve_prefix("universal:frequency/mon")
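
The following is a minimal, standalone sketch of what prefix resolution amounts to. It is not the library's code. `PREFIX_MAP` and `resolve_prefix_sketch` are hypothetical names, only the `universal` entry is taken from this README, and the way the prefix URL and the remaining path are joined is an assumption.

```python
from typing import Dict

# Hypothetical mapping; only the 'universal' URL appears in this README.
PREFIX_MAP: Dict[str, str] = {
    "universal": "https://wcrp-cmip.github.io/WCRP-universe/",
}


def resolve_prefix_sketch(uri: str, mapping: Dict[str, str] = PREFIX_MAP) -> str:
    """Expand 'prefix:path' to a full URL; return other inputs unchanged."""
    prefix, sep, rest = uri.partition(":")
    # Full URLs ('https://...') and unknown prefixes are left untouched.
    if not sep or prefix not in mapping:
        return uri
    return mapping[prefix].rstrip("/") + "/" + rest.lstrip("/")


print(resolve_prefix_sketch("universal:frequency/mon"))
```

In this sketch, inputs that do not start with a known prefix pass through unchanged, so already-expanded URLs can be fed through the same function safely.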

Using with esgvoc

from esgvoc.api import search

# Search for terms
results = search.find("frequency", term="mon")
print(results)

Repository Structure

CMIP-LD/
├── cmipld/                    # Main Python package
│   ├── __init__.py            # Package initialization & client setup
│   ├── locations.py           # Prefix mappings and URL resolution
│   ├── prefix_mappings.json   # Prefix → repository mappings
│   ├── generate/              # Documentation generation tools
│   │   ├── create_readme.py   # Generate READMEs for vocab directories
│   │   ├── generate_summary.py
│   │   └── validate_json.py
│   └── utils/                 # Utility functions
│       ├── git/               # Git integration
│       ├── extract/           # Data extraction tools
│       └── ...
├── actions/                   # GitHub Actions for CI/CD
├── static/                    # Static assets (viewer, images)
├── notebooks/                 # Example Jupyter notebooks
└── scripts/                   # Standalone utility scripts

Documentation Generation

Generate READMEs for Vocabulary Directories

The create_readme.py script generates standardized documentation for vocabulary directories containing JSON-LD files:

python -m cmipld.generate.create_readme /path/to/src-data/universe

Features:

Only processes directories with a _context file
Extracts schema from Pydantic models (via esgvoc) or JSON keys
Generates usage examples for cmipld, esgvoc, and direct HTTP
Creates collapsible file listings
Analyzes external dependencies
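
The first two features above can be sketched as follows. This is an illustration under stated assumptions, not the code in create_readme.py. `vocab_dirs` and `json_keys` are hypothetical helpers, term files are assumed to end in `.json`, and the schema is approximated as the union of top-level keys.

```python
import json
import tempfile
from pathlib import Path
from typing import Iterator, Set


def vocab_dirs(root: Path) -> Iterator[Path]:
    """Yield each directory at or below root that contains a _context file."""
    for context_file in sorted(root.rglob("_context")):
        if context_file.is_file():
            yield context_file.parent


def json_keys(directory: Path) -> Set[str]:
    """Return the union of top-level keys across the *.json files in directory."""
    keys: Set[str] = set()
    for path in directory.glob("*.json"):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        if isinstance(record, dict):
            keys.update(record)
    return keys


# Demo on a throwaway tree: only 'frequency' has a _context file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "frequency").mkdir()
    (root / "frequency" / "_context").write_text("{}", encoding="utf-8")
    (root / "frequency" / "mon.json").write_text(
        json.dumps({"@id": "mon", "description": "monthly"}), encoding="utf-8"
    )
    (root / "notes").mkdir()
    (root / "notes" / "todo.json").write_text("{}", encoding="utf-8")

    found = [d.name for d in vocab_dirs(root)]
    keys = json_keys(root / "frequency")

print(found)
print(sorted(keys))
```

The `notes` directory is skipped because it has no `_context` file, which mirrors the filtering rule described in the feature list.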

Collect READMEs for MkDocs

python scripts/collect_vocab_docs.py /path/to/src-data --output docs/vocabularies

This collects all vocabulary READMEs into a single folder for rendering with MkDocs.
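
For readers who want to see the shape of such a collection step, here is a minimal sketch. It is not the contents of scripts/collect_vocab_docs.py. `collect_readmes` is a hypothetical function, and the flattened naming scheme (`universe_frequency.md`) is an assumption chosen for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def collect_readmes(src: Path, out: Path) -> List[Path]:
    """Copy every README.md under src into out, one flat file per directory."""
    out.mkdir(parents=True, exist_ok=True)
    copied: List[Path] = []
    for readme in sorted(src.rglob("README.md")):
        rel_dir = readme.parent.relative_to(src)
        # 'universe/frequency' becomes 'universe_frequency.md';
        # a README directly in src becomes 'index.md'.
        name = "_".join(rel_dir.parts) or "index"
        target = out / f"{name}.md"
        shutil.copyfile(readme, target)
        copied.append(target)
    return copied


# Demo on a throwaway tree with two vocabulary directories.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src-data"
    for vocab in ("frequency", "realm"):
        folder = src / "universe" / vocab
        folder.mkdir(parents=True)
        (folder / "README.md").write_text(f"{vocab} docs\n", encoding="utf-8")

    copied = collect_readmes(src, Path(tmp) / "docs" / "vocabularies")
    names = [path.name for path in copied]
    first_text = copied[0].read_text(encoding="utf-8")

print(names)
```

Flattening the directory path into the file name avoids collisions between vocabularies that each ship a file called README.md.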

GitHub Actions

CMIP-LD provides reusable GitHub Actions for vocabulary repositories:

Action	Description
`actions/process_jsonld`	Process and validate JSON-LD files
`actions/build-mkdocs`	Build MkDocs documentation
`actions/check-graph`	Validate graph structure
`actions/commit-all`	Commit changes with attribution

Contributing

See CONTRIBUTING.md for guidelines.

Related Projects

esgvoc - ESGF Vocabulary API with Pydantic models
jsonld-recursive - JSON-LD recursive resolution
WCRP-universe - Universal vocabularies

License

Apache 2.0 - See LICENSE for details.

Developed by WCRP-CMIP for the climate science community.
