This project automates the extraction of metadata from GitHub and GitLab repositories to generate a Machine-Actionable Software Management Plan (maSMP). It consists of two main components:
- Backend - A FastAPI-based service built with Clean Architecture principles that extracts metadata from GitHub, GitLab, and external sources (OpenAlex, Wayback Machine). The backend follows a layered architecture with clear separation of concerns: domain logic, use cases, adapters, and API endpoints.
- Frontend - A modern Nuxt 3 application with Vue 3 and TypeScript, providing an intuitive user interface to interact with the metadata extraction service. Features include platform selection, repository input, and comprehensive metadata visualization.
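The layered separation described for the backend can be illustrated with a minimal sketch. Everything in it is hypothetical and is not taken from the project's code: the names `RepositoryMetadata`, `MetadataSource`, `ExtractMetadata`, and `InMemorySource` are invented for illustration, and the in-memory adapter stands in for a real GitHub or GitLab client.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class RepositoryMetadata:
    """Domain layer: plain data with no framework or HTTP dependencies."""

    url: str
    name: str
    license: Optional[str] = None


class MetadataSource(Protocol):
    """Port the use case depends on; adapters implement it."""

    def fetch(self, repo_url: str) -> RepositoryMetadata: ...


class ExtractMetadata:
    """Use-case layer: validates input and delegates to the port."""

    def __init__(self, source: MetadataSource) -> None:
        self._source = source

    def execute(self, repo_url: str) -> RepositoryMetadata:
        if not repo_url.startswith("https://"):
            raise ValueError("expected an https repository URL")
        return self._source.fetch(repo_url)


class InMemorySource:
    """Adapter layer: hypothetical stand-in for a real platform client."""

    def fetch(self, repo_url: str) -> RepositoryMetadata:
        # Derive the repository name from the last path segment of the URL.
        name = repo_url.rstrip("/").split("/")[-1]
        return RepositoryMetadata(url=repo_url, name=name, license="MIT")


if __name__ == "__main__":
    use_case = ExtractMetadata(InMemorySource())
    print(use_case.execute("https://github.com/owner/repo"))
```

Because the use case only knows the `MetadataSource` port, an API endpoint could inject a different adapter per platform without changing the domain or use-case code.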
Follow the instructions in the Backend README to install and run the backend server.
Follow the steps in the Frontend README to set up and start the frontend application.
To simplify deployment, you can use Docker and Docker Compose to run the entire project (Nuxt frontend + FastAPI backend).
Ensure you have Docker installed on your system. You can download and install it from Docker’s official website.
From the project root, build and start both services:

    docker compose up --build

This will:
- Build and start the backend (FastAPI) and frontend (Nuxt) containers.
- Set up networking so the frontend can call the backend API.
Once the containers are running:
- Frontend UI: http://localhost:3000
- Backend API: http://localhost:8001
- API docs (Swagger): http://localhost:8001/docs
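To confirm that both containers respond, a small standard-library script can probe the URLs listed above. This is a minimal sketch: the helper names `is_up` and `report` are illustrative, and it assumes the default ports have not been remapped in your Compose setup.

```python
import http.client
import urllib.error
import urllib.request

# URLs from the list above; adjust them if you changed the port mappings.
SERVICES = {
    "frontend": "http://localhost:3000",
    "backend docs": "http://localhost:8001/docs",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def report(services: dict[str, str]) -> dict[str, bool]:
    """Probe every service and map its name to a reachability flag."""
    return {name: is_up(url) for name, url in services.items()}


if __name__ == "__main__":
    for name, ok in report(SERVICES).items():
        print(f"{name}: {'up' if ok else 'not reachable'}")
```

The containers may need a few seconds after `docker compose up` before they accept connections, so a first "not reachable" result is not necessarily an error.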
To stop the running containers:

    docker compose down

This shuts down and removes the containers but keeps the built images.
The backend is also available as a Python package (comet-rs) that can be used as a CLI tool or imported as a library in your Python code.

    pip install comet-rs

Python 3.10+ is required.
Extract full metadata from a repository:

    comet-rs extract https://github.com/owner/repo maSMP --with-enrichment

Extract a single property with source and confidence:

    comet-rs extract_property https://github.com/owner/repo author

To use the package as a library in Python:

    import os

    from app.api.services.metadata_service import run_extraction

    # Extract full metadata
    jsonld_document, enriched = run_extraction(
        repo_url="https://github.com/owner/repo",
        schema="maSMP",  # or "CODEMETA"
        access_token=os.getenv("GITHUB_TOKEN"),
        with_enrichment=True,
    )
    # jsonld_document: maSMP/CODEMETA JSON-LD (dict)
    # enriched: per-property source/confidence/category

For heavier use or private repositories, set environment variables:

    export GITHUB_TOKEN=ghp_...    # for GitHub repositories
    export GITLAB_TOKEN=glpat_...  # for GitLab repositories

For more details, see the PyPI README.
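When a script handles repositories from both platforms, the matching token can be picked from the repository URL. The helper below is a hypothetical sketch and not part of `comet-rs`: the name `token_for` is invented, and matching self-hosted GitLab instances by a "gitlab" substring in the hostname is a heuristic assumption.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse


def token_for(repo_url: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token variable that matches the repository host, if set."""
    host = (urlparse(repo_url).hostname or "").lower()
    if host == "github.com" or host.endswith(".github.com"):
        return env.get("GITHUB_TOKEN")
    # Heuristic (assumption): treat any host containing "gitlab" as GitLab.
    if "gitlab" in host:
        return env.get("GITLAB_TOKEN")
    return None


if __name__ == "__main__":
    for url in ("https://github.com/owner/repo", "https://gitlab.com/owner/repo"):
        found = token_for(url) is not None
        print(f"{url}: token {'found' if found else 'not set'}")
```

The returned value could then be supplied wherever a token is expected, for example as the `access_token` argument of `run_extraction` for GitHub URLs. Whether that argument also accepts a GitLab token is not stated here, so check the package documentation first.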
If you want to contribute to this project, feel free to fork the repository, create a new branch, and submit a pull request with your changes.
This project is licensed under the MIT License.
