man4ish/omnibioai-ecosystem

OmniBioAI Ecosystem

Reproducible Scientific Execution & AI-Powered Bioinformatics

OmniBioAI is a modular, AI-powered bioinformatics workbench designed to accelerate genomic research across:

  • Local machines
  • On-prem servers
  • HPC environments (Slurm, Apptainer)
  • Cloud infrastructure (AWS Batch, Azure Batch, Kubernetes)

No cloud dependency is mandatory on any of these targets.

This repository is the workspace root of the OmniBioAI ecosystem — it assembles independently versioned components into a single runnable, production-grade stack.


Architecture

OmniBioAI follows a five-plane architecture:

| Plane | Role | Key Components |
|---|---|---|
| Control | Orchestration, governance, APIs | Workbench, TES, ToolServer, Model Registry, LIMS |
| Security | Zero-trust enforcement | API Gateway, Auth, Policy Engine, HPC Policy, Security Audit |
| Compute | Ephemeral execution | Workflow runners, tool runtime containers, HPC adapters |
| Data | Artifacts, outputs, versioning | OmniObjects, model artifacts, workflow outputs |
| AI | Reasoning, retrieval, agents | RAG, Dev Hub, LLM integration, agent orchestration |

TES (Tool Execution Service) is the strict boundary between the control and compute planes. The API Gateway is the single enforced entry point for all external traffic.


Workspace Layout

Desktop/machine/
│
├── omnibioai-studio/              # Electron desktop orchestrator — launches the full stack
│
│── Runtime data (shared across all services)
├── data/                          # Shared ecosystem runtime data — PubMed, model registry, uploads
├── work/                          # Shared ecosystem work/outputs — workflow runs, reports, coverage
│   └── out/
│       ├── reports/               # Control center ecosystem reports
│       └── coverage/              # Pytest coverage per repo
│
│── Core services
├── omnibioai/                     # Workbench — Django platform, plugins, agents
│   ├── data -> ../data            # Symlink → shared ecosystem data
│   └── work -> ../work            # Symlink → shared ecosystem work
├── omnibioai-tes/                 # Tool Execution Service — HPC/cloud/local execution
├── omnibioai-toolserver/          # FastAPI ToolServer — validated async tool APIs
├── omnibioai-tool-runtime/        # Minimal cloud-agnostic container execution runtime
├── omnibioai-model-registry/      # Production model registry — versioning, provenance
├── omnibioai-lims/                # Lightweight Django LIMS — samples, metadata
├── omnibioai-control-center/      # Health dashboard, ecosystem report, orchestration
│
│── Zero-trust security plane
├── omnibioai-api-gateway/         # Central zero-trust entry point for all service traffic
├── omnibioai-auth/                # JWT authentication, refresh tokens, RBAC
├── omnibioai-policy-engine/       # ABAC/RBAC authorization evaluation
├── omnibioai-hpc-policy-engine/   # HPC compute governance and quota enforcement
├── omnibioai-security-audit/      # Async audit logging via Redis Streams
├── omnibioai-security-sdk/        # Unified zero-trust security SDK (shared library)
├── omnibioai-iam-client/          # Async IAM client SDK for internal services
│
│── AI and developer tooling
├── omnibioai-rag/                 # RAG assistant — Hugging Face + Ollama LLMs
├── omnibioai-dev-hub/             # RAG V6 developer assistant — FAISS-native code search
├── omnibioai-dev-docker/          # GPU dev environment — CUDA, JupyterLab, Ollama
│
│── Frontend and design
├── omnibioai-ui/                  # Shared React component library
├── omnibioai-design-tokens/       # Unified CSS/JS design system tokens
├── omnibioai-landing/             # Marketing and product landing pages
├── omnibioai-docs/                # Architecture diagrams, technical docs, getting started guide
├── omnibioai-videos/              # Tutorial video library and Getting Started guide
│
│── Workflows and tool images
├── omnibioai-workflow-bundles/    # Engine-agnostic workflows — WDL, Nextflow, Snakemake, CWL
├── omnibioai-tool-images/         # ARM64 Docker/Singularity images for bioinformatics tools
│
│── SDK
├── omnibioai-sdk/                 # Python SDK — API, object registry, notebooks
│
│── Support
├── db-init/                       # Database initialisation scripts
├── utils/                         # Shared utilities
└── README.md

Services

| Service | Host Port | Path | Role |
|---|---|---|---|
| Nginx Router | 80 | / | Reverse proxy: routes all external traffic |
| OmniBioAI Workbench | 8000 | /_svc/workbench/ | UI, plugins, agents, AI tools |
| Auth Service | 8001 | /_svc/auth/ | JWT authentication, refresh tokens, RBAC |
| Policy Engine | 8002 | /_svc/policy/ | ABAC/RBAC authorization evaluation |
| HPC Policy Engine | 8003 | /_svc/hpc/ | HPC compute governance and quota enforcement |
| Security Audit | 8004 | /_svc/audit/ | Async audit logging via Redis Streams |
| API Gateway | 8080 | /_svc/gateway/ | Zero-trust enforced entry point for all services |
| Tool Execution Service (TES) | 8081 | /_svc/tes/ | Workflow and tool orchestration |
| Dev Hub API | 8082 | /_svc/devhub/ | RAG V6 developer assistant and code search |
| Dev Hub UI | 5173 | /_svc/devhub/ | RAG V6 frontend |
| RAG Assistant | 8090 | /_svc/rag/ | Bioinformatics retrieval-augmented generation |
| Model Registry | 8095 | /_svc/modelregistry/ | Versioned ML model artifacts |
| Workflow Bundles | 8098 | /_svc/workflows/ | Engine-agnostic workflow registry |
| Tool Images | 8097 | /_svc/toolimages/ | ARM64 Docker/Singularity image registry |
| Videos | 8086 | /_svc/videos/ | Tutorial video library, Getting Started guide |
| ToolServer | 9090 | /_svc/toolserver/ | Validated async tool APIs |
| Prometheus | 9091 | | Metrics collection |
| LIMS | 7000 | /_svc/lims/ | Sample and metadata management |
| Control Center | 7070 | /_svc/control/ | Health dashboard, ecosystem report |
| Grafana | 3000 | | Metrics dashboards |
| OPA | 8181 | | Open Policy Agent |
| Ollama | 11434 | | Local LLM inference |
| SDK | 5190 | /_svc/sdk/ | Python SDK API server |
| MySQL | 3306 | | Relational databases |
| Redis | 6379 | | Celery task queue and caching |
| Documentation | 80 | /docs/ | Getting started guide and architecture docs |

All ports are configurable via .env.


Quick Start

Prerequisites

  • Docker Engine or Docker Desktop
  • Docker Compose v2+
  • Python 3.11+ (for report generation)

1. Clone the studio repo

git clone https://github.com/man4ish/omnibioai-studio.git
cd omnibioai-studio

2. Configure your environment

cp .env.example .env
# Edit .env and set:
# DATA_DIR=/home/youruser/Desktop/machine/data
# WORK_DIR=/home/youruser/Desktop/machine/work
# MYSQL_ROOT_PASSWORD=your_password
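
Before starting the stack it can help to confirm that the three settings named above are actually filled in. The following is a minimal sketch, not part of the Studio repo: it assumes `.env` uses plain `KEY=VALUE` lines, and the helper names `parse_env` and `missing_settings` are hypothetical.

```python
# Keys taken from the comments in the .env example above.
REQUIRED_KEYS = ("DATA_DIR", "WORK_DIR", "MYSQL_ROOT_PASSWORD")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(env_text: str) -> list[str]:
    """Return the required keys that are absent or empty."""
    values = parse_env(env_text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Calling `missing_settings(open(".env").read())` returns an empty list when all three values are set, and the names of the unset keys otherwise.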

3. Start the full stack

docker compose up -d

First boot pulls images and initialises databases — allow 5–10 minutes.

4. Verify core services

curl http://127.0.0.1:8080/health   # API Gateway (zero-trust entry point)
curl http://127.0.0.1:8000          # Workbench
curl http://127.0.0.1:8081/health   # TES
curl http://127.0.0.1:9090/health   # ToolServer
curl http://127.0.0.1:8095/health   # Model Registry
curl http://127.0.0.1:7070/health   # Control Center
curl http://127.0.0.1:7070/summary  # Ecosystem health summary (JSON)
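
The same checks can be scripted. This is a rough sketch using only the Python standard library; the URLs are copied from the curl commands above, while the function names and the 3-second timeout are assumptions made for illustration.

```python
import http.client
import urllib.error
import urllib.request

# Endpoints copied from the curl commands above.
HEALTH_URLS = {
    "API Gateway": "http://127.0.0.1:8080/health",
    "Workbench": "http://127.0.0.1:8000",
    "TES": "http://127.0.0.1:8081/health",
    "ToolServer": "http://127.0.0.1:9090/health",
    "Model Registry": "http://127.0.0.1:8095/health",
    "Control Center": "http://127.0.0.1:7070/health",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


def check_all(urls: dict[str, str], probe=is_up) -> dict[str, bool]:
    """Probe every service and return a name -> healthy mapping."""
    return {name: probe(url) for name, url in urls.items()}
```

For example, `for name, ok in check_all(HEALTH_URLS).items(): print(name, "up" if ok else "DOWN")` prints one line per service. The `probe` parameter exists so the loop can be exercised without a running stack.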

5. Open the Studio

Launch the Electron desktop app, or open the Studio in your browser:

http://<your-server-ip>:55761   ← Electron app (desktop)
http://<your-server-ip>:80      ← Direct browser access

Getting Started Guide

An interactive getting started guide is served at:

http://<your-server-ip>/docs/

It covers:

  • Platform overview and architecture
  • Step-by-step first boot setup
  • Runtime mode selection (Beta Cloud / Local / HPC)
  • Running your first workflow via TES
  • Using the RAG / Dev Hub for code and literature search
  • Service ports and common commands
  • Troubleshooting tips

Click Getting Started in the Workbench to open it directly in your browser.


Ecosystem Report

The ecosystem report is a single interactive HTML file covering:

  • Architecture — SVG lane diagram of all services and their connections
  • Projects — Code line distribution across all repositories
  • Languages — Language breakdown across the ecosystem
  • Code Coverage — Per-repo pytest coverage with trend indicators
  • Health Status — Live service and disk health from the Control Center

Generate the Report

# With Control Center running (includes live health data)
python omnibioai-control-center/scripts/generate_report.py \
    --root ~/Desktop/machine

# Without health data (faster, offline)
python omnibioai-control-center/scripts/generate_report.py \
    --root ~/Desktop/machine \
    --skip-health

# Skip coverage collection (faster)
python omnibioai-control-center/scripts/generate_report.py \
    --root ~/Desktop/machine \
    --skip-coverage

View the Report

  • File: work/out/reports/omnibioai_ecosystem_report.html
  • Browser: Open directly in any browser
  • Control Center: http://127.0.0.1:7070/report (served live when Control Center is running)

Requirements

pip install pandas
sudo apt-get install cloc   # or brew install cloc on macOS
pip install pytest pytest-cov   # optional: coverage collection is best-effort

Control Center

The Control Center (omnibioai-control-center/) is the operational dashboard for the ecosystem:

  • GET /health — Control Center self-check
  • GET /services — Per-service health status
  • GET /summary — Full ecosystem summary (services + disk)
  • GET /report — Serves the pre-generated ecosystem HTML report

Health checks cover HTTP endpoints, TCP ports (MySQL, Redis), and disk usage thresholds. Configuration lives in omnibioai-control-center/config/control_center.yaml.
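
To make the TCP and disk check types concrete, here is a minimal sketch using the standard library. It is not the Control Center's implementation: the function names and the 80% / 90% thresholds are assumptions, and the real thresholds live in control_center.yaml.

```python
import shutil
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def disk_used_fraction(path: str = "/") -> float:
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total if usage.total else 0.0


def disk_status(used_fraction: float, warn: float = 0.80, crit: float = 0.90) -> str:
    """Classify a used-space fraction against two (hypothetical) thresholds."""
    if used_fraction >= crit:
        return "critical"
    if used_fraction >= warn:
        return "warning"
    return "ok"
```

With the stack running, `tcp_port_open("127.0.0.1", 3306)` and `tcp_port_open("127.0.0.1", 6379)` would cover MySQL and Redis, and `disk_status(disk_used_fraction("/"))` gives a coarse disk verdict.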


Zero-Trust Security Control Plane

Every request entering the ecosystem passes through the API Gateway (omnibioai-api-gateway, port 8080), which enforces:

  1. Authentication — JWT validation via omnibioai-auth (port 8001), with Redis-cached tokens (TTL=300s) and pub/sub invalidation on logout
  2. Authorization — ABAC/RBAC policy evaluation via omnibioai-policy-engine (port 8002)
  3. HPC Quota Governance — per-user GPU/CPU limits via omnibioai-hpc-policy-engine (port 8003)
  4. Audit Logging — every action streamed asynchronously via omnibioai-security-audit (port 8004) using Redis Streams

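The gateway fails closed: if authentication, policy evaluation, or the HPC quota check fails, the request is rejected. Audit logging fails open, so an audit outage never blocks a request. Internal services communicate using X-Internal-Service, X-Trace-Id, and X-User-Id headers.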
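
The difference between failing closed and failing open can be sketched in a few lines. This is an illustration of the pattern only, with hypothetical names (`Decision`, `evaluate`); the checks and the audit hook are plain callables standing in for the real services.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    allowed: bool
    reason: str


def evaluate(
    request: dict,
    checks: list[tuple[str, Callable[[dict], bool]]],
    audit: Callable[[dict, Decision], None],
) -> Decision:
    """Run blocking checks in order, then record the outcome.

    A check that returns False or raises denies the request (fail closed).
    An audit hook that raises is ignored (fail open).
    """
    decision = Decision(True, "ok")
    for name, check in checks:
        try:
            passed = check(request)
        except Exception:
            passed = False  # an unreachable dependency counts as a denial
        if not passed:
            decision = Decision(False, f"{name} denied or unavailable")
            break
    try:
        audit(request, decision)
    except Exception:
        pass  # audit failures must never block the request
    return decision
```

Passing `[("auth", ...), ("policy", ...), ("hpc", ...)]` as `checks` mirrors the order of the enforcement steps; a denied request is still audited, and a broken audit hook does not change the decision.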

Swagger UI

Policy Engine and HPC Policy Engine expose interactive API docs:

http://<server>/_svc/policy/docs    ← Policy Engine
http://<server>/_svc/hpc/docs       ← HPC Policy Engine

Shared Data and Work Directories

omnibioai/data and omnibioai/work are symlinked to the ecosystem-level data/ and work/ directories, making runtime data and workflow outputs accessible to all services:

omnibioai/data  →  Desktop/machine/data/   # PubMed indexes, uploads, model registry
omnibioai/work  →  Desktop/machine/work/   # Workflow runs, reports, coverage

Docker Compose mounts both directories directly into containers so symlinks resolve correctly inside the container filesystem.
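
If the links need to be recreated by hand, something like the following would do it. This is a minimal sketch, assuming the workspace layout shown earlier; `link_shared_dirs` is a hypothetical helper, not a script shipped with the ecosystem.

```python
import os
from pathlib import Path


def link_shared_dirs(root: Path) -> None:
    """Create omnibioai/data and omnibioai/work as relative symlinks."""
    workbench = root / "omnibioai"
    workbench.mkdir(parents=True, exist_ok=True)
    for name in ("data", "work"):
        (root / name).mkdir(exist_ok=True)
        link = workbench / name
        if link.is_symlink():
            continue  # already linked; leave it alone
        if link.exists():
            raise FileExistsError(f"{link} exists and is not a symlink")
        # A relative target (../data, ../work) keeps the workspace relocatable.
        os.symlink(os.path.join("..", name), link, target_is_directory=True)
```

Running `link_shared_dirs(Path.home() / "Desktop" / "machine")` is idempotent: existing links are skipped, and a real directory in the way raises an error rather than being replaced.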

Note: The shared data/work layout currently lives on the feat/ecosystem-data-work-dirs feature branch, pending full testing before it is merged to main.


Operational Modes

| Mode | Control Plane | Compute Plane |
|---|---|---|
| Local dev | Docker Compose (Studio) | Local Docker |
| On-prem | Docker Compose (Studio) | Docker / TES |
| HPC | External VM | Apptainer via TES |
| Hybrid | VM | HPC + TES |
| Cloud | Kubernetes | Kubernetes |

Key Design Principles

  • Single workspace root — all repos are siblings under one directory
  • Shared data/work — ecosystem-level directories accessible to all services
  • No absolute paths — fully portable across machines
  • Zero-trust by default — every request is authenticated and authorized
  • Strict service boundaries — control plane ≠ compute plane
  • Restart-safe orchestration — ordered startup with health checks
  • Container-native — OCI-compliant images throughout
  • Environment-driven — all configuration via .env and YAML
  • No forced cloud dependencies — runs fully offline and air-gapped
  • Engine-agnostic workflows — WDL, Nextflow, Snakemake, CWL all supported
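
As an illustration of "ordered startup with health checks", the sketch below starts services in dependency order and waits for each to report healthy before moving on. The dependency map, the `start_all` helper, and the retry settings are all hypothetical; in the actual stack this ordering is handled by Docker Compose.

```python
import time
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical dependency map: each service lists what must be healthy first.
DEPENDS_ON = {
    "mysql": [],
    "redis": [],
    "auth": ["mysql", "redis"],
    "policy-engine": ["auth"],
    "api-gateway": ["auth", "policy-engine"],
    "workbench": ["mysql", "redis", "api-gateway"],
}


def start_all(
    graph: dict[str, list[str]],
    start: Callable[[str], None],
    healthy: Callable[[str], bool],
    attempts: int = 30,
    delay: float = 2.0,
) -> list[str]:
    """Start services dependencies-first, waiting for each to become healthy."""
    started = []
    for name in TopologicalSorter(graph).static_order():
        start(name)
        for _ in range(attempts):
            if healthy(name):
                break
            time.sleep(delay)
        else:
            raise TimeoutError(f"{name} did not become healthy")
        started.append(name)
    return started
```

Because the order is derived from the graph on every run, calling `start_all` again after a crash walks the same sequence, which is what makes the procedure restart-safe as long as `start` is idempotent.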

What This Ecosystem Does Not Do

  • Does not contain bioinformatics algorithms directly (these live in plugin repos)
  • Does not vendor component repositories
  • Does not enforce a single workflow engine
  • Does not hide execution behind opaque AI calls
  • Does not require external SaaS services

Repository Index

| Repository | Visibility | Description |
|---|---|---|
| omnibioai-studio | Private | Electron desktop orchestrator: launches the full stack |
| omnibioai | Private | Workbench: plugin-based Django platform |
| omnibioai-tes | Private | Tool Execution Service: HPC/cloud/local backends |
| omnibioai-toolserver | Private | FastAPI tool execution APIs |
| omnibioai-tool-runtime | Private | Cloud-agnostic container execution contract |
| omnibioai-model-registry | Public | Production ML model registry |
| omnibioai-lims | Private | Laboratory Information Management System |
| omnibioai-control-center | Public | Health dashboard and ecosystem report |
| omnibioai-api-gateway | Private | Zero-trust central API gateway |
| omnibioai-auth | Private | JWT authentication and RBAC service |
| omnibioai-policy-engine | Private | ABAC/RBAC authorization evaluation |
| omnibioai-hpc-policy-engine | Private | HPC compute governance and quota enforcement |
| omnibioai-security-audit | Private | Redis Streams–based audit logging |
| omnibioai-security-sdk | Private | Shared zero-trust security SDK |
| omnibioai-iam-client | Private | Async IAM client SDK |
| omnibioai-rag | Private | RAG-powered bioinformatics assistant |
| omnibioai-dev-hub | Private | RAG V6 developer assistant: FAISS-native code search |
| omnibioai-dev-docker | Private | GPU AI development environment |
| omnibioai-sdk | Private | Python SDK: v1 complete |
| omnibioai-workflow-bundles | Private | Versioned engine-agnostic workflow bundles |
| omnibioai-tool-images | Private | ARM64 Docker/Singularity bioinformatics images |
| omnibioai-ui | Private | Shared React component library |
| omnibioai-design-tokens | Public | Unified CSS/JS design system tokens |
| omnibioai-landing | Private | Marketing and product landing pages |
| omnibioai-docs | Public | Architecture diagrams, technical docs, getting started guide |
| omnibioai-videos | Private | Tutorial video library and Getting Started guide |

Current Status

| Component | Status |
|---|---|
| Multi-service orchestration (Studio) | ✅ Stable |
| Tool Execution Service | ✅ Stable |
| ToolServer | ✅ Stable |
| Tool Runtime | ✅ Stable |
| Model Registry | ✅ Stable |
| LIMS | ✅ Stable |
| RAG Assistant | ✅ Stable |
| Dev Hub (RAG V6) | ✅ Stable |
| Python SDK | ✅ v1 complete |
| Workflow Bundles | ✅ Stable |
| Tool Images (ARM64) | ✅ Stable |
| API Gateway | ✅ Stable |
| Auth Service | ✅ Stable |
| Policy Engine | ✅ Stable (Swagger UI at /_svc/policy/docs) |
| HPC Policy Engine | ✅ Stable (Swagger UI at /_svc/hpc/docs) |
| Security Audit | ✅ Stable |
| Zero-trust control plane | ✅ Stable |
| Getting Started Guide | ✅ Stable (served at /docs/) |
| Control Center | 🔄 Active development |
| Ecosystem Report | 🔄 Active development |
| UI Component Library | 🔄 Active development |
| Shared data/work directories | 🧪 In testing (feature branch) |
| Kubernetes | 📋 Post-beta |

License

See individual repository LICENSE files. Components are independently licensed. omnibioai-model-registry, omnibioai-control-center, and omnibioai-design-tokens are Apache 2.0.


OmniBioAI — reproducible bioinformatics at any scale, on any infrastructure.

© 2025 Manish Kumar. All rights reserved.

About

Top-level orchestration repo for the OmniBioAI platform — Docker Compose configuration wiring all 18+ microservices together, service dependency graph, environment bootstrapping, and platform-wide deployment scripts for local, on-prem, and cloud environments.
