marcosci/dashi

dashi

The essential base for spatial data.

A cloud-native spatial data lake — layered, infused, re-usable. Ingests any OGR/GDAL-readable geodata (vector, raster, point cloud), standardises onto a common zone model (Landing → Processed → Curated → Enrichment → Serving), catalogs everything via STAC, and serves it through SQL, COG raster tiles, and OGC API — Tiles vector tiles. Use cases: Earth observation, environmental analysis, urban planning, logistics, research — anywhere durable spatial storage with reproducible pipelines is needed.
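
The zone order above can be read as a linear promotion path. The following is a minimal sketch under that assumption, not dashi's actual code: the `Zone` enum and the `next_zone` helper are hypothetical names introduced only to illustrate the ordering.

```python
from enum import Enum
from typing import Optional


class Zone(Enum):
    """Zones in the order the README lists them (hypothetical type)."""
    LANDING = "landing"
    PROCESSED = "processed"
    CURATED = "curated"
    ENRICHMENT = "enrichment"
    SERVING = "serving"


def next_zone(zone: Zone) -> Optional[Zone]:
    """Return the zone that follows `zone`, or None for the last one."""
    order = list(Zone)  # Enum iteration preserves definition order
    idx = order.index(zone)
    return order[idx + 1] if idx + 1 < len(order) else None


if __name__ == "__main__":
    zone: Optional[Zone] = Zone.LANDING
    while zone is not None:
        print(zone.value)
        zone = next_zone(zone)
```

Modelling the zones as an ordered enum keeps the promotion order in one place; whether the real pipeline allows skipping or branching between zones is not stated here.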

Status

  • Phase: Phase 2 — production hardening in progress
  • Maintainers: Marco Sciaini + Johannes Schlund
  • License: Apache 2.0 — see LICENSE
  • Phase-0 PoC: ✅ Gate-1 passed
  • PoC focus domain: terrain & environment (sample data: GeoTIFF / Shapefile / GPKG / LAZ)

About the name

Dashi is the Japanese foundational broth that sits under every layered dish: kombu, bonito, water. Unseen but essential. The platform takes its name from that idea — it's the base that every downstream map, analysis, and decision is built on. Built in the open as an Apache-2.0 reference implementation.

Quick start

Browse the docs

python -m venv .venv && source .venv/bin/activate
pip install -r requirements-docs.txt
mkdocs serve                      # http://localhost:8000

Run the PoC locally

Requires Docker + k3d + kubectl on your PATH.

cd poc
make k3s-up                       # local k3d cluster named "dashi"
make storage-deploy               # RustFS + landing/processed/curated buckets
make catalog-deploy               # pgstac + stac-fastapi
make rbac-bootstrap               # per-zone scoped IAM
make serving-deploy               # TiTiler (raster tiles) + DuckDB SQL endpoint
make prefect-up                   # Prefect 3 server + worker
make monitoring-up                # Prometheus + Grafana + kube-state-metrics
make network-policies-up          # default-deny + scoped allow NetworkPolicies
make ogc-deploy                   # Martin (vector tiles) + PostGIS + PMTiles regen
make smoke                        # end-to-end acceptance checks

Full target list: make help. See poc/docs/k3s-setup.md for prerequisites and troubleshooting.
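
Before running the targets, it can help to confirm that the prerequisites named above are actually on your PATH. This is a small illustrative preflight check, not part of the repository; the `missing_tools` helper is a hypothetical name, and it assumes the executables are called `docker`, `k3d`, and `kubectl`.

```python
import shutil
from typing import Iterable, List

# Tools the README names as prerequisites for the local PoC.
REQUIRED_TOOLS = ("docker", "k3d", "kubectl")


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing on PATH: " + ", ".join(missing))
    else:
        print("All prerequisites found; continue with `make k3s-up`.")
```

The check only confirms that the executables exist. It does not verify versions or that the Docker daemon is running; see poc/docs/k3s-setup.md for those details.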

Architecture

   ┌─ web ingest ─────────────────┐    ┌─ Iceberg ────────────────────┐
   │ React+Vite SPA               │    │ tabulario/iceberg-rest       │
   │ FastAPI shim ── presign/scan │    │ s3://curated/iceberg/        │
   │            ── trigger/runs   │    │ promote_to_iceberg flow      │
   │            ── catalog detail │    └──────────────┬───────────────┘
   └────────────┬─────────────────┘                   │
                │                                      │ DuckDB iceberg ext
                ▼                                      ▼
   GeoTIFF, Shapefile,    ┌──────────────────────────────────────────────┐
   GPKG, KML, LAZ, NetCDF │  Landing → Processed → Curated → Enrichment  │
   COPC, GeoParquet, …    │  (RustFS S3 — versioned, optional ObjectLock)│
            │             └──────────────┬───────────────────────────────┘
            ▼                            │
       Prefect 3 flows                   ▼
      (dashi-ingest /         pgstac (STAC catalog, Postgres)
       dashi-retention /          + dashi:* properties:
       dashi-iceberg /              kind / classification /
       dashi-enrich)                prefect_flow_run_id+url /
            │                       enriched_title+description+keywords
            └─────────┬─────────────┘
                      ▼
   Serving      DuckDB SQL · TiTiler raster · Martin MVT · TiPG OGC-Features
                maplibre-gl-lidar pointcloud · 3D Tiles tilesets · Iceberg reads
                      │
                      ▼
   Observability      Prometheus · Grafana · Loki + promtail
   Backups            pg_dump CronJobs (× 3 DBs) → s3://backups + optional offsite
   Auth               Authelia OIDC issuer (scaffolded; oauth2-proxy template)
   Optional LLM       Ollama in dashi-llm (classification-gated enrichment)
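
To make the `dashi:*` properties in the diagram more concrete, here is a rough sketch of how they might sit inside a STAC Item. The exact key spellings (for example `dashi:prefect_flow_run_url`), the helper names, and every value are assumptions read off the diagram for illustration, not the schema the project actually uses.

```python
import json
from typing import Any, Dict, List


def dashi_properties(kind: str, classification: str, flow_run_id: str,
                     flow_run_url: str, title: str, description: str,
                     keywords: List[str]) -> Dict[str, Any]:
    """Build the dashi:* block; exact key spellings are assumptions."""
    return {
        "dashi:kind": kind,
        "dashi:classification": classification,
        "dashi:prefect_flow_run_id": flow_run_id,
        "dashi:prefect_flow_run_url": flow_run_url,
        "dashi:enriched_title": title,
        "dashi:enriched_description": description,
        "dashi:enriched_keywords": keywords,
    }


def make_item(item_id: str, datetime_utc: str,
              props: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap the properties in a minimal STAC Item skeleton."""
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": None,  # placeholder; a real item would carry a footprint
        "properties": {"datetime": datetime_utc, **props},
        "links": [],
        "assets": {},
    }


if __name__ == "__main__":
    # All values below are placeholders, not real dashi data.
    props = dashi_properties(
        kind="raster", classification="public",
        flow_run_id="example-run-id",
        flow_run_url="http://localhost/example-run",
        title="Example DEM", description="Placeholder description.",
        keywords=["terrain"],
    )
    item = make_item("example-item", "2026-01-01T00:00:00Z", props)
    print(json.dumps(item, indent=2))
```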

Standardised formats: COG (raster), GeoParquet (vector), COPC (point cloud), PMTiles (tile bundles). Spatial partitioning via H3. Single processing engine via DuckDB + GDAL/PDAL.
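
As a rough illustration of the standardisation rule, the sketch below maps an input file to the target format for its data kind. The target formats come from the sentence above; the suffix table and the `target_format` helper are assumptions made for this example. The real ingest presumably inspects files through GDAL/PDAL rather than trusting file suffixes.

```python
from pathlib import PurePath
from typing import Optional

# Target formats per data kind, as stated in the README.
TARGET_FORMAT = {"raster": "COG", "vector": "GeoParquet", "pointcloud": "COPC"}

# Assumed suffix -> kind table; illustrative and deliberately incomplete.
KIND_BY_SUFFIX = {
    ".tif": "raster", ".tiff": "raster", ".nc": "raster",
    ".shp": "vector", ".gpkg": "vector", ".kml": "vector",
    ".parquet": "vector",
    ".laz": "pointcloud", ".las": "pointcloud",
}


def target_format(filename: str) -> Optional[str]:
    """Return the standardised format for a file, or None if unknown."""
    suffix = PurePath(filename).suffix.lower()
    kind = KIND_BY_SUFFIX.get(suffix)
    return TARGET_FORMAT[kind] if kind else None


if __name__ == "__main__":
    for name in ("dem.tif", "roads.shp", "scan.copc.laz", "notes.txt"):
        print(name, "->", target_format(name))
```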

Repository layout

dashi/
├── README.md                      # This file
├── CONTRIBUTING.md                # How to contribute
├── CODE_OF_CONDUCT.md             # Community standards
├── LICENSE                        # Apache 2.0
├── CLAUDE.md                      # Agent / AI working instructions
├── mkdocs.yml                     # MkDocs Material site config
├── docs/                          # Architecture spec + site root
│   ├── index.md                   # Public homepage
│   ├── 01-summary.md … 10-risks-open-questions.md
│   ├── FEATURE-IDEAS.md           # Backlog for new ideas
│   └── assets/                    # Logo, favicon, brand tokens
├── adr/                           # Architecture Decision Records
├── poc/                           # PoC — k3s manifests + ingest + flows
│   ├── ingest/                    # dashi-ingest (Python, format-agnostic)
│   ├── manifests/                 # K8s manifests per component
│   ├── flows/                     # Prefect flows
│   └── smoke/                     # End-to-end acceptance checks
├── agents/                        # Task briefs for AI agents
└── templates/                     # Doc templates (ADR, requirement, risk)

Documentation map

Chapter                            Topic
01                                 Executive summary
02                                 Context and motivation
03                                 Goals and non-goals
04                                 Stakeholders and roles (RACI)
05                                 Functional and non-functional requirements
06                                 Current state (greenfield)
07                                 Zone model
08                                 ADR overview
09                                 Phase plan
10                                 Open questions and risk register
Phase-0-Roadmap · Phase-2-Roadmap  Active work tracks
FEATURE-IDEAS                      Backlog of future ideas
Glossary · ID reference            Lookups

Contributing

Contributions are welcome — see CONTRIBUTING.md for the workflow, code style, and how to file issues. By participating you agree to follow the Code of Conduct.

Working language

The architecture chapters preserve the original spec language (German). Public-facing surfaces (README, code, commit messages, agent instructions) are in English. New docs may be written in either language.

License

Apache License 2.0 — see LICENSE. Copyright © 2026 the dashi contributors.
