SAGA

SAGA (Semantic And Graph-enhanced Authoring) is a local-first, single-user Python CLI for AI-driven long-form fiction generation. It uses:

LangGraph for workflow orchestration (checkpointed, resumable)
Neo4j as a persistent knowledge graph to store canon (characters, locations, relationships, events)
The local filesystem for human-readable artifacts and outputs

Core philosophy: single machine, no web app, no microservices (see docs/PROJECT_CONSTRAINTS.md).

For writers: what you get

SAGA's goal is to help you generate a novel while staying consistent with established story facts.

Running it produces (under projects/{story_title}/):

Drafted and finalized chapters (projects/{story_title}/chapters/)
Chapter summaries (projects/{story_title}/summaries/)
Exportable compiled manuscript (projects/{story_title}/exports/)
Human-readable YAML artifacts (projects/{story_title}/characters/, projects/{story_title}/world/, projects/{story_title}/outline/)
Resumable workflow checkpoints (projects/{story_title}/.saga/checkpoints.db)

The intended design is: your canon lives in Neo4j, while your artifacts live in files, so you can inspect and version outputs easily.

Status: not production-ready

SAGA currently has known critical issues and is not production-ready.

Screenshots

Progress window (Rich CLI):

Example KG snapshot (5 chapters):

Developer quickstart

Prerequisites

Python 3.12 (see pyproject.toml)
A running Neo4j instance (Docker is provided via docker-compose.yml)
A local OpenAI-compatible LLM endpoint (for text generation)
A local embeddings endpoint

Setup

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env

Edit your .env to match your local services (see .env.example).

Start Neo4j (Docker)

docker-compose up -d

Neo4j defaults in this repo:

user: neo4j
password: saga_password

Run SAGA's bootstrap mode with a high-level plot specification (see `docs/bootstrapper.md`)

python main.py bootstrap "A suspenseful thriller set inside a deep-sea facility at the bottom of the ocean"

Review your projects/{story_title}/config.json to edit SAGA's generated content for the high-level plot specification if desired (optional).

Start SAGA's generation mode using the information from bootstrap mode

python main.py generate

SAGA can also be started directly with the high-level plot specification to auto-accept LLM choices

python main.py generate "A suspenseful thriller set inside a deep-sea facility at the bottom of the ocean"

Configuration

SAGA uses Pydantic settings loaded from .env (see config/settings.py).

Key environment variables (examples in .env.example):

OPENAI_API_BASE: OpenAI-compatible base URL (example: http://127.0.0.1:8080/v1)
OPENAI_API_KEY: token (can be dummy for purely local gateways)
EMBEDDING_API_BASE: embeddings endpoint base URL (example: http://127.0.0.1:11434)
EMBEDDING_MODEL: embedding model name
EXPECTED_EMBEDDING_DIM and NEO4J_VECTOR_DIMENSIONS: must match your embedding model's output dimension

Important:

Defaults in config/settings.py assume a 1024-dim embedding model unless overridden.
The sample .env.example uses 768.

Keep embedding dimensions consistent across: EXPECTED_EMBEDDING_DIM, NEO4J_VECTOR_DIMENSIONS, and your embedding model.

Local LLM endpoint expectation

SAGA assumes generation is done via an OpenAI-compatible HTTP API (configured via OPENAI_API_BASE).

Embeddings are configured separately via EMBEDDING_API_BASE. Any embedding service that matches your configured model and dimension works.

Outputs

High-level output layout:

projects/{story_title}/
├── .saga/
│   ├── checkpoints.db          # LangGraph SQLite checkpoints
│   ├── logs/                   # Runtime logs per chapter
│   └── content/                # Externalized content (blobs)
│       ├── drafts/
│       ├── outlines/
│       ├── summaries/
│       ├── extractions/
│       └── embeddings/
├── chapters/            # Finalized chapters (markdown)
│   └── chapter_01.md
├── summaries/           # Per-chapter summaries
│   └── chapter_01_summary.txt
├── outline/             # Story/act/chapter outline YAML
│   ├── structure.yaml          # Global act/plot structure
│   └── beats.yaml              # Detailed outline per act/chapter
├── characters/          # Character profiles YAML
│   └── protagonist.yaml
├── world/               # World items + rules YAML
│   └── items.yaml
└── exports/             # Compiled manuscript exports
    └── novel_full.md

How it works (high level)

SAGA uses a two-phase workflow:

1. Initialization Phase

Runs once per project to establish the narrative foundation:

Generate character sheets for main characters
Create a global outline with act structure (3 or 5 acts)
Expand into detailed act outlines with chapter-level beats
Commit initialization to Neo4j knowledge graph
Persist artifacts as YAML/Markdown files

2. Generation Loop (Chapter-by-Chapter)

Repeats for each chapter:

Chapter Planning: Generate a detailed scene-by-scene outline
Scene-level Generation:
- Plan scenes from the chapter outline
- Retrieve relevant context from Neo4j (characters, relationships, events)
- Generate prose for each scene individually
Extraction: Extract entities and relationships from each scene, then consolidate
Embedding Generation: Create vector embeddings for the chapter
Relationship Normalization: Map extracted relationships to canonical types
Commit to Graph: Persist extracted information to Neo4j with deduplication
Validation:
- Consistency checks (relationship validation, trait consistency, plot advancement)
- LLM-based quality evaluation (coherence, prose quality, pacing, tone)
- Contradiction detection (abrupt relationship changes, timeline issues)
Revision Loop: If validation fails, revise the chapter and re-enter extraction
Finalization: Generate summary, write final markdown, perform graph healing
Quality Assurance: Final checks before advancing to next chapter

Key Features

Scene-level generation: Chapters are broken into scenes, each generated with relevant context from the knowledge graph
Content externalization: Large text blobs stored on disk via ContentRef to keep checkpoints lightweight
Graph healing: Automated maintenance to merge duplicates and enrich provisional nodes
LLM-based quality evaluation: Automatic scoring of coherence, prose quality, pacing, and tone
Extended contradiction detection: Relationship evolution checks and graph consistency validation
Bootstrap capability: Converts high-level prompts into structured project configurations via python main.py bootstrap "A western about a robot sheriff", with optional review step before generation

For the deep dive, start here:

Architecture overview: docs/langgraph-architecture.md
Workflow walkthrough: docs/WORKFLOW_WALKTHROUGH.md
Workflow visualization tooling: docs/WORKFLOW_VISUALIZATION.md

Troubleshooting

Logs and debug artifacts:
- Check projects/{story_title}/.saga/logs/ and projects/{story_title}/.saga/content/
Neo4j reset (destructive):
- Run python reset_neo4j.py
Neo4j connection failures:
- Confirm NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD match .env.example

License

Apache-2.0 - See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 1,301 Commits
config		config
core		core
data_access		data_access
docs		docs
models		models
orchestration		orchestration
processing		processing
prompts		prompts
tests		tests
ui		ui
utils		utils
.env.example		.env.example
.gitattributes		.gitattributes
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
SAGA-KG-Ch5.png		SAGA-KG-Ch5.png
SAGA.png		SAGA.png
backfill_entity_embeddings.py		backfill_entity_embeddings.py
cleanup.sh		cleanup.sh
docker-compose.yml		docker-compose.yml
main.py		main.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
reset_neo4j.py		reset_neo4j.py
visualize_workflow.py		visualize_workflow.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SAGA

For writers: what you get

Status: not production-ready

Screenshots

Developer quickstart

Prerequisites

Setup

Start Neo4j (Docker)

Run SAGA's bootstrap mode with a high-level plot specification (see `docs/bootstrapper.md`)

Start SAGA's generation mode using the information from bootstrap mode

SAGA can also be started directly with the high-level plot specification to auto-accept LLM choices

Configuration

Local LLM endpoint expectation

Outputs

How it works (high level)

1. Initialization Phase

2. Generation Loop (Chapter-by-Chapter)

Key Features

Troubleshooting

License

About

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

SAGA

For writers: what you get

Status: not production-ready

Screenshots

Developer quickstart

Prerequisites

Setup

Start Neo4j (Docker)

Run SAGA's bootstrap mode with a high-level plot specification (see docs/bootstrapper.md)

Start SAGA's generation mode using the information from bootstrap mode

SAGA can also be started directly with the high-level plot specification to auto-accept LLM choices

Configuration

Local LLM endpoint expectation

Outputs

How it works (high level)

1. Initialization Phase

2. Generation Loop (Chapter-by-Chapter)

Key Features

Troubleshooting

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Uh oh!

Contributors

Uh oh!

Languages

Run SAGA's bootstrap mode with a high-level plot specification (see `docs/bootstrapper.md`)