Your AI doesn't know you; it can't remember yesterday. Meet work buddy.
Docs • Quick Start • How It Works • Why It's Different • Features • Architecture • Contributing
work-buddy is a personal agent framework built on Claude Code and Obsidian. It orchestrates tasks, manages workflows, and coordinates across projects so you can focus on your actual work. It gives your AI agent structured multi-step workflows, memory that survives across sessions, deep integration with external tooling, and a dashboard that keeps you in the loop.
Runs on your existing Claude Code subscription — no separate service fees. The agent you're already paying for does the work; your data stays on your machine.
90+ capabilities • 15+ structured workflows • 36 slash commands • 277 Python modules
The dashboard's Chats tab — browsing and searching across agent sessions.
- Review today's work state across notes, tasks, git, browser, and calendar before planning
- Triage 40 open Chrome tabs into close, task, group, or keep decisions
- Empty your scratchpad — quick captures get routed into tasks, references, or kept as open questions
- Run a morning routine that writes a briefing, picks your top priorities, and generates a day plan
- Keep agent sessions coordinated through dashboard threads, notifications, and approvals
- Preserve user agency — automate what's deterministic; require review, approval, and steering for what's ambiguous
- Reduce coordination burden — structure the work around your work so you don't have to
- Maximize cost efficiency — only invoke the LLM when reasoning is actually required; run deterministic steps as code
- Build your own customized workflows — the same gateway and conductor you use to work are used to extend the framework
Modern knowledge work is fragmented across notes, tasks, projects, browser tabs, contracts, calendars, and ephemeral agent sessions. The problem is not just that AI forgets yesterday — that's getting solved. It's that neither the AI nor the user has a good runtime for coordinating all this state.
Without that layer, you either do the coordination work manually or let the assistant act with too little grounding and too little oversight. You end up repeating context, re-explaining priorities, and manually stitching together work that should flow smoothly every time.
work-buddy exists to close that gap by giving your agent structure (workflows and capabilities), continuity (persistent context across sessions), and reach (integrations with the tools where your work actually lives).
The design principle: automate what is deterministic, surface what is ambiguous, and preserve your agency.
work-buddy runs a local MCP server that extends Claude Code with a gateway pattern — a handful of tools that empower your agent with many capabilities, letting it run complex, cohesive end-to-end workflows:
wb_search → discover what's available (natural language)
wb_run → execute a capability or start a workflow
wb_advance → step through a multi-step workflow
wb_status → check progress or system health
Capabilities are single functions. Workflows are multi-step DAGs with dependency ordering and persistent state. Both live in the knowledge store — a typed, searchable registry that agents query at runtime.
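As a rough sketch, an agent-side loop over those four tools might look like the following. The tool names come from the list above; the Python call shapes and field names (`id`, `run_id`, `status`) are illustrative assumptions, not the actual MCP client API.

```python
# Hypothetical sketch of the search -> run -> advance loop.
# The gateway object and its return shapes are invented for illustration.

def run_discovered_workflow(gateway, query: str) -> dict:
    # 1. Discover: natural-language search over the capability registry
    hits = gateway.wb_search(query)
    workflow_id = hits[0]["id"]

    # 2. Execute: start the workflow; the conductor auto-runs code steps
    state = gateway.wb_run(workflow_id)

    # 3. Step: advance through the DAG until every step has completed
    while state["status"] != "complete":
        state = gateway.wb_advance(state["run_id"])
    return state
```

The key property is that the agent never needs the full tool list up front; it searches first, then executes whatever the registry returns.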
Most agent frameworks route everything through the LLM — every step, every decision, every data transformation. This is expensive, slow, and fragile. work-buddy takes a different approach: use the model as little as possible.
Workflows interleave programmatic steps (deterministic code — config loading, data formatting, API calls) with agentic steps (LLM reasoning — synthesis, judgment, user interaction). The conductor runs code steps automatically and only hands control to the agent when reasoning is actually needed.
The heuristic is simple: if you can write a unit test with a fixed expected output, it's a code step. If the "correct" output depends on interpretation, it's an agent step. The result is workflows that are faster, cheaper, more reproducible — and more powerful, because the agent's context isn't wasted on mechanical work.
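A minimal illustration of that heuristic (the function names and step split are hypothetical, not work-buddy's actual step API):

```python
from datetime import date, timedelta

# Code step: a unit test can pin the exact output, so no LLM is needed.
def resolve_target_date(today: date, offset_days: int = 0) -> str:
    return (today + timedelta(days=offset_days)).isoformat()

assert resolve_target_date(date(2025, 1, 6)) == "2025-01-06"  # deterministic

# Agent step: the "correct" output depends on interpretation -- which of
# these tasks matters most today? That judgment is handed to the LLM,
# so there is no fixed expected output to assert against.
def prioritize_tasks(tasks: list[dict]) -> list[dict]:
    raise NotImplementedError("agent step: requires LLM reasoning")
```

The first function belongs in the DAG as an auto-run step; the second is exactly the kind of step the conductor hands to the agent.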
Example: what a morning routine looks like
> /wb-morning
Step 1/9: [auto] Load config and resolve target date ← code
Step 2/9: [auto] Read sign-in state from journal ← code
Step 3/9: [agent] Collect and synthesize context ← reasoning
Step 4/9: [auto] Fetch contract health data ← code
Step 5/9: [auto] Pull calendar events ← code
Step 6/9: [agent] Task briefing — prioritize, flag issues ← reasoning
Step 7/9: [agent] Metacognition check — detect drift ← reasoning
Step 8/9: [agent] Generate day plan ← reasoning
Step 9/9: [auto] Write briefing to journal ← code
Five of nine steps run as deterministic code — no tokens spent, no latency, no variability. The agent only engages for the four steps that genuinely need judgment. The conductor manages the DAG, blocks on unmet dependencies, and persists state so you can resume if interrupted.
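The dependency ordering the conductor enforces can be sketched with Python's stdlib `graphlib`. The step names below are invented, modeled loosely on the morning routine; the real conductor's step schema (execution policy, persisted state) is richer than this.

```python
from graphlib import TopologicalSorter

# Hypothetical step table: each step lists its dependencies and whether it
# runs as deterministic code or requires agent reasoning.
steps = {
    "load_config": {"deps": [],              "kind": "code"},
    "read_signin": {"deps": ["load_config"], "kind": "code"},
    "collect_ctx": {"deps": ["read_signin"], "kind": "agent"},
    "day_plan":    {"deps": ["collect_ctx"], "kind": "agent"},
}

# Dependency ordering: a step never runs before the steps it depends on.
order = list(TopologicalSorter({k: v["deps"] for k, v in steps.items()}).static_order())

# The conductor auto-runs "code" steps and yields control on "agent" steps.
plan = [(name, "auto" if steps[name]["kind"] == "code" else "agent") for name in order]
```

Persisting `order` plus a cursor into it is what makes resume-after-interrupt cheap: the DAG doesn't need to be re-derived, only replayed from the last completed step.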
Most agent frameworks help developers build agents. work-buddy is narrower and more opinionated: it helps a person run AI-assisted knowledge work locally, against the tools and context where their work already lives.
- Unlike generic orchestration frameworks, work-buddy ships with concrete workflows for planning, task triage, backlog handling, context collection, browser triage, and project coordination.
- Unlike autonomy-first agent systems, work-buddy is built around review, approval, correction, and user steering.
- Unlike cloud-first control planes, work-buddy keeps the runtime local with strong user-facing oversight surfaces.
The dashboard is not just observability. It is part of the control loop: a place for live status, persistent threads, decision prompts, notifications, and reviewable workflow views, so you can steer the system without doing all the coordination work yourself.

| Component | What it does |
|---|---|
| MCP Gateway | Four tools, dynamic discovery. wb_search("tasks") finds every task capability with full parameter schemas. No guessing — search first, then execute. |
| Workflow Conductor | Multi-step DAGs with dependency ordering, auto-run steps for deterministic code, execution policy (main session vs. subagent), and persistent state. Workflows chain into sub-workflows. |
| Knowledge Store | Typed JSON registry with hierarchical navigation. Agents query agent_docs at runtime — capabilities, workflows, and documentation are all discoverable in one call. |
| Human-in-the-Loop | Consent-gated operations, multi-surface notifications, persistent threads, and live observability. More below. |
| Workflow | What it does |
|---|---|
| Morning Routine | Collects fresh context, checks sign-in, reviews contracts/tasks/calendar, synthesizes a briefing, picks your top priorities, and generates a day plan. |
| Chrome Triage | Clusters and summarizes open tabs, asks clarifying questions when needed, collects user decisions, and executes approved actions. |
| Process Backlog | Walks through captured notes one thread at a time, routing each item into a task, a reference, or an open question — and leaves whatever is unresolved behind as a cleaner backlog. |
| Task Triage / Weekly Review | Reviews inbox, staleness, commitments, and active work with structured follow-through. |

| Feature | What it does |
|---|---|
| Obsidian | Deep vault access via a custom bridge plugin — native integration with Tasks, Day Planner, Tag Wrangler, Smart Connections, Datacore, and Google Calendar. Not file I/O; plugin-level access. |
| Persistent Memory | Built on Hindsight. Your agent retains preferences, project context, and working patterns across sessions. Semantic search over your memory bank. |
| Telegram | Mobile command center: approve consent, resume sessions, trigger workflows, capture notes — all from your phone. |
| Chrome | Companion extension exports open tabs. Semantic clustering, content extraction, activity inference, and a four-tier triage workflow. |
| Web Dashboard | Live observability, thread conversations, session browsing, task board, notification management. Remote access via Tailscale. |
| Task Management | Full lifecycle: create, triage, assign, track, review. Weekly reviews, inbox triage, stale-task detection — all built-in workflows. |
| Contract System | Explicit work commitments with claims, evidence plans, stop rules, and Theory of Constraints bottleneck tracking. |
| Context Collection | Gather signals from git, Obsidian, conversations, Chrome, calendar into structured bundles. Agents orient on what you've been doing before deciding what to do next. |
| Metacognition | Framework for any kind of self-accountability: name the patterns you want help catching (work habits, focus, health signals, whatever you want to be held to), document them in personal knowledge, and the agent scans for them and responds with the matching intervention level. |
| Inter-Agent Messaging | Asynchronous message passing between sessions. Hand off tasks, share findings, coordinate — without human relay. |
| Project System | Project registry with identity, observations, and memory. Track decisions, pivots, blockers across time. Auto-discovery from task tags and git repos. |
| Sidecar Supervisor | Manages long-running services (messaging, embedding, Telegram, dashboard) — starts on demand, restarts on failure, health-checks on schedule. |
| Feature Toggles | Dependency-aware system lets you enable/disable subsystems based on what you have installed. Core stays lean. |
Powerful agents are only useful if you can trust them. work-buddy is built around human-in-the-loop by default — not as a safety afterthought, but as a core design choice.
Consent-gated operations. Sensitive actions — deleting tasks, pruning memory, modifying vault content — require your explicit approval before they execute. Consent requests are delivered simultaneously to every surface you have connected. Grants are session-scoped and time-limited.
Respond from anywhere. Consent requests, notifications, and decision prompts arrive on your phone (Telegram), in your knowledge base (Obsidian modals), and on the web dashboard — all at once. Respond on whichever surface is convenient; the others auto-dismiss. First response wins.
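The "first response wins" pattern can be sketched with threads standing in for the Telegram, Obsidian, and dashboard surfaces. This is hypothetical illustration code, not work-buddy's implementation, and it omits the auto-dismiss signaling back to the losing surfaces.

```python
import queue
import threading

def ask_on_all_surfaces(surfaces):
    """Broadcast a prompt to every surface; return the first response.

    `surfaces` is a list of (name, respond) pairs, where respond() blocks
    until that surface's user answers.
    """
    responses: queue.Queue = queue.Queue()
    for name, respond in surfaces:
        # One worker per surface; each pushes (surface, answer) when done.
        threading.Thread(
            target=lambda n=name, r=respond: responses.put((n, r())),
            daemon=True,
        ).start()
    # Block until the fastest surface answers; the rest are ignored
    # (in the real system, their prompts would be auto-dismissed).
    return responses.get()
```

Making the grant session-scoped and time-limited then amounts to attaching an expiry to whatever this returns.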
Mobile command center. From Telegram, you can approve consent requests, respond to agent questions, resume previous sessions, trigger slash commands, and capture notes — without being at your computer. Turn any agent session into a remotely supervised one.
Live observability. The web dashboard gives you a real-time view of what your agents are doing: active sessions, task state, contract health, notification queue, and full conversation history. Accessible remotely via Tailscale.
Thread conversations. Agents can open persistent chat threads on the dashboard for multi-turn discussions that outlive any single session — asking questions, reporting progress, and collecting decisions over time.
Your agents work autonomously when they can, and check in when they should. You set the boundaries.
Just wait until you see how this one works.
work-buddy builds work-buddy. The same gateway, conductor, knowledge store, and slash commands that manage your daily work are also used to extend the framework itself — tell an agent what you want, and the foundation does the heavy lifting so your idea ships instead of stalling.
Want to add a workflow? Tell your agent what you want, and it will:
- Create a `WorkflowUnit` in the knowledge store with your step DAG and instructions
- Register any new capabilities needed by the workflow
- Create a slash command as a thin launcher
- Run `/wb-dev-test` to validate everything passes
- Run `/wb-dev-push` to confirm it's ready to ship
You direct the agent. The agent writes the code. work-buddy provides the structure so neither of you gets lost.
The dev toolkit

| Command | What it does |
|---|---|
| `/wb-dev` | Orient on architecture, patterns, and where to look |
| `/wb-dev-test` | Run the right tests for what changed, check coverage, report readiness |
| `/wb-dev-push` | Pre-push checklist: tests, knowledge store validation, DAG integrity |
| `/wb-dev-retro` | Critique this session's execution, diagnose issues, hand off fixes |
| `/wb-task-handoff` | Package context so the next session can continue seamlessly |
```mermaid
graph TB
    subgraph "Claude Code"
        CC[Claude Code Session]
        SC[Slash Commands]
    end
    subgraph "MCP Gateway"
        GW[4 Gateway Tools]
        REG[Capability Registry]
        COND[Workflow Conductor]
    end
    subgraph "Core Services"
        MSG[Messaging Service<br/>Port 5123]
        EMB[Embedding Service<br/>Port 5124]
        TG[Telegram Bot<br/>Port 5125]
        DASH[Dashboard<br/>Port 5127]
    end
    subgraph "Integrations"
        OBS[Obsidian Bridge<br/>Port 27125]
        MEM[Hindsight Memory]
        CAL[Google Calendar]
        CHR[Chrome Extension]
    end
    subgraph "Data Layer"
        VAULT[(Obsidian Vault)]
        TASKS[(Task Store)]
        CONTRACTS[(Contracts)]
        SESSIONS[(Session Ledger)]
        KNOW[(Knowledge Store)]
    end

    CC --> GW
    SC --> GW
    GW --> REG
    GW --> COND
    REG --> MSG & EMB & OBS & MEM & CAL & CHR
    COND --> REG
    OBS --> VAULT & TASKS & CAL
    TG --> MSG
    DASH --> MSG & TASKS & CONTRACTS & SESSIONS
    MEM --> SESSIONS
    REG --> KNOW
```
A sidecar supervisor manages long-running services — starts them on demand, restarts on failure, health-checks on schedule.
- Install and connect the MCP server
- Run `/wb-setup guided`
- Open the dashboard
- Try `/wb-morning` or `/wb-task-triage`
- Claude Code (CLI or Desktop)
- Python 3.11 (via Miniforge or similar)
- Obsidian (recommended, not strictly required for core functionality)
```bash
git clone https://github.com/KadenMc/work-buddy.git
cd work-buddy

conda create -n work-buddy python=3.11 -y
conda activate work-buddy

pip install poetry
poetry install

# Optional features
poetry install --extras memory    # Persistent memory (Hindsight)
poetry install --extras telegram  # Telegram bot
poetry install --extras all       # Everything

cp config.example.yaml config.yaml
# Edit: vault path, timezone, enabled services

cp config.local.yaml.example config.local.yaml
# Edit: machine-specific overrides (Tailscale URL, Hindsight bank, feature preferences)
```

Machine-specific overrides (e.g., `hindsight.bank_id`) go in `config.local.yaml` (gitignored).
First-time setup: After connecting to Claude Code, run /wb-setup guided for an interactive walkthrough that validates your configuration, lets you choose which features to enable, and checks that everything is wired correctly. The wizard will flag missing requirements with fix instructions.
```json
{
  "mcpServers": {
    "work-buddy": {
      "command": "work-buddy-mcp",
      "args": []
    }
  }
}
```

Then:

```
> /wb-morning     # Run the morning routine
> /wb-dev         # Enter development mode
```
The default install uses CPU-only PyTorch from PyPI. If you have an NVIDIA GPU, installing CUDA-enabled PyTorch will significantly speed up embeddings, semantic search, and anything that touches sentence-transformers.
[Windows/Linux] NVIDIA CUDA
After poetry install, override torch with the CUDA wheel:
```bash
# Install CUDA-enabled PyTorch (replaces the CPU-only wheel)
pip install torch --index-url https://download.pytorch.org/whl/cu126 --force-reinstall
```

Verify GPU access:

```bash
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
# Expected: True NVIDIA GeForce RTX ...
```

This replaces the CPU wheel in your virtualenv with the CUDA 12.6 build. The `poetry.lock` stays clean (CPU-only) so CI and other environments aren't affected.
[macOS] Apple Silicon (MPS)
The default PyPI torch wheel includes MPS support on Apple Silicon. No extra step needed.
```bash
python -c "import torch; print(torch.backends.mps.is_available())"
```

Note: `pyproject.toml` pins `python = ">=3.11,<3.12"` because `triton` (a torch dependency) doesn't declare support for Python 3.14+, and Poetry's resolver rejects ranges that could include unsupported versions.
If you installed with --extras memory, you need PostgreSQL with pgvector and the Hindsight server.
1. Install PostgreSQL and pgvector via conda
These are compiled server processes, not Python packages — install through conda:
```bash
conda install -c conda-forge postgresql pgvector -y
```

2. Initialize PostgreSQL (first time only)
```bash
# Create a database cluster with UTF-8 encoding
initdb -D ~/hindsight-pgdata -U postgres -E UTF8 --locale=en_US.UTF-8

# Start the server
pg_ctl -D ~/hindsight-pgdata -l ~/hindsight-pgdata/logfile start

# Create the hindsight database and enable pgvector
createdb -U postgres hindsight
psql -U postgres -d hindsight -c "CREATE EXTENSION IF NOT EXISTS vector;"
```

To stop the server later: `pg_ctl -D ~/hindsight-pgdata stop`
3. Configure and start Hindsight
Set environment variables (or add to a .env file):
```bash
export HINDSIGHT_API_LLM_PROVIDER=anthropic
export HINDSIGHT_API_LLM_API_KEY=<your-anthropic-api-key>
export HINDSIGHT_API_LLM_MODEL=claude-haiku-4-5-20251001
export HINDSIGHT_API_DATABASE_URL=postgresql://postgres@localhost/hindsight
```

Start the server:

```bash
hindsight-api   # Runs at http://localhost:8888
```

Verify: `curl http://localhost:8888/health`
4. Bootstrap the memory bank (first time only)
```bash
python -c "from work_buddy.memory.setup import ensure_bank; ensure_bank()"
```

This creates the personal memory bank (configured as `hindsight.bank_id` in config) with missions, directives, and mental models.
5. Inspect memories (optional)
To browse memories, entities, observations, and mental models in a browser:
```bash
npx @vectorize-io/hindsight-control-plane --api-url http://localhost:8888
```

Then open http://localhost:9999. This is an on-demand inspection tool — run it when you want to browse, not always-on.
Three background services should run when using work-buddy: PostgreSQL (if using Hindsight), Hindsight API (:8888), and the WB-Sidecar (supervises messaging :5123, embedding :5124, dashboard :5127, and optionally Telegram :5125).
One-off start/stop:
```bash
# Sidecar (supervises messaging + embedding + dashboard)
conda activate work-buddy && python -m work_buddy.sidecar &

# Hindsight API (if using memory)
conda activate work-buddy && hindsight-api &

# PostgreSQL (if using memory)
pg_ctl -D ~/hindsight-pgdata -l ~/hindsight-pgdata/logfile start
```

Verify all services:

```bash
curl http://localhost:8888/health   # Hindsight API
curl http://127.0.0.1:5123/health   # Messaging
curl http://127.0.0.1:5124/health   # Embedding
curl http://127.0.0.1:5127/health   # Dashboard
```

Or from within Claude Code: `/wb-setup` runs the setup wizard with automated diagnostics, requirement validation, and feature preference management. Use `/wb-setup-help` for targeted component diagnostics.
Auto-start on login
Run all commands in an elevated PowerShell (right-click → Run as Administrator).
Hindsight PostgreSQL
```powershell
$pgAction = New-ScheduledTaskAction -Execute (Get-Command pg_ctl).Source -Argument "-D $HOME\hindsight-pgdata -l $HOME\hindsight-pgdata\logfile start"
$pgTrigger = New-ScheduledTaskTrigger -AtLogon
$pgSettings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -ExecutionTimeLimit 0
Register-ScheduledTask -TaskName "Hindsight-PostgreSQL" -Action $pgAction -Trigger $pgTrigger -Settings $pgSettings -Description "Start PostgreSQL for Hindsight memory" -RunLevel Limited
```

Hindsight API (10s delay for PG readiness)
```powershell
$hsAction = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-WindowStyle Hidden -ExecutionPolicy Bypass -Command `"conda activate work-buddy; Get-Content <repo-path>\.env | ForEach-Object { if (`$_ -match '^([^#][^=]*)=(.*)$') { [Environment]::SetEnvironmentVariable(`$matches[1].Trim(), `$matches[2].Trim(), 'Process') } }; `$env:PYTHONIOENCODING='utf-8'; hindsight-api`""
$hsTrigger = New-ScheduledTaskTrigger -AtLogon
$hsTrigger.Delay = "PT10S"
$hsSettings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -ExecutionTimeLimit 0
Register-ScheduledTask -TaskName "Hindsight-API" -Action $hsAction -Trigger $hsTrigger -Settings $hsSettings -Description "Start Hindsight memory server" -RunLevel Limited
```

WB-Sidecar (15s delay — supervises messaging + embedding)
```powershell
$scAction = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-WindowStyle Hidden -ExecutionPolicy Bypass -Command `"conda activate work-buddy; python -m work_buddy.sidecar`""
$scTrigger = New-ScheduledTaskTrigger -AtLogon
$scTrigger.Delay = "PT15S"
$scSettings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries -ExecutionTimeLimit 0
Register-ScheduledTask -TaskName "WB-Sidecar" -Action $scAction -Trigger $scTrigger -Settings $scSettings -Description "work-buddy sidecar daemon (supervises messaging + embedding, runs scheduler)" -RunLevel Limited
```

Create service files under `~/.config/systemd/user/`:
hindsight-postgres.service
```ini
[Unit]
Description=PostgreSQL for Hindsight memory

[Service]
Type=forking
ExecStart=%h/miniforge3/envs/work-buddy/bin/pg_ctl -D %h/hindsight-pgdata -l %h/hindsight-pgdata/logfile start
ExecStop=%h/miniforge3/envs/work-buddy/bin/pg_ctl -D %h/hindsight-pgdata stop
Restart=on-failure

[Install]
WantedBy=default.target
```

hindsight-api.service
```ini
[Unit]
Description=Hindsight memory API server
After=hindsight-postgres.service

[Service]
Type=simple
EnvironmentFile=%h/path-to-repo/.env
ExecStart=%h/miniforge3/envs/work-buddy/bin/hindsight-api
Restart=on-failure
RestartSec=5

[Install]
WantedBy=default.target
```

wb-sidecar.service
```ini
[Unit]
Description=work-buddy sidecar daemon
After=hindsight-api.service

[Service]
Type=simple
WorkingDirectory=%h/path-to-repo
ExecStart=%h/miniforge3/envs/work-buddy/bin/python -m work_buddy.sidecar
Restart=on-failure
RestartSec=10

[Install]
WantedBy=default.target
```

Enable and start:

```bash
systemctl --user daemon-reload
systemctl --user enable hindsight-postgres hindsight-api wb-sidecar
systemctl --user start hindsight-postgres hindsight-api wb-sidecar
```

Create plist files under `~/Library/LaunchAgents/`. The pattern is similar to systemd — each plist specifies the program, arguments, and `RunAtLoad=true`. See Apple's launchd.plist(5) man page for the full schema.
The dashboard can be published privately via Tailscale:
```bash
tailscale serve --bg 5127
```

Set `dashboard.external_url` in `config.yaml` to enable "View in dashboard" links in Telegram notifications.
| Layer | Managed by | What it provides |
|---|---|---|
| conda | `conda install` | PostgreSQL server, pgvector extension |
| Poetry | `poetry install` | All Python packages (hindsight, mcp, flask, etc.) |
| Environment vars | Shell config / `.env` | Anthropic API key, DB URL, Telegram token |
| `config.yaml` | Checked into repo | Vault path, timezone, service ports, enabled features |
| `config.local.yaml` | Gitignored | Machine-specific overrides (bank IDs, paths) |
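A rough sketch of the layered override behavior described above: values in `config.local.yaml` win over `config.yaml`, with nested sections merged rather than clobbered. The merge semantics and key names here are illustrative assumptions, not work-buddy's actual loader.

```python
# Hypothetical deep-merge of the two config layers.
def merge_config(base: dict, local: dict) -> dict:
    merged = dict(base)
    for key, value in local.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)  # deep-merge nested sections
        else:
            merged[key] = value  # local value overrides the checked-in default
    return merged

# Example: the local layer adds an external URL without losing the port.
base = {"timezone": "UTC", "dashboard": {"port": 5127}}
local = {"dashboard": {"external_url": "https://wb.tailnet.example"}}
cfg = merge_config(base, local)
```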
All 36 commands are prefixed `wb-` for easy discovery. Highlights:
| Command | What it does |
|---|---|
| `/wb-morning` | Full morning routine: close yesterday, gather context, plan today |
| `/wb-context-collect` | Gather signals from git, Obsidian, chats, Chrome |
| `/wb-task-triage` | Interactive inbox review: batch-decide on tasks |
| `/wb-journal-update` | Detect recent activity, append to today's journal |
| `/wb-meta-blindspots` | Check work against documented failure patterns |
| `/wb-dev` | Enter development mode with architecture orientation |
| `/wb-setup` | Setup wizard: validate config, choose features, diagnose issues |
| `/wb-task-handoff` | Create a task with full handoff context for a new session |
```
work_buddy/            # Python package (277 modules, ~58k LOC)
  mcp_server/          # MCP gateway (4 tools, dynamic discovery)
    workflow.py        # DAG conductor with execution policy
  dashboard/           # Web dashboard (Flask, port 5127)
  messaging/           # Inter-agent messaging service
  notifications/       # Multi-surface notification system
  memory/              # Hindsight memory integration
  telegram/            # Telegram bot sidecar
  obsidian/            # Obsidian bridge + plugin integrations
  knowledge/           # Typed JSON documentation store
  health/              # Feature toggles, diagnostics, setup wizard, requirements
  sessions/            # Conversation inspection + search
knowledge/             # Agent documentation + workflow DAGs (canonical store)
contracts/             # Work commitment tracking
.claude/commands/      # 36 slash commands (wb-* prefix)
tests/                 # pytest + freezegun test suite
```
Each subsystem has its own README.
Layered config system:
- `config.yaml` — project-wide settings (checked in)
- `config.local.yaml` — machine-specific overrides + feature preferences (gitignored)
- `CLAUDE.local.md` — personal behavioral instructions for your agent (gitignored)
Feature preferences live in `config.local.yaml` under a `features:` key. Set `wanted: false` on any component to opt out — agents won't suggest it, the dashboard hides it, and probes are skipped. Run `/wb-setup preferences` to manage these interactively.
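A fragment of what that might look like; only the `features:` key and the `wanted:` flag come from the docs above, and the component names are examples:

```yaml
# config.local.yaml (illustrative fragment)
features:
  telegram:
    wanted: false   # agents won't suggest it; dashboard hides it; probes are skipped
  memory:
    wanted: true
```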
Features are modular. The dependency-aware toggle system lets you enable/disable subsystems based on what you have installed.
work-buddy is pre-release software, actively developed by one person and the agents they direct. It works well for its creator's PhD research workflow, but:
- Developed on Windows 11. Linux and macOS support is new — cross-platform compatibility has been audited and the core paths are guarded, but edge cases may remain. Issues and PRs for other platforms are especially welcome.
- Setup requires some manual configuration
- Documentation assumes familiarity with Claude Code and Obsidian
- Some features are tightly coupled to the creator's specific setup
- The API surface is not yet stable
That said — this is a framework designed to be extended. If you use Claude Code and want structured workflows, persistent memory, and deep tool integration, this is built for you.
We welcome contributions — bug fixes, new capabilities, workflows, integrations, and documentation. See CONTRIBUTING.md for the full guide.
The fastest way to get started: clone the repo, install, and run /wb-dev. Your agent will orient itself.
For bugs and feature requests, open an issue.