GitHub - mattmezza/humux: 🤖 humux — human multiplexer, your self-hosted personal AI agent

Human Multiplexer — your self-hosted personal AI agent.

humux is a self-hosted personal AI agent that runs in a single Docker container. It multiplexes the channels of your digital life (Telegram, email, calendar, contacts, WhatsApp) into one unified, autonomous intelligence. It remembers, plans, acts, and speaks.

No cloud dependency. No data leaving your server. One `docker compose up` and you have your own AI.

✨ Features

Monorepo structure

Directory	Contents
`humux/`	The agent application (Python, FastAPI, Docker)
`docs/`	Documentation site (Next.js, Fumadocs)
`www/`	Marketing website (HTML, Tailwind CSS v4)

Messaging — Talk to your agent wherever you are

Telegram — full bot with text, voice messages, reactions, inline approvals
WhatsApp — read and send via the wacli CLI; link once and it stays authenticated
Multi-agent groups — several agent-bots share one Telegram group, each replies only when addressed, never loops with other bots
Reply decision — in group chats the agent decides per message whether to reply, with a hard rate cap that guarantees runaway loops end
Steerable mid-turn — redirect a long-running turn by sending a follow-up; it is folded in before the agent's next step (the agent reacts 👀 to confirm), so you don't wait for it to finish on the wrong track
Per-chat settings — control, per Telegram chat, who can trigger an agent and who may DM it
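
The hard rate cap on group replies can be pictured as a sliding-window limiter. The sketch below is illustrative only: `ReplyRateCap`, its parameters, and the window size are hypothetical names chosen for this example, not humux's actual reply gate.

```python
from collections import deque


class ReplyRateCap:
    """Sliding-window cap: at most `max_replies` replies per `window_s` seconds.

    Hypothetical helper for illustration; not humux's real implementation.
    """

    def __init__(self, max_replies: int, window_s: float) -> None:
        self.max_replies = max_replies
        self.window_s = window_s
        self._sent: deque[float] = deque()  # timestamps of recent replies

    def allow(self, now: float) -> bool:
        # Drop timestamps that have fallen out of the window.
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.max_replies:
            return False  # cap reached: stay silent even if a reply was wanted
        self._sent.append(now)
        return True


cap = ReplyRateCap(max_replies=2, window_s=60.0)
decisions = [cap.allow(t) for t in (0.0, 1.0, 2.0, 61.0)]
print(decisions)
```

Because the cap is checked after the per-message decision, two bots that keep addressing each other run out of budget and the exchange stops on its own.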

Agents — Swappable identities, each with its own bot

Each agent has its own character, skill/tool scope, voice, and email/calendar/contacts accounts
Each agent runs its own Telegram bot — several run concurrently as separate contacts
Agents are created and configured through the admin UI, no code needed
Per-agent tool identities — own gh token, own browser profile (#93)

Email — Read, compose, and route

Powered by Himalaya CLI (Rust — fast, stateless, JSON output)
Multi-account: Gmail, Fastmail, iCloud, or any IMAP/SMTP provider
Each agent can own a dedicated mailbox or be granted read/read-write access
Credentials resolve from the encrypted vault — never reach the model's context

Calendar & Contacts — Your schedule, your address book

CalDAV — Google Calendar, iCloud, any CalDAV server
Contacts — CardDAV (Purelymail, iCloud, Fastmail) and Google Contacts
Both bindable per-agent with read / read-write access levels

Memory — Four-tier persistent memory that learns and forgets

Tier	What	How
T1 Lexical	Word-overlap retrieval	Always-on, zero deps
T2 Semantic	Embedding vectors (fastembed, on-device)	Relevance-ranked injection
T3 Forgetting	Importance score + access recency	Cold memories archive automatically
T4 Hygiene	Cluster + merge near-duplicates	Self-healing compaction

Memories are extracted automatically from conversations. The agent both reads and writes them via the sqlite3 CLI, through the same skill system it uses for other tools.
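
To make the T1 tier concrete, here is a minimal sketch of word-overlap retrieval. It assumes Jaccard similarity over lowercase word tokens, which is one common way to score overlap; the README does not say which measure humux uses, and the function names and sample memories are made up for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query: str, memory: str) -> float:
    """Jaccard overlap between query and memory tokens (0.0 to 1.0)."""
    q, m = tokenize(query), tokenize(memory)
    if not q or not m:
        return 0.0
    return len(q & m) / len(q | m)


def retrieve(query: str, memories: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k memories with a non-zero overlap, best first."""
    scored = [(overlap_score(query, m), m) for m in memories]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [m for _, m in ranked[:top_k]]


memories = [
    "Dentist appointment is on Friday at 10am",
    "Prefers oat milk in coffee",
    "Sister's birthday is in March",
]
hits = retrieve("when is the dentist appointment", memories)
print(hits)
```

A lexical tier like this needs no model or index, which matches the "always-on, zero deps" description; the semantic tier would then rerank or extend these candidates.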

Scheduled tasks — Proactive, not just reactive

Cron-based jobs for morning briefings, email checks, memory consolidation
Subagent jobs: delegate recurring work to a named agent
One-shot tasks via Telegram (/jobs) or the admin UI
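
For readers unfamiliar with cron expressions, the sketch below shows how a five-field expression such as `0 7 * * 1-5` (07:00 on weekdays) is matched against a timestamp. It is a simplified stand-alone matcher written for illustration; humux delegates scheduling to APScheduler, and this code does not reproduce that library or every cron feature (for example, it ANDs day-of-month and day-of-week).

```python
from datetime import datetime


def field_matches(spec: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', 'a-b', a number, or a comma list."""
    for part in spec.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            lo, hi = map(int, part.split("-"))
            if lo <= value <= hi:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` satisfies the five-field cron expression."""
    minute, hour, dom, month, dow = expr.split()
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day)
        and field_matches(month, when.month)
        and field_matches(dow, when.isoweekday() % 7)  # cron convention: 0 = Sunday
    )


briefing = "0 7 * * 1-5"  # hypothetical morning-briefing schedule
print(cron_matches(briefing, datetime(2024, 1, 1, 7, 0)))  # a Monday
print(cron_matches(briefing, datetime(2024, 1, 6, 7, 0)))  # a Saturday
```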

Subagents — Delegate subtasks to scoped sub-loops

Spawn a sub-loop under any agent, on demand or scheduled
Scope is a subset of the caller's — inherit-never-widen for tools, skills, secrets, and GitHub repo access
Runs sync (result returned in-turn) or background (distilled summary)
Monitor and cancel from Telegram or admin UI
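
The inherit-never-widen rule amounts to a set intersection on every scope dimension. The following sketch assumes a scope is four sets of names; the `Scope` class and `narrow` function are hypothetical and only illustrate the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope record: what a loop is allowed to touch."""

    tools: frozenset[str]
    skills: frozenset[str]
    secrets: frozenset[str]
    repos: frozenset[str]


def narrow(parent: Scope, requested: Scope) -> Scope:
    """Intersect each dimension so the child never exceeds the parent."""
    return Scope(
        tools=parent.tools & requested.tools,
        skills=parent.skills & requested.skills,
        secrets=parent.secrets & requested.secrets,
        repos=parent.repos & requested.repos,
    )


parent = Scope(
    tools=frozenset({"grep", "read_file", "write_file"}),
    skills=frozenset({"email"}),
    secrets=frozenset({"GH_TOKEN"}),
    repos=frozenset({"mattmezza/humux"}),
)
requested = Scope(
    tools=frozenset({"grep", "run_command"}),  # run_command is not in the parent
    skills=frozenset({"email", "calendar"}),
    secrets=frozenset(),
    repos=frozenset({"mattmezza/humux", "someone/else"}),
)
child = narrow(parent, requested)
print(sorted(child.tools), sorted(child.repos))
```

Anything the caller does not hold is silently dropped from the request, so a subagent cannot gain a tool, secret, or repository by asking for it.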

Voice — Speak to your agent, hear it reply

STT: faster-whisper — local, offline, multi-language
TTS: edge-tts (cloud) or Kokoro 82M (fully offline, multilingual)
Voice marker syntax lets the model request a spoken reply per-turn
Per-agent voice selection

Secrets vault — Encrypted, two-tier, never in context

Vault	Key	Unseals
Infra vault	Machine key (`HUMUX_MASTER_KEY` / `data/master.key`)	At boot, headless — for provider keys, bot tokens
Agent vault	Admin password (envelope encryption)	On login — for website logins, payment keys

Secrets are referenced as `${vault:NAME}` in config and `{{secret:NAME}}` in commands, so the model never sees the value. Bitwarden import and secure-link credential requests are included.
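
A rough sketch of how a `{{secret:NAME}}` placeholder can stay opaque to the model: the model writes the template, and substitution happens only when the command is about to run. The regular expression, the allowed name characters, and the `render_command` helper are assumptions for this example; the real vault lookup and decryption are not shown.

```python
import re

# Assumed placeholder grammar: {{secret:NAME}} with NAME made of letters, digits, underscores.
SECRET_REF = re.compile(r"\{\{secret:([A-Za-z0-9_]+)\}\}")


def render_command(template: str, vault: dict[str, str]) -> str:
    """Replace each placeholder with its value just before execution."""

    def lookup(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in vault:
            raise KeyError(f"unknown secret: {name}")
        return vault[name]

    return SECRET_REF.sub(lookup, template)


# The model only ever sees (and produces) the template string.
template = "curl -H 'Authorization: Bearer {{secret:API_TOKEN}}' https://example.com"
rendered = render_command(template, {"API_TOKEN": "s3cr3t"})  # stand-in for a decrypted vault
print(template)
```

Only `template` would appear in conversation history or logs; `rendered` exists just long enough to be handed to the process.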

Permissions — You're always in control

Glob-pattern rules: ALWAYS / ASK / NEVER
Write actions ask for Telegram approval with context preview
Per-agent tool scoping — an agent can only use what you give it
Per-agent GitHub repo allowlist — an agent can only touch repos you authorize
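
Glob-pattern rules of this kind can be evaluated with Python's standard `fnmatch` module. The sketch below assumes rules are checked in order with the first match winning and that unmatched commands fall back to ASK; both are assumptions for illustration, as are the example patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical rule list: (glob pattern, action). Order matters in this sketch.
RULES = [
    ("git status*", "ALWAYS"),
    ("git push*", "ASK"),
    ("rm -rf*", "NEVER"),
]


def decide(command: str, rules: list[tuple[str, str]], default: str = "ASK") -> str:
    """Return the action of the first rule whose pattern matches the command."""
    for pattern, action in rules:
        if fnmatchcase(command, pattern):
            return action
    return default


for cmd in ("git status --short", "git push origin main", "rm -rf /tmp/x", "curl https://example.com"):
    print(cmd, "->", decide(cmd, RULES))
```

Defaulting to ASK keeps a human in the loop for anything the rules do not anticipate.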

Browser automation — The agent can browse the web

Optional headless Chromium (Playwright) for JS-heavy pages
Self-driving explore mode — an inner LLM loop navigates sites, fills forms, clicks buttons until done
Persistent logged-in profiles (cookies survive between calls)
Per-domain action rules (Allow / Ask / Block)

Web artifacts — Publish pages, dashboards, documents

The agent writes files under `{workspace}/artifacts/<slug>/` with the coding harness
Served as shareable links at `/artifacts/<slug>/` behind a sandbox CSP
Multi-file sites, PDFs, images — anything you can write to disk

Image generation — Visual answers

Optional `generate_image` tool (OpenRouter, fal.ai, or OpenAI)
Reuses your existing LLM API key for OpenRouter/OpenAI
Daily/monthly budget caps
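
A daily/monthly cap can be enforced by checking both totals before each generation. This is a minimal in-memory sketch; the `ImageBudget` class, the cost units, and the cap values are hypothetical, and a real implementation would persist spend rather than keep it in a dict.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ImageBudget:
    """Hypothetical budget tracker with a daily and a monthly cap."""

    daily_cap: float
    monthly_cap: float
    spend: dict[date, float] = field(default_factory=dict)  # cost recorded per day

    def spent_on(self, day: date) -> float:
        return self.spend.get(day, 0.0)

    def spent_in_month(self, day: date) -> float:
        return sum(
            cost for d, cost in self.spend.items()
            if (d.year, d.month) == (day.year, day.month)
        )

    def try_charge(self, day: date, cost: float) -> bool:
        """Record the cost and return True only if both caps still hold."""
        if self.spent_on(day) + cost > self.daily_cap:
            return False
        if self.spent_in_month(day) + cost > self.monthly_cap:
            return False
        self.spend[day] = self.spent_on(day) + cost
        return True


budget = ImageBudget(daily_cap=1.0, monthly_cap=1.5)
day1, day2 = date(2024, 1, 1), date(2024, 1, 2)
results = [budget.try_charge(d, 0.5) for d in (day1, day1, day1, day2, day2)]
print(results)
```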

Coding harness — The agent works on real code

`read_file`, `write_file`, `edit_file`, `list_dir`, `grep`, `run_command_in_dir`
Confined to one configurable workspace directory, with path traversal blocked
`run_command` shares that root, so a cloned repo is readable, editable, and committable in place
Each agent namespaces its files under a `<slug>/` subdirectory
Reads are pre-approved; writes ask for permission
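
Path confinement of this kind is commonly done by resolving the requested path and checking that it still sits under the allowed root. The sketch below uses only `pathlib`; the function name, the workspace location, and the agent slug are hypothetical, and it is not taken from humux's `coding.py`.

```python
from pathlib import Path


def resolve_in_workspace(workspace: Path, agent_slug: str, relative: str) -> Path:
    """Resolve `relative` under the agent's subdirectory or raise PermissionError."""
    root = (workspace / agent_slug).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"path escapes workspace: {relative}")
    return candidate


workspace = Path("/tmp/humux-workspace")  # hypothetical location
safe = resolve_in_workspace(workspace, "my-agent", "repo/README.md")

try:
    resolve_in_workspace(workspace, "my-agent", "../other-agent/notes.md")
    escaped = True
except PermissionError:
    escaped = False

print(safe.name, escaped)
```

Checking after resolution matters: a plain string-prefix test would accept `my-agent/../other-agent`, and an absolute path such as `/etc/passwd` would replace the root entirely when joined.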

Admin UI — Full web dashboard

Configuration, agents, skills editor, memory inspection, job management
Per-agent log streams, filterable by stream / level / time / text
Agent lifecycle control (start/stop/restart)
Setup wizard for first boot
Built with FastAPI + HTMX + Alpine.js + Tailwind CSS v4

🚀 Quick Start

Prerequisites

Docker and Docker Compose
An Anthropic, OpenAI, or DeepSeek API key
A Telegram bot token (optional but recommended)

1. Clone and configure

```bash
git clone https://github.com/mattmezza/mpa.git
cd mpa/humux
cp .env.example .env
cp config.yml.example config.yml
cp character.md.example character.md
```

Edit `.env` with your API keys. Edit `config.yml` to customize the agent name, owner, channels, and scheduled jobs.

2. Run with Docker Compose

```bash
cd mpa/humux
docker compose up -d
```

The admin UI is at http://localhost:8000. On first boot, humux starts in setup mode — a wizard walks you through the initial configuration.

3. Run without Docker (development)

Requires Python 3.14+ and uv.

```bash
cd mpa/humux
make setup       # creates venv, installs deps, copies example configs
make run         # starts the agent
```

4. Chat from the terminal (the CLI channel)

```bash
cd mpa/humux
make cli                 # interactive CLI: type your messages, see the agent think
make cli AGENT=my-agent  # chat as a specific agent
make cli YOLO=1          # auto-approve all permissions (local testing)
```

The CLI is a first-class channel: the same agent, same data, and same tools as Telegram, just at your terminal. It is also reachable remotely over the deploy host's SSH (`ssh -t user@host docker exec -it humux uv run python -m core.cli`), supports resumable sessions (`--session`, `--sessions`, `--rm-session`), and runs concurrently with the live server because all databases run in WAL mode. See the Channels → CLI docs for details.
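
The note about WAL mode is what lets a second process, such as the CLI, open the same SQLite files while the server is writing. The sketch below shows the idea with the standard `sqlite3` module and a throwaway database; the file and table names are made up for the example.

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "history.db"  # hypothetical file name

    writer = sqlite3.connect(db_path)
    # Switch the database to write-ahead logging; the pragma returns the active mode.
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
    writer.commit()

    # The writer starts a transaction and leaves it open.
    writer.execute("INSERT INTO messages (body) VALUES ('hello')")

    # A second connection can still read; it sees the last committed state.
    reader = sqlite3.connect(db_path)
    visible = reader.execute("SELECT COUNT(*) FROM messages").fetchone()[0]

    writer.commit()
    after_commit = reader.execute("SELECT COUNT(*) FROM messages").fetchone()[0]

    reader.close()
    writer.close()

print(mode, visible, after_commit)
```

The journal mode is stored in the database file, so it only needs to be set once per database rather than on every connection.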

🏗️ Architecture

humux follows a Python orchestrator + CLI tools design. Python handles the async LLM loop, the admin web UI, and orchestration. Battle-tested CLI tools handle protocol complexity:

Concern	Tool
LLM	Anthropic Claude, OpenAI, Grok (xAI), Google, DeepSeek, OpenRouter
Email	Himalaya CLI (Rust)
Contacts	Built-in contacts CLI (CardDAV + Google People)
Calendar	python-caldav
WhatsApp	wacli (Go)
Browser	Playwright (Chromium)
Voice STT	faster-whisper (CTranslate2)
Voice TTS	edge-tts / Kokoro 82M
Web search	Tavily
Scheduler	APScheduler
Storage	SQLite (8 databases)
Admin UI	FastAPI + Jinja2 + HTMX + Alpine.js + Tailwind CSS v4

The skill system

Instead of hardcoded integrations, the agent learns to use CLI tools via markdown skill files stored in SQLite. Skills are injected into the LLM's context on-demand during conversations. Adding a new capability means:

Install the CLI tool
Write a markdown file teaching the agent how to use it
Add the command prefix to the executor whitelist

No Python code. No redeploy. The agent picks it up on the next turn.
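
The third step, adding a command prefix to the executor whitelist, can be sketched as a token-wise prefix check. This is an illustrative stand-in rather than the code in `executor.py`: the `ALLOWED_PREFIXES` list and `is_allowed` function are hypothetical, and the sketch assumes commands are split into an argument list instead of being passed to a shell.

```python
import shlex

# Hypothetical whitelist: each entry is a tuple of leading argv tokens.
ALLOWED_PREFIXES = [("himalaya",), ("wacli",), ("sqlite3",), ("git", "status")]


def is_allowed(command: str, prefixes: list[tuple[str, ...]] = ALLOWED_PREFIXES) -> bool:
    """Return True if the command's leading tokens equal a whitelisted prefix."""
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess
    return any(tuple(argv[: len(p)]) == p for p in prefixes)


for cmd in ("sqlite3 data/memory.db 'SELECT 1'", "git status", "git push", "himalayaX --version"):
    print(cmd, "->", is_allowed(cmd))
```

Comparing whole tokens rather than raw string prefixes keeps a binary named `himalayaX` from slipping through a rule meant for `himalaya`.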

Project structure

humux/
├── core/             Core agent modules
│   ├── agent.py          LLM tool-use loop with agentic reasoning
│   ├── llm.py            Multi-provider LLM client abstraction
│   ├── memory.py         Four-tier memory extraction + consolidation
│   ├── config.py         Pydantic config models, YAML/env loader
│   ├── config_store.py   SQLite-backed config store + setup wizard
│   ├── executor.py       CLI command executor with prefix whitelist
│   ├── permissions.py    Glob-pattern permission engine
│   ├── skills.py         SQLite-backed skills store + lazy loading
│   ├── scheduler.py      APScheduler wrapper for cron/one-shot jobs
│   ├── subagents.py      Scoped sub-loop delegation
│   ├── vault.py          Encryption primitives + key management
│   ├── secret_store.py   SQLite-backed secrets vault storage + ACL
│   ├── artifacts.py      Web artifact serving (sandboxed)
│   ├── coding.py         Confined workspace file tools
│   ├── compaction.py     Conversation compaction for session history
│   ├── github_app.py     GitHub App JWT minting + installation tokens
│   ├── history.py        Conversation history persistence
│   ├── imagegen.py       Image generation with budget caps
│   ├── job_store.py      Scheduled job persistence
│   ├── log_streams.py    Per-agent structured log streaming
│   ├── agents.py         Agent definitions, CRUD, markdown parsing
│   ├── embeddings.py     Local (fastembed) + remote embeddings
│   ├── goal_decomposition.py  Task breakdown for complex requests
│   ├── task_reflection.py     Post-task reflection store
│   └── reply_decision.py      Group-chat reply gate
├── channels/         Communication channels
│   └── telegram.py       Telegram bot (text, voice, approvals)
├── api/              Admin web interface
│   ├── admin.py          FastAPI routes + HTMX partials
│   ├── templates/        Jinja2 templates
│   └── static/           Tailwind CSS
├── voice/            Voice pipeline
│   └── pipeline.py       Whisper STT + edge-tts/Kokoro TTS
├── tools/            CLI helper scripts
│   ├── calendar_read.py  CalDAV event reader
│   ├── calendar_write.py CalDAV event creator
│   ├── contacts.py       CardDAV/Google Contacts client
│   ├── browser.py        Headless browser automation (Playwright)
│   └── skills.py         Skills management CLI
├── skills/           Markdown skill files (seed → SQLite)
├── schema/           SQL schema files
└── tests/            Test suite (pytest + asyncio + xdist)
docs/             Documentation site (Next.js + Fumadocs)
www/              Marketing site (humux.dev)

⚙️ Configuration

humux uses a dual-layer config system:

`config.yml` + `.env` — File-based seed config loaded on first boot. Supports `${ENV_VAR}` interpolation and `${vault:NAME}` for secrets.
SQLite config store (`data/config.db`) — Becomes the source of truth after setup. Managed through the admin UI.
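
One way to picture the two layers: the file-based seed is written into the store only when the store is empty, and every later read comes from the store. The sketch below uses an in-memory SQLite database and a hypothetical `settings` table; the actual schema of `data/config.db` is not described in this README.

```python
import sqlite3


def load_settings(db: sqlite3.Connection, seed: dict[str, str]) -> dict[str, str]:
    """Seed the store on first boot, then always read from the store."""
    db.execute("CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT)")
    count = db.execute("SELECT COUNT(*) FROM settings").fetchone()[0]
    if count == 0:
        # First boot: copy the file-based seed into the store.
        db.executemany("INSERT INTO settings (key, value) VALUES (?, ?)", seed.items())
        db.commit()
    return dict(db.execute("SELECT key, value FROM settings"))


db = sqlite3.connect(":memory:")
first_boot = load_settings(db, {"agent_name": "humux"})

# Simulate a change made later through the admin UI.
db.execute("UPDATE settings SET value = ? WHERE key = ?", ("jarvis", "agent_name"))
db.commit()

# Editing the seed file afterwards has no effect: the store wins.
later = load_settings(db, {"agent_name": "edited-in-yaml"})
print(first_boot, later)
```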

Key files

File	Purpose
`.env`	API keys and secrets
`config.yml`	Agent settings, channels, calendar, scheduler jobs
`character.md`	Agent identity, personality, and communication style
`skills/*.md`	Skill documents that teach the agent how to use tools
`agents/*.md`	Optional agent-definition seed files

📦 Tech Stack

Category	Technology
Language	Python 3.14+ with uv
LLM providers	Anthropic Claude, OpenAI, Google Gemini, Grok (xAI), DeepSeek, OpenRouter (any OpenAI-compatible)
Messaging	python-telegram-bot, wacli (WhatsApp)
Persistence	SQLite (8 databases: config, skills, agents, history, memory, reflections, jobs, imagegen)
Admin UI	FastAPI + Jinja2 + HTMX + Alpine.js + Tailwind CSS v4
Voice	faster-whisper (STT), edge-tts / Kokoro 82M (TTS)
Browser	Playwright (Chromium), headless or CDP sidecar
Scheduler	APScheduler
Search	Tavily
Container	Docker (single image, multi-stage)
CI/CD	GitHub Actions (lint, test, build, publish to ghcr.io)

💬 Community & Support

humux.dev — Marketing site
Documentation — Full docs with guides and API reference
GitHub Issues — Bug reports and feature requests
Discussions — Questions and community help

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Run `make lint` and `make test`
Open a pull request

See Development docs for setup instructions.

humux — Human Multiplexer

Built with ❤️ by Matteo Merola
