Update docs for v1.5.0, fix CI disk space, add new fuzz targets to CI

symbibot · symbibot · commit 20c8c3b36f7f · 2026-02-25T12:49:02.000-08:00
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
@@ -9,6 +9,11 @@ jobs:
   test:
     runs-on: ubuntu-latest
     steps:
+      - name: Free disk space
+        run: |
+          sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc /opt/hostedtoolcache/CodeQL
+          sudo docker image prune --all --force
+          df -h /
       - uses: actions/checkout@v4
         with:
           submodules: recursive
@@ -24,6 +29,11 @@ jobs:
   fuzz:
     runs-on: ubuntu-latest
     steps:
+      - name: Free disk space
+        run: |
+          sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc /opt/hostedtoolcache/CodeQL
+          sudo docker image prune --all --force
+          df -h /
       - uses: actions/checkout@v4
         with:
           submodules: recursive
@@ -55,7 +65,13 @@ jobs:
             tool_substitution_detection \
             dsl_structure_aware \
             sse_jsonrpc_parsing \
-            schemapin_keystore_roundtrip; do
+            schemapin_keystore_roundtrip \
+            dsl_evaluator \
+            mattermost_signature_verification \
+            crypto_roundtrip \
+            webhook_verify_generic \
+            api_key_store \
+            policy_evaluation; do
             echo "--- Fuzzing $target (15s) ---"
             cargo fuzz run --fuzz-dir . "$target" -- -max_total_time=15 || exit 1
           done
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,37 @@ All notable changes to the Symbiont project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [Unreleased]
+
+### Added
+
+#### ClawHavoc Scanner Expansion
+- **30 new detection rules** across 7 attack categories: reverse shells (7 rules), credential harvesting (6), network exfiltration (3), process injection (4), privilege escalation (5), symlink/path traversal (2), downloader chains (3)
+- **5-level severity model**: Critical, High, Medium, Warning, Info — scans fail on Critical or High findings (previously only Critical)
+- **`AllowedExecutablesOnly` custom rule type**: Whitelist-based executable filtering for strict sandboxed environments
+
+#### Agent Registry & Lifecycle
+- **Persistent `AgentRegistry`**: Store and retrieve agent metadata with delete and re-execute lifecycle support
+
+#### AGENTS.md Support
+- **Full bidirectional AGENTS.md**: Generate and parse agent manifest files for ecosystem interoperability
+
+#### Performance Verification
+- **Benchmarked performance claims**: Policy evaluation <1ms, ECDSA P-256 <5ms, SchemaPin verification <5ms, 10k agent scheduling <2% CPU overhead
+- **Debug/release threshold split**: Relaxed thresholds for debug builds (unoptimized crypto) while preserving real claims for release
+
+#### Fuzzing Expansion
+- **6 new fuzz targets**: `dsl_evaluator`, `mattermost_signature_verification`, `crypto_roundtrip`, `webhook_verify_generic`, `api_key_store`, `policy_evaluation` — total now 18 targets
+
+#### Infrastructure
+- **Docker build optimization**: cargo-chef caching, split CI/release build profiles, nproc-based parallelism auto-detection
+- **v1.6.0 roadmap**: Agent discovery, remote transport, and DSL A2A primitives planned across 5 phases
+
+### Fixed
+- **cargo-chef cook**: Create stub for `[[example]]` entries not handled by cargo-chef
+- **ECDSA benchmark threshold**: Debug builds no longer fail due to unoptimized crypto exceeding release-only 5ms threshold
+- **SchemaPin verification threshold**: Same debug/release split applied to pinned-key verification benchmark
+
 ## [1.5.0] - 2026-02-22
 
 ### Added
diff --git a/README.md b/README.md
@@ -81,18 +81,21 @@ cargo run -- mcp --port 8080
 * ⏰ **Cron Scheduling** – Persistent SQLite-backed cron engine with jitter, concurrency guards, dead-letter queues, and heartbeat pattern.
 * 🧠 **Persistent Memory** – Markdown-backed agent memory with facts, procedures, learned patterns, daily logs, and retention-based compaction.
 * 🪝 **Webhook Verification** – HMAC-SHA256 and JWT signature verification with GitHub, Stripe, and Slack presets.
-* 🛡️ **Skill Scanning** – ClawHavoc scanner with 10 rules detecting pipe-to-shell, env exfiltration, identity tampering, eval+fetch, and more.
+* 🛡️ **Skill Scanning** – ClawHavoc scanner with 40 rules across 10 attack categories (reverse shells, credential harvesting, process injection, privilege escalation, network exfiltration, and more). 5-level severity model (Critical/High/Medium/Warning/Info) with executable whitelisting.
 * 📈 **Metrics & Telemetry** – File and OTLP metric exporters with composite fan-out and background collection.
 * 🔒 **HTTP Security Hardening** – Loopback-only binding, CORS allow-lists, JWT EdDSA validation, health endpoint separation.
 * 🔒 **Sandboxing** – Tier-1 Docker isolation for agent execution.
 * 🔒 **SchemaPin Security** – Cryptographic verification of tools and schemas.
 * 🔒 **AgentPin Identity** – Domain-anchored cryptographic identity for scheduled agents.
 * 🔒 **Secrets Management** – HashiCorp Vault / OpenBao integration, AES-256-GCM encrypted storage.
 * 🔑 **Per-Agent API Keys** – Argon2-hashed API key authentication with per-IP rate limiting.
+* 🧠 **Context Compaction** – Automatic context window management with tiered compaction: LLM-driven summarization (Tier 1) and truncation (Tier 4). Multi-model token counting (OpenAI, Claude, Gemini, Llama, Mistral, and more).
 * 📊 **RAG Engine** – Vector search (LanceDB embedded) with hybrid semantic + keyword retrieval. Optional Qdrant backend for scaled deployments.
-* 🧩 **MCP Integration** – Native support for Model Context Protocol tools.
+* 🧩 **MCP Integration** – Native support for Model Context Protocol tools, plus Composio SSE integration for external tool access.
 * 📡 **Optional HTTP API** – Feature-gated REST interface for external integration.
 * 📋 **Delivery Routing** – Route scheduled agent output to webhooks, Slack, email, or custom channels.
+* 📝 **AGENTS.md Support** – Bidirectional agent manifest generation and parsing for interoperability.
+* ⚡ **Performance Verified** – Benchmarked claims: policy evaluation <1ms, ECDSA P-256 verification <5ms, 10k agent scheduling with <2% CPU overhead.
 
 ---
 
diff --git a/docs/index.md b/docs/index.md
@@ -183,27 +183,33 @@ graph TB
 
 ## Project Status
 
-### v1.4.0 Production
+### v1.5.0 Production
 
-Symbiont v1.4.0 is the latest stable release, delivering a complete AI agent framework with production-grade capabilities:
+Symbiont v1.5.0 is the latest stable release, delivering a complete AI agent framework with production-grade capabilities:
 
+- **LanceDB Embedded Vector Backend**: Zero-config vector search with no external services required. `VectorDb` trait abstraction with pluggable backends — LanceDB default, Qdrant optional via `vector-qdrant` feature flag
+- **Context Compaction Pipeline**: Automatic context window management with tiered compaction — LLM-driven summarization (Tier 1), truncation (Tier 4), and enterprise tiers for episodic compression and archival. Multi-model token counting covering OpenAI, Claude, Gemini, Llama, Mistral, and more
+- **ClawHavoc Scanner Expansion**: 40 detection rules across 10 attack categories — reverse shells, credential harvesting, process injection, privilege escalation, network exfiltration, symlink escapes, downloader chains, and more. 5-level severity model (Critical/High/Medium/Warning/Info) with `AllowedExecutablesOnly` executable whitelisting
+- **Composio MCP Integration**: Feature-gated SSE-based connection to Composio MCP server for external tool access
 - **Persistent Memory**: Markdown-backed agent memory with facts, procedures, and learned patterns — retention-based compaction, daily logs, DSL `memory` block
 - **Webhook Verification**: `SignatureVerifier` trait with HMAC-SHA256 and JWT implementations, built-in presets for GitHub, Stripe, Slack, and Custom providers — DSL `webhook` block
 - **HTTP Security Hardening**: Loopback-only default binding, explicit CORS origin allow-lists, JWT EdDSA validation, health endpoint separation
-- **Skill Scanning**: ClawHavoc scanner with 10 built-in rules detecting pipe-to-shell, env exfiltration, identity tampering, eval+fetch, base64 obfuscation, and destructive operations
 - **Metrics & Telemetry**: File and OTLP exporters with composite fan-out, background collection, `/metrics` API endpoint
 - **Scheduling Engine**: Cron-based task execution with session isolation, delivery routing, dead-letter queues, jitter, and concurrency limits
 - **Channel Adapters**: Slack (community), Microsoft Teams and Mattermost (enterprise) with webhook verification and HMAC signing
-- **HTTP Input Module**: Webhook server for external integrations with Bearer/JWT auth, rate limiting, and CORS
-- **DSL Extensions**: `schedule`, `channel`, `memory`, and `webhook` blocks for declarative agent configuration
+- **AGENTS.md Support**: Full bidirectional agent manifest generation and parsing for ecosystem interoperability
+- **Performance Benchmarks**: Verified claims — policy evaluation <1ms, ECDSA P-256 verification <5ms, 10k agent scheduling with <2% CPU overhead
+- **18 Fuzz Targets**: Comprehensive fuzzing coverage across DSL parsing, crypto, webhook verification, API keys, policy evaluation, and protocol handling
 - **AgentPin Identity**: Cryptographic agent identity verification via ES256 JWTs with domain-anchored well-known endpoints
 - **Secrets Management**: HashiCorp Vault, encrypted file, and OS keychain backends with runtime provider abstraction
 - **JavaScript & Python SDKs**: Full API clients covering scheduling, channels, webhooks, memory, skills, metrics, and more
 
-### 🔮 Planned Features
+### 🔮 v1.6.0 Roadmap — Agent-to-Agent Communication
+- Agent discovery via `AgentRegistry` and `.well-known` endpoints
+- Remote transport enabling cross-process agent communication
+- DSL A2A primitives: `send_task()`, `send_message()`, `subscribe()`, `discover_agent()`
+- AgentPin-verified AgentCards for cryptographic inter-agent trust
 - Multi-modal RAG support (images, audio, structured data)
-- Cross-agent knowledge synthesis and collaboration
-- Federated agent networks with cross-domain trust
 - Additional channel adapters (Discord, Matrix)
 
 ---
diff --git a/docs/security-model.md b/docs/security-model.md
@@ -592,6 +592,80 @@ impl SecurityAnalyzer {
 
 ---
 
+## ClawHavoc Skill Scanner
+
+The ClawHavoc scanner provides content-level defense for agent skills. Every skill file is scanned line-by-line before loading, and findings at Critical or High severity block the skill from executing.
+
+### Severity Model
+
+| Level | Action | Description |
+|-------|--------|-------------|
+| **Critical** | Fail scan | Active exploitation patterns (reverse shells, code injection) |
+| **High** | Fail scan | Credential theft, privilege escalation, process injection |
+| **Medium** | Warn | Suspicious but potentially legitimate (downloaders, symlinks) |
+| **Warning** | Warn | Low-risk indicators (env file references, chmod) |
+| **Info** | Log | Informational findings |
+
+### Detection Categories (40 Rules)
+
+**Original Defense Rules (10)**
+- `pipe-to-shell`, `wget-pipe-to-shell` — Remote code execution via piped downloads
+- `eval-with-fetch`, `fetch-with-eval` — Code injection via eval + network
+- `base64-decode-exec` — Obfuscated execution via base64 decoding
+- `soul-md-modification`, `memory-md-modification` — Identity tampering
+- `rm-rf-pattern` — Destructive filesystem operations
+- `env-file-reference`, `chmod-777` — Sensitive file access, world-writable permissions
+
+**Reverse Shells (7)** — Critical severity
+- `reverse-shell-bash`, `reverse-shell-nc`, `reverse-shell-ncat`, `reverse-shell-mkfifo`, `reverse-shell-python`, `reverse-shell-perl`, `reverse-shell-ruby`
+
+**Credential Harvesting (6)** — High severity
+- `credential-ssh-keys`, `credential-aws`, `credential-cloud-config`, `credential-browser-cookies`, `credential-keychain`, `credential-etc-shadow`
+
+**Network Exfiltration (3)** — High severity
+- `exfil-dns-tunnel`, `exfil-dev-tcp`, `exfil-nc-outbound`
+
+**Process Injection (4)** — Critical severity
+- `injection-ptrace`, `injection-ld-preload`, `injection-proc-mem`, `injection-gdb-attach`
+
+**Privilege Escalation (5)** — High severity
+- `privesc-sudo`, `privesc-setuid`, `privesc-setcap`, `privesc-chown-root`, `privesc-nsenter`
+
+**Symlink / Path Traversal (2)** — Medium severity
+- `symlink-escape`, `path-traversal-deep`
+
+**Downloader Chains (3)** — Medium severity
+- `downloader-curl-save`, `downloader-wget-save`, `downloader-chmod-exec`
+
+### Executable Whitelisting
+
+The `AllowedExecutablesOnly` rule type restricts which executables an agent skill can invoke:
+
+```rust
+// Only allow these executables — everything else is blocked
+ScanRule::AllowedExecutablesOnly(vec![
+    "python3".into(),
+    "node".into(),
+    "cargo".into(),
+])
+```
+
+### Custom Rules
+
+Domain-specific patterns can be added alongside ClawHavoc defaults:
+
+```rust
+let mut scanner = SkillScanner::new();
+scanner.add_custom_rule(
+    "block-internal-api",
+    r"internal\.corp\.example\.com",
+    ScanSeverity::High,
+    "References to internal API endpoints are not allowed in skills",
+);
+```
+
+---
+
 ## Network Security
 
 ### Secure Communication