Construct Architecture

Construct is an API-first, multi-agent coding assistant. Its CodeAct execution model, in which the model writes JavaScript that calls tools as functions, is designed to improve tool-calling performance. This document gives a technical overview of the system's architecture, core components, and design decisions.

System Overview
Core Architecture
Component Deep Dive
Data Flow
Security Architecture
Technology Stack
Design Principles
Architectural Decisions

System Overview

Construct follows a client-server architecture with a daemon-based backend and pluggable frontends. The system is designed to be API-first, meaning every operation is available through the API before any UI is built.

graph TD
    subgraph "Client Layer"
        CLI[CLI Client]
        VSCode[VS Code Extension<br/>Future Clients]
    end
    
    subgraph "Daemon Process"
        API[API Server<br/>ConnectRPC]
        TR[Task Reconciler]
        INT[CodeAct Interpreter]
        EVT[Event System]
    end
    
    subgraph "Storage Layer"
        DB[(SQLite Database<br/>Ent ORM)]
        KR[System Keyring]
        ENC[Encryption Layer]
    end
    
    subgraph "External Services"
        ANT[Anthropic]
        OAI[OpenAI]
        GEM[Gemini]
        OTH[Other Providers]
    end
    
    CLI -->|HTTP/2| API
    VSCode -.->|HTTP/2| API
    API --> TR
    TR --> INT
    TR --> EVT
    TR --> DB
    INT --> |File Ops<br/>Commands| FS[Filesystem]
    EVT -->|Real-time Updates| CLI
    EVT -.->|Real-time Updates| VSCode
    DB --> ENC
    ENC --> KR
    TR --> ANT
    TR --> OAI
    TR --> GEM
    TR --> OTH

High-Level Flow

Client sends API requests over HTTP/2 (ConnectRPC)
API Layer validates and routes requests
Task Reconciler orchestrates the conversation flow
Model Providers generate responses
CodeAct Interpreter executes tool calls
Event System streams updates back to clients
Storage persists all state in SQLite

Core Architecture

Daemon-Based Design

Construct runs as a user-space daemon with socket activation.

Benefits:

Fast startup: Daemon starts on first use, not at boot
Persistent state: Stays running between requests
Resource efficiency: Only runs when needed
Client independence: Any client can connect via API
Security isolation: Credentials live only in daemon process

Platform Integration:

Linux: systemd user service with socket activation
macOS: launchd LaunchAgent with socket activation
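
On Linux, systemd hands activated sockets to a service as inherited file descriptors starting at 3 and announces them through the `LISTEN_PID` and `LISTEN_FDS` environment variables. The following is a minimal Go sketch of how a daemon could pick up such a socket, assuming a single listening socket. The function names and the TCP fallback are illustrative and not taken from Construct's code; launchd uses a different mechanism that is not covered here.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// firstSystemdFD is the first descriptor systemd passes to an activated service.
const firstSystemdFD = 3

// activatedFDCount reports how many sockets systemd passed to the process
// with the given pid, based on the LISTEN_PID and LISTEN_FDS values.
func activatedFDCount(pid int, listenPID, listenFDs string) int {
	p, err := strconv.Atoi(listenPID)
	if err != nil || p != pid {
		return 0 // variables are unset or meant for another process
	}
	n, err := strconv.Atoi(listenFDs)
	if err != nil || n < 1 {
		return 0
	}
	return n
}

// listen returns the inherited socket when socket-activated; otherwise it
// opens fallbackAddr itself (a hypothetical fallback for manual runs).
func listen(fallbackAddr string) (net.Listener, error) {
	n := activatedFDCount(os.Getpid(), os.Getenv("LISTEN_PID"), os.Getenv("LISTEN_FDS"))
	if n > 0 {
		f := os.NewFile(uintptr(firstSystemdFD), "systemd-socket")
		defer f.Close() // FileListener works on a copy of the descriptor
		return net.FileListener(f)
	}
	return net.Listen("tcp", fallbackAddr)
}

func main() {
	ln, err := listen("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("listening on", ln.Addr())
}
```

Keeping the environment parsing in a pure function makes the activation check easy to test without a running service manager.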

API-First Architecture

Every operation is exposed through a well-defined API before any UI is built.

API Services:

AgentService - Manage agent configurations
TaskService - Manage conversation tasks
MessageService - Handle messages
ModelService - Configure AI models
ModelProviderService - Manage provider credentials

Communication:

Protocol: ConnectRPC (gRPC-like, HTTP/2-based)
Format: Protocol Buffers for type safety
Transport: HTTP/2 for efficiency
Streaming: Server-streaming RPCs over HTTP/2 for real-time updates

Multi-Client Support

The architecture is client-agnostic. Any client can connect via the API:

CLI: Current reference implementation (terminal-based)
VS Code Extension: Experimental, under development
Custom Clients: Anyone can build clients using the API

Clients only handle presentation - all logic lives in the daemon.

Component Deep Dive

Task Reconciler: The Heart of Construct

The Task Reconciler is the central orchestration engine that processes conversations. It's implemented as a work queue with concurrent workers.

stateDiagram-v2
    [*] --> AwaitInput: No unprocessed messages
    AwaitInput --> InvokeModel: User sends message
    InvokeModel --> ExecuteTools: Model returns tool calls
    InvokeModel --> AwaitInput: Model returns text only
    ExecuteTools --> InvokeModel: Tools complete
    InvokeModel --> Suspended: User suspends task
    Suspended --> InvokeModel: User resumes task

Reconciliation Loop:

Compute Task Status
- Analyzes message history
- Determines current phase (AwaitInput, InvokeModel, ExecuteTools, Suspended)
- Identifies next unprocessed message
Phase: InvokeModel
- Builds message history for context
- Assembles system prompt with environment info
- Calls model provider with tools
- Streams response chunks to client in real-time
- Persists response and usage statistics
- If tool calls present → transitions to ExecuteTools
- If text only → marks complete, returns to AwaitInput
Phase: ExecuteTools
- Extracts tool calls from assistant message
- Executes via CodeAct interpreter (JavaScript)
- Captures output, errors, and statistics
- Persists tool results as system message
- Returns to InvokeModel to continue conversation
Phase: AwaitInput
- Idle state waiting for user input
- No processing occurs
Phase: Suspended
- User explicitly paused the task
- Cancels any in-flight model invocations
- Waits for resume signal
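
The status computation described above can be sketched as a pure function over the message history. This is an illustrative reduction, not Construct's implementation: the `Message` fields and the `suspended` flag are assumptions standing in for the stored message record and the task's desired phase.

```go
package main

import "fmt"

// Phase names follow the reconciliation loop described in this section.
type Phase string

const (
	AwaitInput   Phase = "AwaitInput"
	InvokeModel  Phase = "InvokeModel"
	ExecuteTools Phase = "ExecuteTools"
	Suspended    Phase = "Suspended"
)

// Message is a simplified, hypothetical stand-in for a stored message.
type Message struct {
	Source       string // "user", "assistant", or "system"
	HasToolCalls bool   // true when an assistant message contains tool calls
	Processed    bool   // true once processed_time has been set
}

// computePhase derives the current phase from the message history.
func computePhase(history []Message, suspended bool) Phase {
	if suspended {
		return Suspended
	}
	for _, m := range history {
		if m.Processed {
			continue
		}
		// The first unprocessed message decides what happens next.
		if m.Source == "assistant" {
			if m.HasToolCalls {
				return ExecuteTools
			}
			return AwaitInput // text-only reply: nothing left to do
		}
		return InvokeModel // user input or tool results need a model turn
	}
	return AwaitInput // no unprocessed messages
}

func main() {
	history := []Message{
		{Source: "user", Processed: true},
		{Source: "assistant", HasToolCalls: true},
	}
	fmt.Println(computePhase(history, false))
	fmt.Println(computePhase(history, true))
}
```

Because the phase is recomputed from persisted messages on every pass, a worker can pick up a task at any point without relying on in-memory state.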

Concurrency:

Configurable worker pool (default: 50 concurrent tasks)
Each task processes independently
Work queue ensures tasks are processed in order
Graceful shutdown with 5-second drain timeout
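
A worker pool with a bounded drain on shutdown can be sketched with a channel and a `sync.WaitGroup`. The sketch below is an assumption about the general shape, not the reconciler's actual code: `Pool`, `NewPool`, `Enqueue`, and `Shutdown` are hypothetical names, and the queue buffer size is arbitrary. Tasks are dequeued in FIFO order, but with several workers they may finish in a different order.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Pool runs queued task IDs on a fixed number of workers.
type Pool struct {
	queue chan string
	wg    sync.WaitGroup
}

// NewPool starts the given number of workers, each calling handle per task.
func NewPool(workers int, handle func(taskID string)) *Pool {
	p := &Pool{queue: make(chan string, 1024)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for id := range p.queue {
				handle(id)
			}
		}()
	}
	return p
}

// Enqueue schedules a task for reconciliation.
func (p *Pool) Enqueue(taskID string) { p.queue <- taskID }

// Shutdown stops accepting work and waits up to timeout for the workers to
// drain the queue. It reports whether the drain finished in time.
func (p *Pool) Shutdown(timeout time.Duration) bool {
	close(p.queue)
	done := make(chan struct{})
	go func() { p.wg.Wait(); close(done) }()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// runDemo processes the given number of tasks and returns the drain result
// together with the number of handled tasks.
func runDemo(workers, tasks int) (drained bool, processed int64) {
	var n atomic.Int64
	pool := NewPool(workers, func(string) { n.Add(1) })
	for i := 0; i < tasks; i++ {
		pool.Enqueue(fmt.Sprintf("task-%d", i))
	}
	drained = pool.Shutdown(5 * time.Second)
	return drained, n.Load()
}

func main() {
	drained, processed := runDemo(50, 200)
	fmt.Println("drained:", drained, "processed:", processed)
}
```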

Error Handling:

Retryable errors (rate limits) → re-queue with backoff
Provider errors → published to client
Tool execution errors → captured and sent to model for recovery

CodeAct Tool Calling System

Construct uses CodeAct, a JavaScript-based tool execution model that offers more flexibility than traditional structured tool calls.

graph LR
    A[Model Returns<br/>JavaScript Code] --> B[CodeAct Interpreter]
    B --> C{Parse & Validate}
    C --> D[Execute in<br/>Sobek VM]
    D --> E[Tool: read_file]
    D --> F[Tool: edit_file]
    D --> G[Tool: execute_command]
    D --> H[Tool: grep]
    D --> I[Other Tools]
    E --> J[Capture Results]
    F --> J
    G --> J
    H --> J
    I --> J
    J --> K[Return to Model]

How It Works:

Model generates JavaScript code that calls tools as functions
Interpreter validates the code structure
Sobek VM executes JavaScript in isolated environment
Tools are injected as global functions in the VM
Results are captured and returned to the model

Built-in Tools:

read_file(path, start_line, end_line) - Read file contents
create_file(path, content) - Create or overwrite file
edit_file(path, diffs) - Apply targeted edits
list_files(path, recursive) - List directory contents
grep(query, path, options) - Fast regex search
find_file(pattern, path) - Find files by name pattern
execute_command(command) - Execute shell commands
print(value) - Debug output visible only to model

Advantages over Traditional Tool Calling:

More flexible: Can use loops, conditionals, variables
Fewer round trips: Process multiple files in one call
Better error handling: Try/catch blocks in code
Clearer intent: Code shows exactly what will happen
Easier debugging: JavaScript is human-readable

Security:

JavaScript execution sandboxed in Sobek VM
No access to Node.js modules or require()
Filesystem access limited to workspace directory
Command execution can be restricted
Resource limits enforced (timeouts, memory)

Model Provider System

Construct abstracts multiple AI providers behind a common interface, enabling seamless switching and redundancy.

Supported Providers:

Anthropic - Claude models (Sonnet, Opus, Haiku)
OpenAI - (coming soon)
Google - (coming soon)
xAI - (coming soon)
AWS Bedrock - (coming soon)

Provider Abstraction:

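// ModelProvider is the common interface implemented by every provider.
type ModelProvider interface {
    // InvokeModel calls the named model with a prompt and the message history.
    InvokeModel(
        ctx context.Context,
        model string,
        prompt string,
        messages []*Message,
        opts ...InvokeModelOption,
    ) (*Message, error)
}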

All providers implement this interface, allowing:

Transparent switching between providers
Unified error handling and retry logic
Consistent streaming across all providers
Provider-specific optimizations hidden from caller
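
To illustrate how callers stay independent of the concrete provider, here is a self-contained sketch built around the interface above. `Message`, `InvokeModelOption`, `WithMaxTokens`, `stubProvider`, and `ask` are simplified, hypothetical stand-ins, since the real definitions are not shown in this document.

```go
package main

import (
	"context"
	"fmt"
)

// Message is a simplified stand-in for the real message type.
type Message struct {
	Source string
	Text   string
}

type invokeConfig struct{ maxTokens int }

// InvokeModelOption is modeled here as a functional option (an assumption).
type InvokeModelOption func(*invokeConfig)

// WithMaxTokens is a hypothetical option limiting response length.
func WithMaxTokens(n int) InvokeModelOption {
	return func(c *invokeConfig) { c.maxTokens = n }
}

type ModelProvider interface {
	InvokeModel(ctx context.Context, model string, prompt string,
		messages []*Message, opts ...InvokeModelOption) (*Message, error)
}

// stubProvider answers with a canned reply tagged with its name.
type stubProvider struct{ name string }

func (p stubProvider) InvokeModel(ctx context.Context, model, prompt string,
	messages []*Message, opts ...InvokeModelOption) (*Message, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	cfg := invokeConfig{maxTokens: 1024}
	for _, opt := range opts {
		opt(&cfg)
	}
	text := fmt.Sprintf("[%s/%s] %d message(s), max %d tokens",
		p.name, model, len(messages), cfg.maxTokens)
	return &Message{Source: "assistant", Text: text}, nil
}

// ask depends only on the interface, so any provider can be passed in.
func ask(p ModelProvider, model, question string) (string, error) {
	msgs := []*Message{{Source: "user", Text: question}}
	reply, err := p.InvokeModel(context.Background(), model,
		"You are a coding assistant.", msgs, WithMaxTokens(256))
	if err != nil {
		return "", err
	}
	return reply.Text, nil
}

func main() {
	for _, p := range []ModelProvider{stubProvider{"anthropic"}, stubProvider{"other"}} {
		text, err := ask(p, "example-model", "Write hello world")
		fmt.Println(text, err)
	}
}
```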

Resilience Features:

Exponential backoff for rate limits (1s → 10s max)
Circuit breaker pattern (5 failures → 10s cooldown)
Automatic retries for transient failures
Timeout handling per provider
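
The backoff and circuit breaker figures above can be sketched as follows. The doubling factor, the 0-based attempt numbering, and the `Breaker` type are assumptions made for illustration; only the 1s and 10s bounds, the 5-failure threshold, and the 10s cooldown come from the list above. The breaker is not safe for concurrent use as written.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	baseDelay        = 1 * time.Second
	maxDelay         = 10 * time.Second
	failureThreshold = 5
	cooldown         = 10 * time.Second
)

// backoff returns the wait before retry number attempt (0-based),
// doubling from baseDelay and capping at maxDelay.
func backoff(attempt int) time.Duration {
	d := baseDelay
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// Breaker opens after failureThreshold consecutive failures and rejects
// calls until cooldown has elapsed.
type Breaker struct {
	failures int
	openedAt time.Time
}

// Allow reports whether a call may proceed at time now.
func (b *Breaker) Allow(now time.Time) bool {
	if b.failures < failureThreshold {
		return true
	}
	if now.Sub(b.openedAt) >= cooldown {
		b.failures = 0 // cooldown over: close the breaker again
		return true
	}
	return false
}

// Record updates the breaker with the outcome of a call.
func (b *Breaker) Record(now time.Time, err error) {
	if err == nil {
		b.failures = 0
		return
	}
	b.failures++
	if b.failures == failureThreshold {
		b.openedAt = now
	}
}

// breakerDemo trips the breaker and checks it during and after the cooldown.
func breakerDemo() (rejected, recovered bool) {
	b := &Breaker{}
	start := time.Now()
	for i := 0; i < failureThreshold; i++ {
		b.Record(start, errors.New("provider unavailable"))
	}
	rejected = !b.Allow(start.Add(5 * time.Second))
	recovered = b.Allow(start.Add(cooldown))
	return rejected, recovered
}

func main() {
	for attempt := 0; attempt < 6; attempt++ {
		fmt.Println("attempt", attempt, "wait", backoff(attempt))
	}
	fmt.Println(breakerDemo())
}
```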

Cost Tracking:

Token usage (input, output, cache reads, cache writes)
Per-model pricing (stored in database)
Calculated cost per message
Aggregated cost per task
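
As a worked example of how these figures combine, the sketch below computes a per-message cost from token usage and per-model prices, then sums it per task. The pricing unit is an assumption (USD per million tokens), the prices in `main` are made up, and the type and function names are hypothetical.

```go
package main

import "fmt"

// Usage mirrors the four token counters listed above.
type Usage struct {
	InputTokens, OutputTokens, CacheReadTokens, CacheWriteTokens int64
}

// Pricing holds per-model prices. This sketch assumes USD per million tokens.
type Pricing struct {
	Input, Output, CacheRead, CacheWrite float64
}

const tokensPerUnit = 1_000_000

// messageCost multiplies each token counter by its price.
func messageCost(u Usage, p Pricing) float64 {
	return (float64(u.InputTokens)*p.Input +
		float64(u.OutputTokens)*p.Output +
		float64(u.CacheReadTokens)*p.CacheRead +
		float64(u.CacheWriteTokens)*p.CacheWrite) / tokensPerUnit
}

// taskCost aggregates the cost of all messages in a task.
func taskCost(messages []Usage, p Pricing) float64 {
	total := 0.0
	for _, u := range messages {
		total += messageCost(u, p)
	}
	return total
}

func main() {
	// Hypothetical prices, chosen only to make the arithmetic easy to follow.
	price := Pricing{Input: 3, Output: 15}
	usage := Usage{InputTokens: 1000, OutputTokens: 500}
	// (1000*3 + 500*15) / 1,000,000 = 0.0105
	fmt.Printf("message: $%.4f\n", messageCost(usage, price))
	fmt.Printf("task:    $%.4f\n", taskCost([]Usage{usage, usage}, price))
}
```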

Storage Layer

Construct uses SQLite with Ent ORM for type-safe database operations.

Schema Overview:

erDiagram
    AGENT ||--o{ TASK : "executes"
    AGENT }o--|| MODEL : "uses"
    MODEL }o--|| MODEL_PROVIDER : "belongs to"
    TASK ||--o{ MESSAGE : "contains"
    
    AGENT {
        uuid id PK
        string name
        string description
        text instructions
        uuid model_id FK
        timestamp created_at
        timestamp updated_at
    }
    
    TASK {
        uuid id PK
        uuid agent_id FK
        string workspace
        string phase
        string desired_phase
        int64 input_tokens
        int64 output_tokens
        float64 cost
        map tool_uses
        timestamp created_at
        timestamp updated_at
    }
    
    MESSAGE {
        uuid id PK
        uuid task_id FK
        string source
        json content
        json usage
        timestamp processed_time
        timestamp created_at
    }
    
    MODEL {
        uuid id PK
        string name
        uuid provider_id FK
        int64 context_window
        float64 input_cost
        float64 output_cost
        float64 cache_read_cost
        float64 cache_write_cost
    }
    
    MODEL_PROVIDER {
        uuid id PK
        string name
        string type
        string api_key_id
        timestamp created_at
    }

Key Entities:

Agent: AI agent configuration (prompt, model assignment)
Task: Conversation/work unit with resource tracking
Message: Individual messages in conversation (user, assistant, system)
Model: AI model configuration (costs, context window)
ModelProvider: Provider credentials and configuration

Message Content Structure:

Messages use a flexible block-based content structure:

{
  "blocks": [
    {
      "kind": "text",
      "payload": "Hello, world!"
    },
    {
      "kind": "code_interpreter_call",
      "payload": "{\"id\": \"call_123\", \"args\": {...}}"
    },
    {
      "kind": "code_interpreter_result",
      "payload": "{\"output\": \"...\", \"error\": \"\"}"
    }
  ]
}

Block types:

text - Plain text content
code_interpreter_call - Tool call block
code_interpreter_result - Tool execution result
native_tool_call - Direct tool call (future)
native_tool_result - Direct tool result (future)
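
A client reading this structure could decode it in two steps: first the list of blocks, then the JSON string carried in each payload. The Go sketch below assumes the field names shown in the example above; the `InterpreterResult` shape and the helper names are illustrative guesses rather than Construct's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Block is one entry of the block-based message content.
type Block struct {
	Kind    string `json:"kind"`
	Payload string `json:"payload"`
}

// Content is the top-level message content.
type Content struct {
	Blocks []Block `json:"blocks"`
}

// InterpreterResult is the assumed shape of a code_interpreter_result payload.
type InterpreterResult struct {
	Output string `json:"output"`
	Error  string `json:"error"`
}

const sample = `{
  "blocks": [
    {"kind": "text", "payload": "Hello, world!"},
    {"kind": "code_interpreter_result",
     "payload": "{\"output\": \"\", \"error\": \"file not found\"}"}
  ]
}`

// parseContent decodes the outer block list.
func parseContent(raw string) (Content, error) {
	var c Content
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

// toolErrors returns the error strings of all failed interpreter results.
func toolErrors(c Content) ([]string, error) {
	var errs []string
	for _, b := range c.Blocks {
		if b.Kind != "code_interpreter_result" {
			continue
		}
		var r InterpreterResult
		if err := json.Unmarshal([]byte(b.Payload), &r); err != nil {
			return nil, err
		}
		if r.Error != "" {
			errs = append(errs, r.Error)
		}
	}
	return errs, nil
}

func main() {
	c, err := parseContent(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, b := range c.Blocks {
		fmt.Println("block:", b.Kind)
	}
	fmt.Println(toolErrors(c))
}
```

Keeping the payload as a string lets new block kinds be added without changing the outer schema; only the consumers of that kind need to know its payload format.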

Why SQLite:

Zero configuration (no server to manage)
Single file storage (~/.construct/construct.db)
Fast for typical workloads (single user, local access)

Event System

The event system provides real-time updates to connected clients.

Architecture:

graph TD
    TR[Task Reconciler] -->|Publishes| HUB[Event Hub]
    HUB -->|Subscribes by Task ID| CLI1[CLI Client 1]
    HUB -->|Subscribes by Task ID| CLI2[CLI Client 2]
    HUB -->|Subscribes by Task ID| VSC[VS Code Client]
    
    TR -->|Internal Events| BUS[Event Bus]
    BUS -->|Task Events| TR
    BUS -->|Suspend Events| TR

Event Types:

Message Events - Streaming message content
- Partial text chunks (streaming response)
- Complete messages
- Tool calls
- Tool results
- Errors
Task Events - Task state changes
- Phase transitions (InvokeModel → ExecuteTools)
- Completion
- Suspension

Streaming Model:

Clients subscribe to task by ID
Server sends events as Protocol Buffer messages
HTTP/2 streaming for efficient delivery
Multiple clients can subscribe to same task
Events published only to relevant task subscribers
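
The subscription model above can be approximated in-process with a hub that keeps a list of channels per task ID. This is a simplified sketch under stated assumptions: `Hub`, `Event`, and the buffer size are hypothetical, unsubscribe is omitted, and subscribers whose buffer is full are skipped instead of blocking the publisher.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a simplified stand-in for the Protocol Buffer event messages.
type Event struct {
	TaskID string
	Kind   string // e.g. "message" or "task"
	Data   string
}

// Hub fans events out to the subscribers of each task.
type Hub struct {
	mu   sync.RWMutex
	subs map[string][]chan Event
}

func NewHub() *Hub { return &Hub{subs: make(map[string][]chan Event)} }

// Subscribe registers interest in one task and returns a buffered channel.
func (h *Hub) Subscribe(taskID string) <-chan Event {
	ch := make(chan Event, 16)
	h.mu.Lock()
	h.subs[taskID] = append(h.subs[taskID], ch)
	h.mu.Unlock()
	return ch
}

// Publish delivers ev to the subscribers of ev.TaskID only and returns how
// many received it. Subscribers with a full buffer are skipped.
func (h *Hub) Publish(ev Event) int {
	h.mu.RLock()
	defer h.mu.RUnlock()
	delivered := 0
	for _, ch := range h.subs[ev.TaskID] {
		select {
		case ch <- ev:
			delivered++
		default:
		}
	}
	return delivered
}

func main() {
	hub := NewHub()
	cli := hub.Subscribe("task-a")
	editor := hub.Subscribe("task-a")
	other := hub.Subscribe("task-b")

	n := hub.Publish(Event{TaskID: "task-a", Kind: "message", Data: "partial chunk"})
	fmt.Println("delivered to", n, "subscribers")
	fmt.Println((<-cli).Data, "/", (<-editor).Data, "/ pending for task-b:", len(other))
}
```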

Internal Event Bus:

Used for coordination within the daemon:

Task Reconciler publishes task events to trigger re-processing
Suspend events cancel in-flight operations
Enables loose coupling between components

Data Flow

Creating an Agent

Creating an agent is a simple CRUD operation:

User: construct agent create coder --model claude-sonnet-4 --prompt "..."
    ↓
CLI → AgentService.CreateAgent()
    ↓
API validates inputs (prompt length, model exists)
    ↓
Save to database (agents table)
    ↓
Return agent to CLI
    ↓
CLI displays confirmation

Starting a Conversation

sequenceDiagram
    participant User
    participant CLI
    participant API
    participant TR as Task Reconciler
    participant Provider
    participant INT as Interpreter
    
    User->>CLI: construct new --agent coder
    CLI->>API: TaskService.CreateTask()
    API->>TR: Queue task
    TR->>TR: Status: AwaitInput
    TR-->>CLI: Ready for input
    
    User->>CLI: "Write hello world"
    CLI->>API: MessageService.CreateMessage()
    API->>TR: Queue task
    
    TR->>TR: Compute status → InvokeModel
    TR->>Provider: InvokeModel(messages, tools)
    Provider-->>TR: Stream response chunks
    TR-->>CLI: Stream to user
    
    Provider->>TR: Tool call returned
    TR->>TR: Save message, transition to ExecuteTools
    TR->>INT: Execute JavaScript
    INT->>INT: Run tool calls
    INT-->>TR: Tool results
    
    TR->>TR: Save tool results, transition to InvokeModel
    TR->>Provider: InvokeModel(with tool results)
    Provider-->>TR: Final response
    TR-->>CLI: Display to user
    TR->>TR: Complete, transition to AwaitInput

Message Processing Lifecycle

Every message goes through a processing lifecycle tracked by timestamps:

Created (created_at)
- Message saved to database
- processed_time is NULL
- Task Reconciler queued
Processing
- Reconciler picks up unprocessed message
- For user messages → invoke model
- For assistant messages with tool calls → execute tools
Processed (processed_time set)
- Model invocation complete or tools executed
- Timestamp recorded
- Next message becomes active
Complete
- All messages processed
- Task returns to AwaitInput phase

Tool Execution Flow

graph TD
    A[Model Returns JS Code] --> B[Reconciler Extracts Call]
    B --> C[CodeAct Interpreter]
    C --> D{Validate Code}
    D -->|Valid| E[Create Sobek VM]
    D -->|Invalid| F[Return Error]
    E --> G[Inject Tool Functions]
    G --> H[Execute JavaScript]
    H --> I{Execution}
    I -->|Success| J[Capture Output]
    I -->|Error| K[Capture Error]
    J --> L[Update Tool Stats]
    K --> L
    L --> M[Save Tool Results]
    M --> N[Queue Task for Model]
    N --> O[Model Processes Results]

Tool Statistics:

Each task tracks tool usage counts
Updated after every tool execution
Persisted in task record: {"read_file": 5, "edit_file": 3}
Used for analytics and debugging
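
One straightforward way to collect such counts is to wrap each tool function so that every call increments a per-name counter. The sketch below assumes a simplified `ToolFunc` signature and hypothetical stub tools; it is not safe for concurrent use and is meant only to show how the persisted map shape arises.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolFunc is a simplified tool signature for this sketch.
type ToolFunc func(args ...string) (string, error)

// ToolStats counts invocations per tool name.
type ToolStats map[string]int64

// counted wraps a tool so that every call increments its counter.
func (s ToolStats) counted(name string, fn ToolFunc) ToolFunc {
	return func(args ...string) (string, error) {
		s[name]++
		return fn(args...)
	}
}

// demoStats calls two stub tools and returns the stats encoded as JSON.
func demoStats() string {
	stats := ToolStats{}
	readFile := stats.counted("read_file", func(args ...string) (string, error) {
		return "file contents", nil // stub: no real filesystem access
	})
	editFile := stats.counted("edit_file", func(args ...string) (string, error) {
		return "ok", nil // stub: no real edit
	})
	for i := 0; i < 5; i++ {
		readFile("main.go")
	}
	for i := 0; i < 3; i++ {
		editFile("main.go", "diff")
	}
	encoded, err := json.Marshal(stats)
	if err != nil {
		return ""
	}
	return string(encoded)
}

func main() {
	fmt.Println(demoStats())
}
```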

Resuming a Task

User: construct resume --last
    ↓
CLI → TaskService.ListTasks(limit=1, order=desc)
    ↓
API queries most recent task
    ↓
CLI → TaskService.Subscribe(task_id)
    ↓
If task in AwaitInput → ready for user message
If task in InvokeModel/ExecuteTools → continue from where it left off
    ↓
User sends messages, conversation continues

Task Resume Capabilities:

Full conversation history preserved
Can switch agents mid-conversation
Workspace directory remembered
Resource usage (tokens, cost) tracked

Design Principles

1. API-First

Everything accessible via API before building UI:

CLI is just one client - not the primary interface
Custom tooling easy to build
Automation-friendly for CI/CD integration
Language-agnostic - build clients in any language

2. Client-Agnostic

The daemon doesn't care what connects to it:

CLI: Current reference implementation
VS Code Extension: In development
Custom dashboards: Track agent activity
CI/CD integrations: Automated code review

3. Separation of Concerns

Clear boundaries between components:

CLI: User interaction and presentation only
Daemon: All business logic and state management
Providers: AI model communication
Storage: Data persistence

4. Production Ready

Built for real-world use:

Structured logging with context
Metrics (Prometheus-compatible)
Robust error handling and retries
Graceful degradation on failures
Resource tracking (tokens, cost, time)
Analytics for product improvement (PostHog)

5. Cross-Platform

Native integration with platform features:

System service management (systemd, launchd)
Socket activation for automatic startup
Credential storage (OS keyring)
Sandboxing (systemd-run, process isolation)

Technology Stack

Backend

Language: Go 1.24+
- Fast compilation and execution
- Excellent concurrency with goroutines
- Strong typing and error handling
- Great cross-platform support
API Framework: ConnectRPC
- Protocol Buffers for type-safe contracts
- HTTP/2 transport for efficiency
- Streaming support for real-time updates
- Language-agnostic (clients in any language)
Database: SQLite 3 + Ent ORM
- Embedded database (no setup required)
- Type-safe, generated Go code
- Automatic schema migrations
- Fast for single-user workloads
JavaScript Runtime: Sobek (Goja fork)
- Pure Go implementation of ECMAScript 5.1 with support for many newer language features
- No CGo dependencies
- Sandboxed execution
- Good performance for tool calling

Frontend

CLI Framework: Cobra + Bubbletea
- Cobra: Command structure and argument parsing
- Bubbletea: Rich terminal UI, event-driven
- Composable components
- Cross-platform terminal support

Infrastructure

Service Management:
- systemd (Linux)
- launchd (macOS)
Metrics: Prometheus client library
Analytics: PostHog for product insights
Cryptography: Tink for encryption at rest
Keyring: go-keyring for OS credential storage
