Construct is an API-first, multi-agent coding assistant designed for superior tool calling performance through its CodeAct execution model. This document provides a comprehensive technical overview of the system's architecture, core components, and design decisions.
- System Overview
- Core Architecture
- Component Deep Dive
- Data Flow
- Security Architecture
- Technology Stack
- Design Principles
- Architectural Decisions
Construct follows a client-server architecture with a daemon-based backend and pluggable frontends. The system is designed to be API-first, meaning every operation is available through the API before any UI is built.
graph TD
subgraph "Client Layer"
CLI[CLI Client]
VSCode[VS Code Extension<br/>Future Clients]
end
subgraph "Daemon Process"
API[API Server<br/>ConnectRPC]
TR[Task Reconciler]
INT[CodeAct Interpreter]
EVT[Event System]
end
subgraph "Storage Layer"
DB[(SQLite Database<br/>Ent ORM)]
KR[System Keyring]
ENC[Encryption Layer]
end
subgraph "External Services"
ANT[Anthropic]
OAI[OpenAI]
GEM[Gemini]
OTH[Other Providers]
end
CLI -->|HTTP/2| API
VSCode -.->|HTTP/2| API
API --> TR
TR --> INT
TR --> EVT
TR --> DB
INT --> |File Ops<br/>Commands| FS[Filesystem]
EVT -->|Real-time Updates| CLI
EVT -.->|Real-time Updates| VSCode
DB --> ENC
ENC --> KR
TR --> ANT
TR --> OAI
TR --> GEM
TR --> OTH
- Client sends API requests over HTTP/2 (ConnectRPC)
- API Layer validates and routes requests
- Task Reconciler orchestrates the conversation flow
- Model Providers generate responses
- CodeAct Interpreter executes tool calls
- Event System streams updates back to clients
- Storage persists all state in SQLite
Construct runs as a user-space daemon with socket activation:
Benefits:
- Fast startup: Daemon starts on first use, not at boot
- Persistent state: Stays running between requests
- Resource efficiency: Only runs when needed
- Client independence: Any client can connect via API
- Security isolation: Credentials live only in daemon process
Platform Integration:
- Linux: systemd user service with socket activation
- macOS: launchd LaunchAgent with socket activation
Every operation is exposed through a well-defined API before any UI is built.
API Services:
AgentService- Manage agent configurationsTaskService- Manage conversation tasksMessageService- Handle messagesModelService- Configure AI modelsModelProviderService- Manage provider credentials
Communication:
- Protocol: ConnectRPC (gRPC-like, HTTP/2-based)
- Format: Protocol Buffers for type safety
- Transport: HTTP/2 for efficiency
- Streaming: Server-sent events for real-time updates
The architecture is client-agnostic. Any client can connect via the API:
- CLI: Current reference implementation (terminal-based)
- VS Code Extension: Experimental, under development
- Custom Clients: Anyone can build clients using the API
Clients only handle presentation - all logic lives in the daemon.
The Task Reconciler is the central orchestration engine that processes conversations. It's implemented as a work queue with concurrent workers.
stateDiagram-v2
[*] --> AwaitInput: No unprocessed messages
AwaitInput --> InvokeModel: User sends message
InvokeModel --> ExecuteTools: Model returns tool calls
InvokeModel --> AwaitInput: Model returns text only
ExecuteTools --> InvokeModel: Tools complete
InvokeModel --> Suspended: User suspends task
Suspended --> InvokeModel: User resumes task
Reconciliation Loop:
-
Compute Task Status
- Analyzes message history
- Determines current phase (AwaitInput, InvokeModel, ExecuteTools, Suspended)
- Identifies next unprocessed message
-
Phase: InvokeModel
- Builds message history for context
- Assembles system prompt with environment info
- Calls model provider with tools
- Streams response chunks to client in real-time
- Persists response and usage statistics
- If tool calls present → transitions to ExecuteTools
- If text only → marks complete, returns to AwaitInput
-
Phase: ExecuteTools
- Extracts tool calls from assistant message
- Executes via CodeAct interpreter (JavaScript)
- Captures output, errors, and statistics
- Persists tool results as system message
- Returns to InvokeModel to continue conversation
-
Phase: AwaitInput
- Idle state waiting for user input
- No processing occurs
-
Phase: Suspended
- User explicitly paused the task
- Cancels any in-flight model invocations
- Waits for resume signal
Concurrency:
- Configurable worker pool (default: 50 concurrent tasks)
- Each task processes independently
- Work queue ensures tasks are processed in order
- Graceful shutdown with 5-second drain timeout
Error Handling:
- Retryable errors (rate limits) → re-queue with backoff
- Provider errors → published to client
- Tool execution errors → captured and sent to model for recovery
Construct uses CodeAct - a JavaScript-based tool execution model that provides superior flexibility over traditional structured tool calls.
graph LR
A[Model Returns<br/>JavaScript Code] --> B[CodeAct Interpreter]
B --> C{Parse & Validate}
C --> D[Execute in<br/>Sobek VM]
D --> E[Tool: read_file]
D --> F[Tool: edit_file]
D --> G[Tool: execute_command]
D --> H[Tool: grep]
D --> I[Other Tools]
E --> J[Capture Results]
F --> J
G --> J
H --> J
I --> J
J --> K[Return to Model]
How It Works:
- Model generates JavaScript code that calls tools as functions
- Interpreter validates the code structure
- Sobek VM executes JavaScript in isolated environment
- Tools are injected as global functions in the VM
- Results are captured and returned to the model
Built-in Tools:
read_file(path, start_line, end_line)- Read file contentscreate_file(path, content)- Create or overwrite fileedit_file(path, diffs)- Apply targeted editslist_files(path, recursive)- List directory contentsgrep(query, path, options)- Fast regex searchfind_file(pattern, path)- Find files by name patternexecute_command(command)- Execute shell commandsprint(value)- Debug output visible only to model
Advantages over Traditional Tool Calling:
- More flexible: Can use loops, conditionals, variables
- Fewer round trips: Process multiple files in one call
- Better error handling: Try/catch blocks in code
- Clearer intent: Code shows exactly what will happen
- Easier debugging: JavaScript is human-readable
Security:
- JavaScript execution sandboxed in Sobek VM
- No access to Node.js modules or
require() - Filesystem access limited to workspace directory
- Command execution can be restricted
- Resource limits enforced (timeouts, memory)
Construct abstracts multiple AI providers behind a common interface, enabling seamless switching and redundancy.
Supported Providers:
- Anthropic - Claude models (Sonnet, Opus, Haiku)
- OpenAI - (coming soon)
- Google - (coming soon)
- xAI - (coming soon)
- AWS Bedrock - (coming soon)
Provider Abstraction:
type ModelProvider interface {
InvokeModel(ctx context.Context,
model string,
prompt string,
messages []*Message,
opts ...InvokeModelOption) (*Message, error)
}All providers implement this interface, allowing:
- Transparent switching between providers
- Unified error handling and retry logic
- Consistent streaming across all providers
- Provider-specific optimizations hidden from caller
Resilience Features:
- Exponential backoff for rate limits (1s → 10s max)
- Circuit breaker pattern (5 failures → 10s cooldown)
- Automatic retries for transient failures
- Timeout handling per provider
Cost Tracking:
- Token usage (input, output, cache reads, cache writes)
- Per-model pricing (stored in database)
- Calculated cost per message
- Aggregated cost per task
Construct uses SQLite with Ent ORM for type-safe database operations.
Schema Overview:
erDiagram
AGENT ||--o{ TASK : "executes"
AGENT }o--|| MODEL : "uses"
MODEL }o--|| MODEL_PROVIDER : "belongs to"
TASK ||--o{ MESSAGE : "contains"
AGENT {
uuid id PK
string name
string description
text instructions
uuid model_id FK
timestamp created_at
timestamp updated_at
}
TASK {
uuid id PK
uuid agent_id FK
string workspace
string phase
string desired_phase
int64 input_tokens
int64 output_tokens
float64 cost
map tool_uses
timestamp created_at
timestamp updated_at
}
MESSAGE {
uuid id PK
uuid task_id FK
string source
json content
json usage
timestamp processed_time
timestamp created_at
}
MODEL {
uuid id PK
string name
uuid provider_id FK
int64 context_window
float64 input_cost
float64 output_cost
float64 cache_read_cost
float64 cache_write_cost
}
MODEL_PROVIDER {
uuid id PK
string name
string type
string api_key_id
timestamp created_at
}
Key Entities:
- Agent: AI agent configuration (prompt, model assignment)
- Task: Conversation/work unit with resource tracking
- Message: Individual messages in conversation (user, assistant, system)
- Model: AI model configuration (costs, context window)
- ModelProvider: Provider credentials and configuration
Message Content Structure:
Messages use a flexible block-based content structure:
{
"blocks": [
{
"kind": "text",
"payload": "Hello, world!"
},
{
"kind": "code_interpreter_call",
"payload": "{\"id\": \"call_123\", \"args\": {...}}"
},
{
"kind": "code_interpreter_result",
"payload": "{\"output\": \"...\", \"error\": \"\"}"
}
]
}Block types:
text- Plain text contentcode_interpreter_call- Tool call blockcode_interpreter_result- Tool execution resultnative_tool_call- Direct tool call (future)native_tool_result- Direct tool result (future)
Why SQLite:
- Zero configuration (no server to manage)
- Single file storage (~/.construct/construct.db)
- Fast for typical workloads (single user, local access)
The event system provides real-time updates to connected clients.
Architecture:
graph TD
TR[Task Reconciler] -->|Publishes| HUB[Event Hub]
HUB -->|Subscribes by Task ID| CLI1[CLI Client 1]
HUB -->|Subscribes by Task ID| CLI2[CLI Client 2]
HUB -->|Subscribes by Task ID| VSC[VS Code Client]
TR -->|Internal Events| BUS[Event Bus]
BUS -->|Task Events| TR
BUS -->|Suspend Events| TR
Event Types:
-
Message Events - Streaming message content
- Partial text chunks (streaming response)
- Complete messages
- Tool calls
- Tool results
- Errors
-
Task Events - Task state changes
- Phase transitions (InvokeModel → ExecuteTools)
- Completion
- Suspension
Streaming Model:
- Clients subscribe to task by ID
- Server sends events as Protocol Buffer messages
- HTTP/2 streaming for efficient delivery
- Multiple clients can subscribe to same task
- Events published only to relevant task subscribers
Internal Event Bus:
Used for coordination within the daemon:
- Task Reconciler publishes task events to trigger re-processing
- Suspend events cancel in-flight operations
- Enables loose coupling between components
Simple CRUD operation:
User: construct agent create coder --model claude-sonnet-4 --prompt "..."
↓
CLI → AgentService.CreateAgent()
↓
API validates inputs (prompt length, model exists)
↓
Save to database (agents table)
↓
Return agent to CLI
↓
CLI displays confirmation
sequenceDiagram
participant User
participant CLI
participant API
participant TR as Task Reconciler
participant Provider
participant INT as Interpreter
User->>CLI: construct new --agent coder
CLI->>API: TaskService.CreateTask()
API->>TR: Queue task
TR->>TR: Status: AwaitInput
TR-->>CLI: Ready for input
User->>CLI: "Write hello world"
CLI->>API: MessageService.CreateMessage()
API->>TR: Queue task
TR->>TR: Compute status → InvokeModel
TR->>Provider: InvokeModel(messages, tools)
Provider-->>TR: Stream response chunks
TR-->>CLI: Stream to user
Provider->>TR: Tool call returned
TR->>TR: Save message, transition to ExecuteTools
TR->>INT: Execute JavaScript
INT->>INT: Run tool calls
INT-->>TR: Tool results
TR->>TR: Save tool results, transition to InvokeModel
TR->>Provider: InvokeModel(with tool results)
Provider-->>TR: Final response
TR-->>CLI: Display to user
TR->>TR: Complete, transition to AwaitInput
Every message goes through a processing lifecycle tracked by timestamps:
-
Created (
created_at)- Message saved to database
processed_timeis NULL- Task Reconciler queued
-
Processing
- Reconciler picks up unprocessed message
- For user messages → invoke model
- For assistant messages → execute tools
-
Processed (
processed_timeset)- Model invocation complete or tools executed
- Timestamp recorded
- Next message becomes active
-
Complete
- All messages processed
- Task returns to AwaitInput phase
graph TD
A[Model Returns JS Code] --> B[Reconciler Extracts Call]
B --> C[CodeAct Interpreter]
C --> D{Validate Code}
D -->|Valid| E[Create Sobek VM]
D -->|Invalid| F[Return Error]
E --> G[Inject Tool Functions]
G --> H[Execute JavaScript]
H --> I{Execution}
I -->|Success| J[Capture Output]
I -->|Error| K[Capture Error]
J --> L[Update Tool Stats]
K --> L
L --> M[Save Tool Results]
M --> N[Queue Task for Model]
N --> O[Model Processes Results]
Tool Statistics:
- Each task tracks tool usage counts
- Updated after every tool execution
- Persisted in task record:
{"read_file": 5, "edit_file": 3} - Used for analytics and debugging
User: construct resume --last
↓
CLI → TaskService.ListTasks(limit=1, order=desc)
↓
API queries most recent task
↓
CLI → TaskService.Subscribe(task_id)
↓
If task in AwaitInput → ready for user message
If task in InvokeModel/ExecuteTools → continue from where it left off
↓
User sends messages, conversation continues
Task Resume Capabilities:
- Full conversation history preserved
- Can switch agents mid-conversation
- Workspace directory remembered
- Resource usage (tokens, cost) tracked
Everything accessible via API before building UI:
- CLI is just one client - not the primary interface
- Custom tooling easy to build
- Automation-friendly for CI/CD integration
- Language-agnostic - build clients in any language
The daemon doesn't care what connects to it:
- CLI: Current reference implementation
- VS Code Extension: In development
- Custom dashboards: Track agent activity
- CI/CD integrations: Automated code review
Clear boundaries between components:
- CLI: User interaction and presentation only
- Daemon: All business logic and state management
- Providers: AI model communication
- Storage: Data persistence
Built for real-world use:
- Structured logging with context
- Metrics (Prometheus-compatible)
- Robust error handling and retries
- Graceful degradation on failures
- Resource tracking (tokens, cost, time)
- Analytics for product improvement (PostHog)
Native integration with platform features:
- System service management (systemd, launchd)
- Socket activation for automatic startup
- Credential storage (OS keyring)
- Sandboxing (systemd-run, process isolation)
-
Language: Go 1.24+
- Fast compilation and execution
- Excellent concurrency with goroutines
- Strong typing and error handling
- Great cross-platform support
-
API Framework: ConnectRPC
- Protocol Buffers for type-safe contracts
- HTTP/2 transport for efficiency
- Streaming support for real-time updates
- Language-agnostic (clients in any language)
-
Database: SQLite 3 + Ent ORM
- Embedded database (no setup required)
- Type-safe, generated Go code
- Automatic schema migrations
- Fast for single-user workloads
-
JavaScript Runtime: Sobek (Goja fork)
- Pure Go implementation of ECMAScript 5.1
- No CGo dependencies
- Sandboxed execution
- Good performance for tool calling
- CLI Framework: Cobra + Bubbletea
- Cobra: Command structure and argument parsing
- Bubbletea: Rich terminal UI, event-driven
- Composable components
- Cross-platform terminal support
-
Service Management:
- systemd (Linux)
- launchd (macOS)
-
Metrics: Prometheus client library
-
Analytics: PostHog for product insights
-
Cryptography: Tink for encryption at rest
-
Keyring: go-keyring for OS credential storage