Configuration is loaded from node.yaml (node-level settings). Settings are organized by role: shared resources (database, storage), network (grpc, health), coordinator logic (coordinator), and worker pool (worker). Per-table configuration (Kafka routing, feature flags) is managed via the admin gRPC API and stored in the database.
```yaml
database:
  # Primary (RW). Required for coordinator and worker roles.
  primary:
    host: localhost
    port: 3306                  # Required
    database: metalog_metastore
    user: root
    password: ""
    poolSize: 5                 # Max open connections (default: 5)
    poolMinIdle: 2              # Min idle connections (default: 2)
  # Replica (RO, optional). Query and metadata services use this pool when
  # configured. Falls back to primary if omitted. For replica-only deployments
  # (API server), omit primary entirely and configure only replica.
  # replica:
  #   host: replica-db
  #   port: 3306
  #   database: metalog_metastore
  #   user: reader
  #   password: secret
  #   poolSize: 10
  #   poolMinIdle: 2

storage:
  defaultBackend: minio
  backends:
    minio:
      endpoint: http://localhost:9000
      accessKey: minioadmin
      secretKey: minioadmin
      region: ""                # AWS region (optional, for S3)
      bucket: logs              # Storage bucket for this backend
      forcePathStyle: true      # Required for MinIO

health:
  enabled: true
  port: 8081                    # Default: 8081

grpc:
  port: 9090                    # Default: 9090
  ingestion: true               # Requires database.primary
  admin: true                   # Requires database.primary
  query: true                   # Uses database.replica, falls back to primary
  metadata: true                # Uses database.replica, falls back to primary

coordinator:
  enabled: true                 # Enable coordinator subsystem
  nodeIdEnvVar: HOSTNAME        # Env var whose value becomes node_id in _table_assignment;
                                # falls back to os.Hostname()
  reconciliationIntervalSeconds: 60 # Default: 60. How often to reconcile table assignments.
  # HA settings
  haStrategy: heartbeat         # "heartbeat" (default) or "lease"
  heartbeatIntervalSeconds: 30  # Default: 30. Heartbeat mode: liveness write interval.
  deadNodeThresholdSeconds: 180 # Default: 180. Heartbeat mode: seconds before node is declared dead.
  leaseTtlSeconds: 180          # Default: 180. Lease mode: lease duration.
  leaseRenewalIntervalSeconds: 30 # Default: 30. Lease mode: renewal interval (must be < leaseTtlSeconds).
  # Tables are registered via the admin gRPC API (AdminService/RegisterTable).
  # The coordinator discovers assigned tables from the DB on startup and via
  # periodic reconciliation. See docs/guides/configure-tables.md.

# Shared worker pool (claims tasks from all tables).
worker:
  concurrency: 4                # 0 = workers disabled
  # clpBinaryPath: /usr/bin/clp-s   # Auto-resolved from $PATH if omitted
  # clpProcessTimeoutSeconds: 300   # Default: 300 (5 minutes)
```

| Property | Default | Description |
|---|---|---|
| database.primary.host | — | Primary (RW) database hostname. Required for coordinator/worker roles |
| database.primary.port | — | Required. Primary port (1-65535) |
| database.primary.database | — | Primary database name |
| database.primary.user | — | Primary database user |
| database.primary.password | — | Primary database password |
| database.primary.poolSize | 5 | Primary max open connections |
| database.primary.poolMinIdle | 2 | Primary min idle connections |
| database.replica.host | — | Replica (RO) database hostname. Optional; falls back to primary |
| database.replica.port | — | Replica port (1-65535); required when replica is configured |
| database.replica.database | — | Replica database name |
| database.replica.user | — | Replica database user |
| database.replica.password | — | Replica database password |
| database.replica.poolSize | 5 | Replica max open connections |
| database.replica.poolMinIdle | 2 | Replica min idle connections |
| storage.defaultBackend | — | Default storage backend name |
| health.enabled | false | Enable HTTP health endpoint |
| health.port | 8081 | Health endpoint port |
| grpc.port | 9090 | gRPC server port (server starts if any service is enabled) |
| grpc.ingestion | false | Enable ingestion gRPC service (requires database.primary) |
| grpc.admin | false | Enable admin gRPC service (requires database.primary) |
| grpc.query | false | Enable query gRPC service (uses database.replica, falls back to primary) |
| grpc.metadata | false | Enable metadata gRPC service (uses database.replica, falls back to primary) |
| coordinator.enabled | false | Enable coordinator subsystem |
| coordinator.nodeIdEnvVar | HOSTNAME | Env var for node identity |
| coordinator.reconciliationIntervalSeconds | 60 | Table assignment reconciliation interval |
| coordinator.haStrategy | heartbeat | HA mode: heartbeat or lease |
| coordinator.heartbeatIntervalSeconds | 30 | Heartbeat mode: liveness write interval |
| coordinator.deadNodeThresholdSeconds | 180 | Heartbeat mode: seconds before node is declared dead |
| coordinator.leaseTtlSeconds | 180 | Lease mode: lease duration in seconds |
| coordinator.leaseRenewalIntervalSeconds | 30 | Lease mode: renewal interval (must be < TTL) |
| worker.concurrency | 0 (disabled) | Concurrent task goroutines |
| worker.clpBinaryPath | $PATH lookup | Path to clp-s binary. Auto-resolved from $PATH if omitted |
| worker.clpProcessTimeoutSeconds | 300 | CLP process timeout (seconds) |
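The node.yaml comments above note that a replica-only deployment (an API server) omits database.primary entirely. A minimal sketch of such a config, assuming only the read-path services are wanted; hostnames and credentials are placeholders:

```yaml
# Replica-only API server (sketch; values are placeholders)
database:
  replica:
    host: replica-db
    port: 3306
    database: metalog_metastore
    user: reader
    password: secret
    poolSize: 10
    poolMinIdle: 2
grpc:
  port: 9090
  query: true     # read path, served from the replica pool
  metadata: true  # read path, served from the replica pool
health:
  enabled: true
  port: 8081
```

With no database.primary, the ingestion and admin services (which require it) stay disabled, and the coordinator and worker sections are simply omitted.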
These values are not configurable via YAML; they are defined in internal/config/timeouts.go:

| Constant | Value | Description |
|---|---|---|
| DefaultBatchFlushInterval | 1s | Batching writer flush interval |
| DefaultBatchSize | 5000 | UPSERT batch size |
| DefaultTaskClaimBatchSize | 10 | Tasks claimed per worker poll |
| DefaultTaskStaleTimeout | 5m | Processing-task reclaim timeout |
| DefaultTaskCleanupAge | 24h | Completed/failed task cleanup age |
| DefaultWorkerPollInterval | 2s | Worker task poll interval |
| DefaultWorkerBackoffMax | 30s | Max backoff when no tasks are available |
| DefaultPlannerInterval | 60s | Consolidation planner run interval |
| defaultScanInterval | 60s | Retention strategy scan interval |
| defaultDeleteRate | 500/sec | Max storage object deletions per second during retention cleanup |
| DefaultQueryLimit | 1000 | Default total result limit for split queries (0 = unlimited) |
| DefaultQueryPageSize | 1000 | Internal SQL LIMIT per page for streaming queries |
| DefaultDeadlockMaxRetries | 10 | Max retries on SQL deadlock |
All schema setup is automatic. On startup, the node creates the database, initializes the _table* registry and _task_queue, and creates per-table metadata tables. See Architecture Overview for the full startup sequence.
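For multi-node deployments, the coordinator HA settings determine failover behavior. In lease mode the renewal interval must stay below the TTL so a healthy node renews before its lease expires. A sketch of a lease-mode coordinator section, using the documented defaults:

```yaml
coordinator:
  enabled: true
  haStrategy: lease
  leaseTtlSeconds: 180            # lease expires if not renewed within 3 minutes
  leaseRenewalIntervalSeconds: 30 # must be < leaseTtlSeconds; renews well before expiry
```

With these values a node gets five renewal attempts (180 / 30) before its lease can lapse and another node may take over its table assignments.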
- Architecture Overview — System overview and startup sequence
- Quickstart — Setup and first run
- Configure Tables — Table registration and feature flags
- Port Configuration — Customizing infrastructure ports
- Performance Tuning — DSN tuning, batch size impact
- Deploy HA — Node assignment, liveness, and failover
- Scale Workers — Scaling worker nodes and concurrency tuning
- Ingestion Paths — gRPC vs Kafka ingestion, when to use each
- Write Transformers — Transformer interface and registration
- Schema Evolution — Online DDL for new fields