
feat(ledger): add v3 reconciler with Raft StatefulSet support#434

Open
gfyrag wants to merge 9 commits into main from feat/ledger-v3-support

Conversation

@gfyrag
Contributor

@gfyrag gfyrag commented Mar 25, 2026

Summary

  • Add version-branched reconciler for Ledger v3 (Raft consensus + Pebble embedded storage)
  • v3 deploys a StatefulSet (instead of Deployment) with a headless service for Raft peer discovery
  • Gateway compatibility: ClusterIP service maps port 8080 → container port 9000
  • All v3-specific config (replicas, PVC sizes, Pebble/Raft tunables) driven by Settings CRD — no changes to the Ledger CRD type

What's new

| Resource | Purpose |
| --- | --- |
| `StatefulSet/ledger` | Raft cluster nodes with OrderedReady policy |
| `Service/ledger-raft` (headless) | Peer DNS discovery for Raft |
| `Service/ledger` (ClusterIP) | Gateway-facing, port 8080 → 9000 |
| 3 PVCs per pod | `wal` (5Gi), `data` (10Gi), `cold-cache` (10Gi) |

Settings keys

| Key | Default | Description |
| --- | --- | --- |
| `ledger.v3.replicas` | `3` | Raft node count (must be odd) |
| `ledger.v3.cluster-id` | `"default"` | Cluster ID |
| `ledger.v3.persistence.{wal,data,cold-cache}.size` | `5Gi` / `10Gi` / `10Gi` | PVC sizes |
| `ledger.v3.persistence.{wal,data,cold-cache}.storage-class` | `""` | Storage classes |
| `ledger.v3.pebble.*` | `""` | Pebble tunables |
| `ledger.v3.raft.*` | `""` | Raft tunables |
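
The odd-replica constraint on `ledger.v3.replicas` follows from Raft quorum math: an even node count raises the majority threshold without tolerating any additional failures. A minimal sketch of the check, using a hypothetical `validateReplicas` helper rather than the reconciler's actual validation code:

```go
package main

import "fmt"

// validateReplicas enforces the odd-quorum rule: Raft needs a strict majority
// of voters, so replica counts must be positive and odd (1, 3, 5, ...).
// Hypothetical helper; the operator's real validation lives in v3.go.
func validateReplicas(n int) error {
	if n < 1 || n%2 == 0 {
		return fmt.Errorf("ledger.v3.replicas must be a positive odd number, got %d", n)
	}
	return nil
}

func main() {
	for _, n := range []int{1, 3, 4} {
		fmt.Printf("replicas=%d -> %v\n", n, validateReplicas(n))
	}
}
```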

Version gate

isV3 := !semver.IsValid(version) || semver.Compare(version, "v3.0.0-alpha") >= 0

v2 code path is completely untouched.
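
A later commit in this PR tightens this gate to `semver.Major(version) == "v3"`, so that invalid version strings fall through to the v2 path rather than being treated as v3 (as the line above would). A dependency-free sketch of that final gate — `major` here is a simplified stand-in for `golang.org/x/mod/semver.Major`, not the real implementation:

```go
package main

import (
	"fmt"
	"strings"
)

// major is a simplified stand-in for golang.org/x/mod/semver.Major: it returns
// "v<N>" for a version like "v3.0.0-alpha", or "" for strings that are not
// "v"-prefixed semver-like versions.
func major(v string) string {
	if !strings.HasPrefix(v, "v") {
		return ""
	}
	rest := v[1:]
	if i := strings.IndexAny(rest, ".-+"); i >= 0 {
		rest = rest[:i]
	}
	if rest == "" {
		return ""
	}
	for _, r := range rest {
		if r < '0' || r > '9' {
			return ""
		}
	}
	return "v" + rest
}

// isV3 mirrors the final version gate: only versions whose major is v3 take
// the v3 reconcile path; anything else (including invalid strings) stays on v2.
func isV3(version string) bool {
	return major(version) == "v3"
}

func main() {
	for _, v := range []string{"v3.0.0", "v3.0.0-alpha", "v2.2.1", "latest"} {
		fmt.Printf("%s -> v3 path: %v\n", v, isV3(v))
	}
}
```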

Test plan

  • go build ./... and go vet ./... pass
  • Create a Ledger CR with version: v3.0.0 in a Stack, verify StatefulSet, headless Service, ClusterIP Service, and PVCs are created
  • Verify gateway routes traffic to ledger via port 8080
  • Verify existing v2 Ledger CRs still reconcile correctly (no regression)
  • Test with odd replica counts (1, 3, 5) and verify even counts are rejected

🤖 Generated with Claude Code

Ledger v3 uses Raft consensus with Pebble embedded storage instead of
PostgreSQL. This adds a version-branched reconciler that creates a
StatefulSet (with headless service for peer discovery) when the ledger
version is >= v3.0.0-alpha.

Key changes:
- Version gate in Reconcile(): v3+ skips Database/migrations entirely
- StatefulSet with OrderedReady policy, 3 PVCs (wal, data, cold-cache)
- Headless service (ledger-raft) for Raft peer DNS discovery
- ClusterIP service mapping port 8080→9000 for gateway compatibility
- Pod entrypoint script: computes node-id from ordinal, bootstrap/join
- Settings-driven: replicas, PVC sizes, storage classes, Pebble/Raft tunables
- RBAC marker for apps/statefulsets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gfyrag gfyrag requested a review from a team as a code owner March 25, 2026 15:58
@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

When a Stack's semver major is v3, reconciliation delegates to a new v3 flow that resolves a v3 image, creates a Gateway and a headless Raft Service, and provisions a StatefulSet with per-pod PVCs, probes, env/command wiring, and a mirror-provisioning Job; RBAC and owner metadata are updated to include StatefulSets.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Version-gated reconciliation & RBAC (`internal/resources/ledgers/init.go`) | Reconcile no longer always finishes after install; it parses the `modules.ledger.v3-mirror` setting and may call `reconcileV3(...)`. Registers `apps/v1.StatefulSet` as an owner for `v1beta1.Ledger` and adds `apps/statefulsets` kubebuilder RBAC verbs. |
| V3 reconciliation implementation (`internal/resources/ledgers/v3.go`) | New v3 reconciler: resolves the v3 image, creates the Gateway/health API, creates the headless Raft Service `ledger-raft`, validates odd replica counts, builds the `ledger` StatefulSet with OrderedReady management, three PVC claim templates (`wal`, `data`, `cold-cache`), a pod template (ports, probes, env, mounts, preStop), bootstrap/join command logic, and a Job to provision mirror ledgers. |
| Tests: Ledger v3 controller (`internal/tests/ledger_v3_controller_test.go`) | New Ginkgo/Gomega tests verifying v3 behavior: StatefulSet ownership/config (3 replicas by default, OrderedReady, service name), PVC templates, container ports/probes/lifecycle/env, a headless Service with `ClusterIP: None` and `PublishNotReadyAddresses`, the GatewayHTTPAPI health endpoint, and settings-driven overrides. |
| Docs: Ledger v3 & settings reference (`docs/04-Modules/03-Ledger.md`, `docs/09-Configuration reference/01-Settings.md`) | Adds Ledger v3 mirror docs and a Settings reference: deployment architecture (headless Service, StatefulSet, PVCs, mirror Job), settings format (`<v3-image-tag>:ledger1,ledger2,...`), v3 config keys (replicas/cluster-id, PVC sizes/storage class, Pebble and Raft tuning), and an odd-quorum note. |
| CLI tooling: kubectl-stacks additions (`tools/kubectl-stacks/*.go`) | Adds `apiextensions.go`, `create.go`, `enable_module.go`, and wires new Cobra subcommands (`create`, `enable-module`) into the root command; introduces an unstructured negotiator and CRD discovery/creation helpers. |
| Build / packaging updates (`.gitignore`, `Justfile`, `helm/*/.helmignore`) | Adds ignore patterns for archives/dist; new `install-kubectl-stacks` Justfile recipe to build kubectl-stacks; Helm `.helmignore` updated to exclude `./*.tgz`. |

Sequence Diagram

sequenceDiagram
    participant User as User/Controller
    participant Init as Ledger Reconciler\n(init.go)
    participant V3 as V3 Reconciler\n(v3.go)
    participant K8s as Kubernetes API
    participant DB as Postgres (v2 DB)

    User->>Init: Trigger reconciliation
    Init->>Init: Parse Stack version & settings
    alt v3 path
        Init->>V3: Delegate to reconcileV3(settings)
        V3->>K8s: Resolve image/config & create Gateway
        K8s-->>V3: Gateway created
        V3->>K8s: Create/Update headless Service (ledger-raft)
        K8s-->>V3: Service applied
        V3->>K8s: Create/Update StatefulSet (PVCs, pod template, probes)
        K8s-->>V3: StatefulSet applied
        alt mirror provisioning configured
            V3->>DB: Read v2 DB connection info
            DB-->>V3: Postgres env/DSN
            V3->>K8s: Create Job to provision mirror ledgers
            K8s-->>V3: Job created
        end
        V3-->>Init: v3 reconcile result
    else v2 path
        Init->>K8s: Proceed with existing v2 reconciliation
        K8s-->>Init: v2 resources reconciled
    end
    Init-->>User: Reconciliation complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through configs, tags in paw,
Rafted peers in tidy law;
PVC burrows, probes that sing,
A mirror job to make new things.
Nibble, stitch — v3 wakes with a hop.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.36%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding a v3 reconciler with Raft StatefulSet support. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, covering the v3 reconciler implementation, resource architecture, settings, and version gating. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Document all v3-specific Settings keys in both:
- Settings reference table (01-Settings.md)
- Ledger module page (03-Ledger.md) with architecture overview,
  YAML examples, and tables for persistence/Pebble/Raft tunables

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/resources/ledgers/init.go`:
- Around line 44-47: The early return when isV3 is true skips v2 cleanup; before
calling reconcileV3(ctx, stack, ledger, version) invoke a v2-cleanup routine
(e.g., cleanupV2Resources or similar) that explicitly deletes v2-only objects:
the old v2 StatefulSet(s), Services, the legacy "ledger" and "ledger-worker"
workloads, and the reindex CronJob; place this call immediately before the
return in the isV3 branch so reconcileV3 can assume v2 resources are removed
(use existing stack/ledger helpers or add a small helper function to perform the
deletes).

In `@internal/resources/ledgers/v3.go`:
- Around line 270-273: The addresses for ADVERTISE_ADDR and BOOTSTRAP_ADDR
currently hardcode the cluster domain (".cluster.local"); change them to not
include the fixed domain and instead either append only the ".svc" suffix or
read a configurable env var (e.g., CLUSTER_DOMAIN or KUBE_DNS_DOMAIN) and use
that when building the hostnames. Update the two format strings that construct
ADVERTISE_ADDR and BOOTSTRAP_ADDR (which use headlessSvc, POD_NAME,
POD_NAMESPACE, v3PortRaft, v3PortGRPC) to omit ".cluster.local" or use the env
var, and apply the same fix to other occurrences (e.g., the similar code in
internal/resources/ledgers/reindex.go).
- Around line 106-139: The StatefulSet reconcile currently rebuilds and sets
spec.VolumeClaimTemplates every run (via buildV3VolumeClaimTemplates and the
core.CreateOrUpdate for appsv1.StatefulSet), but Kubernetes treats
volumeClaimTemplates as immutable and will reject updates; remove volume claim
template reconciliation or make persistence settings create-only: stop updating
spec.VolumeClaimTemplates in the CreateOrUpdate callback (leave existing
templates untouched if the StatefulSet already exists) and/or stop watching
ledger.v3.persistence in internal/resources/ledgers/init.go so changes to
ledger.v3.persistence are treated as create-time-only, or alternatively
implement explicit PVC resize/migration logic outside the StatefulSet
reconcilation flow if you need to change storage after creation.
- Around line 308-327: The code currently uses resource.MustParse(sizeStr) when
building the PersistentVolumeClaim in the function that reads sizeStr from
settings.GetStringOrDefault; replace this with resource.ParseQuantity(sizeStr),
check the returned error, and return that error (or a wrapped validation error)
instead of allowing a panic, so the PVC construction
(corev1.PersistentVolumeClaim / corev1.PersistentVolumeClaimSpec) only proceeds
when the quantity is valid.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0e0d012e-99af-42d6-9146-d5029484c003

📥 Commits

Reviewing files that changed from the base of the PR and between fafb30a and 9ca91be.

📒 Files selected for processing (2)
  • internal/resources/ledgers/init.go
  • internal/resources/ledgers/v3.go

Comment on lines +44 to +47
isV3 := !semver.IsValid(version) || semver.Compare(version, "v3.0.0-alpha") >= 0
if isV3 {
	return reconcileV3(ctx, stack, ledger, version)
}


⚠️ Potential issue | 🟠 Major

Clean up v2-only resources before handing off to the v3 reconciler.

This early return skips every v2-side delete path. On an in-place v2→v3 upgrade, reconcileV3 only creates the new StatefulSet/Services, so old resources like the v2 ledger / ledger-worker workloads and the reindex CronJob can survive the version flip and keep running.

Suggested direction
 isV3 := !semver.IsValid(version) || semver.Compare(version, "v3.0.0-alpha") >= 0
 if isV3 {
+	if err := cleanupV2Resources(ctx, stack, ledger); err != nil {
+		return err
+	}
 	return reconcileV3(ctx, stack, ledger, version)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/init.go` around lines 44 - 47, The early return
when isV3 is true skips v2 cleanup; before calling reconcileV3(ctx, stack,
ledger, version) invoke a v2-cleanup routine (e.g., cleanupV2Resources or
similar) that explicitly deletes v2-only objects: the old v2 StatefulSet(s),
Services, the legacy "ledger" and "ledger-worker" workloads, and the reindex
CronJob; place this call immediately before the return in the isV3 branch so
reconcileV3 can assume v2 resources are removed (use existing stack/ledger
helpers or add a small helper function to perform the deletes).

Comment on lines +270 to +273
fmt.Sprintf(`ADVERTISE_ADDR="${POD_NAME}.%s.${POD_NAMESPACE}.svc.cluster.local:%d"`, headlessSvc, v3PortRaft),
// First pod (ordinal 0) bootstraps if no checkpoint exists yet, otherwise normal start.
// Other pods join pod-0.
fmt.Sprintf(`BOOTSTRAP_ADDR="ledger-0.%s.${POD_NAMESPACE}.svc.cluster.local:%d"`, headlessSvc, v3PortGRPC),


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, examine the actual code file and surrounding context
cat -n internal/resources/ledgers/v3.go | sed -n '260,280p'

Repository: formancehq/operator

Length of output: 1174


🏁 Script executed:

# Search for related DNS/domain configurations in the codebase
rg -i "cluster\.local|cluster-domain|cluster.domain" --type go

Repository: formancehq/operator

Length of output: 1079


🏁 Script executed:

# Check if there are any existing configurable domain patterns
rg -i "domain|dns|hostname" internal/resources/ledgers/ -A 2 -B 2 --type go | head -50

Repository: formancehq/operator

Length of output: 518


🌐 Web query:

Kubernetes kubelet --cluster-domain custom DNS suffix configuration

💡 Result:

The Kubernetes kubelet supports configuring a custom DNS cluster domain suffix using the --cluster-domain command-line flag or, preferably, the clusterDomain field in the KubeletConfiguration file. Command-line flag (deprecated but functional): kubelet --cluster-domain=your.custom.domain This configures the kubelet to append the specified domain to container DNS search paths, enabling short names like my-svc.my-ns to resolve as my-svc.my-ns.your.custom.domain. Default is typically cluster.local in most distributions. Recommended configuration file method (use --config=/path/to/config.yaml): apiVersion: kubelet.config.k8s.io/v1beta1 kind: KubeletConfiguration clusterDomain: your.custom.domain clusterDNS: - 10.96.0.10 # Cluster DNS IP, e.g., CoreDNS service IP The kubelet generates /etc/resolv.conf in pods with search domains including the pod namespace, svc, and clusterDomain (e.g., search myns.svc.your.custom.domain svc.your.custom.domain your.custom.domain). With kubeadm, include KubeletConfiguration in the init/join config file, or set ClusterConfiguration.networking.dnsDomain which propagates to kubelet. Restart kubelet after changes. This affects pod/service DNS resolution cluster-wide; all kubelets must use the same value for consistency.



Don't hardcode cluster.local into Raft peer addresses.

Kubernetes allows operators to configure a custom DNS cluster domain via the kubelet --cluster-domain flag (default is cluster.local). Building ADVERTISE_ADDR and BOOTSTRAP_ADDR as pod.svc.cluster.local will fail peer discovery on clusters using non-default domains. Make the cluster domain configurable or use namespace-local service names that don't rely on the domain suffix.

Note: This pattern is repeated elsewhere in the codebase (e.g., internal/resources/ledgers/reindex.go) and should be addressed consistently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 270 - 273, The addresses for
ADVERTISE_ADDR and BOOTSTRAP_ADDR currently hardcode the cluster domain
(".cluster.local"); change them to not include the fixed domain and instead
either append only the ".svc" suffix or read a configurable env var (e.g.,
CLUSTER_DOMAIN or KUBE_DNS_DOMAIN) and use that when building the hostnames.
Update the two format strings that construct ADVERTISE_ADDR and BOOTSTRAP_ADDR
(which use headlessSvc, POD_NAME, POD_NAMESPACE, v3PortRaft, v3PortGRPC) to omit
".cluster.local" or use the env var, and apply the same fix to other occurrences
(e.g., the similar code in internal/resources/ledgers/reindex.go).

Comment on lines +308 to +327
sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
if err != nil {
	return nil, err
}

storageClass, err := settings.GetStringOrEmpty(ctx, stackName, strings.Split(s.storageClassKey, ".")...)
if err != nil {
	return nil, err
}

pvc := corev1.PersistentVolumeClaim{
	ObjectMeta: metav1.ObjectMeta{
		Name: s.name,
	},
	Spec: corev1.PersistentVolumeClaimSpec{
		AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
		Resources: corev1.VolumeResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceStorage: resource.MustParse(sizeStr),
			},


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cd /tmp && find . -type f -name "v3.go" -path "*/ledgers/*" 2>/dev/null | head -20

Repository: formancehq/operator

Length of output: 1882


🏁 Script executed:

find . -type f -name "v3.go" 2>/dev/null | head -20

Repository: formancehq/operator

Length of output: 97


🏁 Script executed:

find . -type f -path "*/resources/ledgers/v3.go" 2>/dev/null

Repository: formancehq/operator

Length of output: 97


🏁 Script executed:

git ls-files | grep -E "ledgers.*v3\.go"

Repository: formancehq/operator

Length of output: 95


🏁 Script executed:

git ls-files | grep -E "v3\.go" | head -10

Repository: formancehq/operator

Length of output: 95


🏁 Script executed:

sed -n '308,327p' internal/resources/ledgers/v3.go

Repository: formancehq/operator

Length of output: 733


🏁 Script executed:

sed -n '300,330p' internal/resources/ledgers/v3.go

Repository: formancehq/operator

Length of output: 1159


🌐 Web query:

k8s.io resource.MustParse panic behavior vs ParseQuantity

💡 Result:

In the k8s.io/apimachinery/pkg/api/resource package, MustParse(str string) Quantity parses the given string into a Quantity and panics if parsing fails (e.g., invalid format or numeric issues). It explicitly calls ParseQuantity internally and panics on any non-nil error with a message like "cannot parse '%v': %v". In contrast, ParseQuantity(str string) (Quantity, error) performs the same parsing but returns an error (such as ErrFormatWrong for invalid format or ErrNumeric for unable to parse numeric part) instead of panicking, allowing the caller to handle failures gracefully. Use MustParse only when the input is guaranteed valid (e.g., tests or hardcoded values). Use ParseQuantity for untrusted input to avoid panics. Note: Recent issues (as of late 2025) show edge cases near math.MaxInt64 where MustParse could produce invalid quantities without panicking, but this is a parsing bug being fixed, not a change in panic behavior for invalid inputs.



🌐 Web query:

kubernetes apimachinery resource.MustParse documentation

💡 Result:

resource.MustParse is a function in the Kubernetes apimachinery package at k8s.io/apimachinery/pkg/api/resource. It parses a string into a Quantity and panics on error, intended for cases where the input is known to be valid, such as tests. Signature: func MustParse(str string) Quantity It calls ParseQuantity internally and panics with fmt.Errorf("cannot parse '%v': %v", str, err) if parsing fails. Quantity represents fixed-point numbers for resources like CPU and memory, supporting formats: BinarySI (e.g., "5Gi"), DecimalSI (e.g., "5G"), DecimalExponent (e.g., "12e6"). Example: memorySize := resource.MustParse("5Gi") fmt.Printf("memorySize = %v (%v)\n", memorySize.Value, memorySize.Format) Usage in resource specs: Resources: v1.ResourceRequirements{ Limits: v1.ResourceList{ "cpu": resource.MustParse("1"), "memory": resource.MustParse("1Gi"), }, Requests: v1.ResourceList{ "cpu": resource.MustParse("500m"), "memory": resource.MustParse("512Mi"), }, },



🏁 Script executed:

grep -n "MustParse" internal/resources/ledgers/v3.go

Repository: formancehq/operator

Length of output: 125


🏁 Script executed:

grep -n "resource\." internal/resources/ledgers/v3.go | head -20

Repository: formancehq/operator

Length of output: 125


🏁 Script executed:

head -50 internal/resources/ledgers/v3.go

Repository: formancehq/operator

Length of output: 1540


Replace resource.MustParse with resource.ParseQuantity to avoid panics on invalid PVC size settings.

The sizeStr value comes from user-controlled settings via GetStringOrDefault, which means a typo in ledger.v3.persistence.*.size will cause the reconciler to panic instead of returning a validation error. Use resource.ParseQuantity to handle invalid input gracefully.

Safer parsing
 	for _, s := range specs {
 		sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
 		if err != nil {
 			return nil, err
 		}
+		sizeQty, err := resource.ParseQuantity(sizeStr)
+		if err != nil {
+			return nil, fmt.Errorf("invalid %s value %q: %w", s.sizeKey, sizeStr, err)
+		}
 
 		storageClass, err := settings.GetStringOrEmpty(ctx, stackName, strings.Split(s.storageClassKey, ".")...)
 		if err != nil {
 			return nil, err
 		}
@@
 			Spec: corev1.PersistentVolumeClaimSpec{
 				AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
 				Resources: corev1.VolumeResourceRequirements{
 					Requests: corev1.ResourceList{
-						corev1.ResourceStorage: resource.MustParse(sizeStr),
+						corev1.ResourceStorage: sizeQty,
 					},
 				},
 			},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:

	sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
	if err != nil {
		return nil, err
	}
	storageClass, err := settings.GetStringOrEmpty(ctx, stackName, strings.Split(s.storageClassKey, ".")...)
	if err != nil {
		return nil, err
	}
	pvc := corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{
			Name: s.name,
		},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			Resources: corev1.VolumeResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: resource.MustParse(sizeStr),
				},

After:

	sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
	if err != nil {
		return nil, err
	}
	sizeQty, err := resource.ParseQuantity(sizeStr)
	if err != nil {
		return nil, fmt.Errorf("invalid %s value %q: %w", s.sizeKey, sizeStr, err)
	}
	storageClass, err := settings.GetStringOrEmpty(ctx, stackName, strings.Split(s.storageClassKey, ".")...)
	if err != nil {
		return nil, err
	}
	pvc := corev1.PersistentVolumeClaim{
		ObjectMeta: metav1.ObjectMeta{
			Name: s.name,
		},
		Spec: corev1.PersistentVolumeClaimSpec{
			AccessModes: []corev1.PersistentVolumeAccessMode{corev1.ReadWriteOnce},
			Resources: corev1.VolumeResourceRequirements{
				Requests: corev1.ResourceList{
					corev1.ResourceStorage: sizeQty,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 308 - 327, The code currently
uses resource.MustParse(sizeStr) when building the PersistentVolumeClaim in the
function that reads sizeStr from settings.GetStringOrDefault; replace this with
resource.ParseQuantity(sizeStr), check the returned error, and return that error
(or a wrapped validation error) instead of allowing a panic, so the PVC
construction (corev1.PersistentVolumeClaim / corev1.PersistentVolumeClaimSpec)
only proceeds when the quantity is valid.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/04-Modules/03-Ledger.md`:
- Around line 104-110: Update the "Ledger v3" section to clarify that the global
"Requirements" above apply to v2 (not v3) by adding a short note or adjusting
headings; specifically modify the "Ledger v3" heading/content to include a
sentence like "Note: the top-level requirements apply to Ledger v2 — Ledger v3
uses embedded Pebble storage and does not require PostgreSQL or a message
broker." Ensure you reference the "Ledger v3" section and the existing
"Requirements" wording so readers won't misconfigure installs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 22a15665-0d3d-4966-a6ee-1cff0073e5a3

📥 Commits

Reviewing files that changed from the base of the PR and between 9ca91be and dd00278.

📒 Files selected for processing (2)
  • docs/04-Modules/03-Ledger.md
  • docs/09-Configuration reference/01-Settings.md
✅ Files skipped from review due to trivial changes (1)
  • docs/09-Configuration reference/01-Settings.md

- Add 20 Ginkgo integration tests for v3 StatefulSet reconciler
- Prefix all v3 settings with "module." for consistency
- Fix version gate to use semver.Major == "v3" (avoids breaking v2 tests)
- Remove redundant services.Create (GatewayHTTPAPI handles ClusterIP service)
- Update docs with correct settings key paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
internal/resources/ledgers/v3.go (3)

300-318: ⚠️ Potential issue | 🟠 Major

Avoid resource.MustParse on Settings input.

sizeStr is user-controlled here. A typo like 20GG will panic this reconcile path instead of returning a validation error.

🛡️ Safer quantity parsing
 	sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
 	if err != nil {
 		return nil, err
 	}
+	sizeQty, err := resource.ParseQuantity(sizeStr)
+	if err != nil {
+		return nil, fmt.Errorf("invalid %s value %q: %w", s.sizeKey, sizeStr, err)
+	}
 
 	storageClass, err := settings.GetStringOrEmpty(ctx, stackName, strings.Split(s.storageClassKey, ".")...)
 	if err != nil {
 		return nil, err
 	}
@@
 				Resources: corev1.VolumeResourceRequirements{
 					Requests: corev1.ResourceList{
-						corev1.ResourceStorage: resource.MustParse(sizeStr),
+						corev1.ResourceStorage: sizeQty,
 					},
 				},
 			},
Does k8s.io/apimachinery/pkg/api/resource.MustParse panic on invalid quantities?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 300 - 318, The code currently
calls resource.MustParse(sizeStr) when building the PersistentVolumeClaim ( PVC
creation using s.name and the sizeStr obtained via settings.GetStringOrDefault),
which will panic on invalid user input; replace MustParse with
resource.ParseQuantity and handle the error: call
resource.ParseQuantity(sizeStr), return a clear validation/error from the
reconcile function if parsing fails (with context including
s.sizeKey/stackName), and only use the returned Quantity in the PVC Spec when
parsing succeeds.

111-126: ⚠️ Potential issue | 🟠 Major

volumeClaimTemplates can't be treated as a live-reconciled setting.

internal/resources/ledgers/init.go watches Settings changes, and this callback rewrites t.Spec.VolumeClaimTemplates on every reconcile. Kubernetes rejects StatefulSet updates that mutate spec.volumeClaimTemplates, so changing module.ledger.v3.persistence.* after the first create will wedge reconciliation. Either make these settings create-time-only or manage PVC expansion outside the StatefulSet template.

Can Kubernetes StatefulSet spec.volumeClaimTemplates be changed after the StatefulSet is created?
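
A minimal model of the create-only approach, using toy types in place of `*appsv1.StatefulSet` (the real callback runs inside `core.CreateOrUpdate`; the field names here are illustrative, not the operator's actual code):

```go
package main

import "fmt"

// statefulSet is a toy stand-in for *appsv1.StatefulSet: "exists" models a
// non-zero creationTimestamp, and volumeClaimTemplates models
// spec.volumeClaimTemplates, which Kubernetes treats as immutable after create.
type statefulSet struct {
	exists               bool
	replicas             int
	volumeClaimTemplates []string
}

// mutate reconciles mutable fields on every pass but sets the volume claim
// templates only when the object is being created, so updates never touch the
// immutable field and the API server never rejects the patch.
func mutate(t *statefulSet, replicas int, desiredVCTs []string) {
	t.replicas = replicas
	if !t.exists {
		t.volumeClaimTemplates = desiredVCTs
	}
}

func main() {
	fresh := &statefulSet{}
	mutate(fresh, 3, []string{"wal", "data", "cold-cache"})
	fmt.Println(fresh.volumeClaimTemplates)

	existing := &statefulSet{exists: true, volumeClaimTemplates: []string{"wal"}}
	mutate(existing, 5, []string{"wal", "data", "cold-cache"})
	fmt.Println(existing.volumeClaimTemplates)
}
```

Changing PVC sizes after creation would then need a separate expansion path that patches the PVCs directly, outside the StatefulSet template.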
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 111 - 126, The reconcile
currently overwrites t.Spec.VolumeClaimTemplates inside the CreateOrUpdate
callback for the StatefulSet (see the anonymous func(t *appsv1.StatefulSet) in
CreateOrUpdate[*appsv1.StatefulSet]), which mutates spec.volumeClaimTemplates
and will be rejected by Kubernetes on updates; change the logic so
VolumeClaimTemplates are only set at creation time (or omitted from the Update
path): detect whether t.ObjectMeta.Generation or t.Spec.VolumeClaimTemplates is
already present and skip replacing t.Spec.VolumeClaimTemplates on updates, or
move PVC size/expansion handling out of the StatefulSet template into separate
PVC patch/expand code; also adjust the Settings watcher in
internal/resources/ledgers/init.go to treat module.ledger.v3.persistence.* as
create-time-only or trigger your separate PVC expansion flow instead of
reconciling the StatefulSet template.

262-265: ⚠️ Potential issue | 🟠 Major

Don't hardcode .cluster.local into peer addresses.

Clusters can run with a non-default DNS suffix, so these FQDNs break Raft discovery outside cluster.local. Make the cluster domain configurable or derive addresses without baking in the suffix.

Can Kubernetes clusters use a DNS cluster domain other than cluster.local?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 262 - 265, The peer FQDNs are
hardcoded with ".cluster.local" which breaks clusters using a different DNS
suffix; change the ADVERTISE_ADDR and BOOTSTRAP_ADDR string templates to not
embed ".cluster.local" and instead append a cluster domain variable (e.g., use
an env var like CLUSTER_DOMAIN or the Kubernetes-provided
KUBERNETES_SERVICE_DNS_DOMAIN with a sensible fallback) when building the
addresses; update the templates that reference ADVERTISE_ADDR and BOOTSTRAP_ADDR
(using headlessSvc, v3PortRaft, v3PortGRPC, POD_NAME and POD_NAMESPACE) to
include the cluster domain variable rather than the literal ".cluster.local" so
the domain is configurable at runtime.
internal/resources/ledgers/init.go (1)

44-46: ⚠️ Potential issue | 🟠 Major

Clean up v2 resources before taking the v3 fast-path.

This early return still skips the existing v2 delete path, so an in-place v2→v3 upgrade can leave the old Deployment, worker, and reindex CronJob running beside the new StatefulSet.

🧹 Suggested direction
 isV3 := semver.IsValid(version) && semver.Major(version) == "v3"
 if isV3 {
+	if err := cleanupV2Resources(ctx, stack, ledger); err != nil {
+		return err
+	}
 	return reconcileV3(ctx, stack, ledger, version)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/init.go` around lines 44 - 46, The early return
for the v3 fast-path (the isV3 check using semver.IsValid/semver.Major that
currently calls reconcileV3 and returns) skips the existing v2 delete path and
leaves v2 resources running; before returning from the v3 branch, invoke the v2
cleanup/delete logic (the existing v2 delete path that removes the v2
Deployment, worker, and reindex CronJob) so v2 resources are removed during an
in-place v2→v3 upgrade—i.e., call or reuse the v2 cleanup function/logic
immediately before or inside the isV3 branch and only then proceed to return
reconcileV3(ctx, stack, ledger, version).
🧹 Nitpick comments (2)
internal/tests/ledger_v3_controller_test.go (2)

163-169: Assert the generated ledger Service here too.

reconcileV3 now relies on the GatewayHTTPAPI reconciler to materialize the ClusterIP Service, so only checking the CR leaves the 8080 → targetPort http / container port 9000 compatibility path untested.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tests/ledger_v3_controller_test.go` around lines 163 - 169, The test
currently only asserts the GatewayHTTPAPI CR but not the generated ClusterIP
Service; update the test after loading the GatewayHTTPAPI (httpAPI) to also load
the Service resource named core.GetObjectName(stack.Name, "ledger") into a
v1.Service struct and assert that its spec.ports contains a port 8080 that
targets port named "http" (or targetPort 9000) so the 8080→targetPort mapping
created by the GatewayHTTPAPI reconciler (reconcileV3 path) is verified; use the
same Eventually/LoadResource pattern (or Expect) as used for httpAPI and
reference types GatewayHTTPAPI, reconcileV3, and the Service object to locate
where to add the assertion.

186-204: Add the even-replica rejection case alongside this happy path.

installV3StatefulSet explicitly rejects even replica counts, but this suite never proves that validation. A module.ledger.v3.replicas=4 case would lock in the behavior promised by the PR.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tests/ledger_v3_controller_test.go` around lines 186 - 204, Add a
parallel negative test that verifies even replica counts are rejected: create a
new Context using the same replicasSetting variable but set value "4" for the
key "module.ledger.v3.replicas" and in JustBeforeEach attempt to
Create(replicasSetting) and assert it returns an error (or that no StatefulSet
named "ledger" is ever created via LoadResource). Reference the existing symbols
replicasSetting, Create, Delete, LoadResource and the installV3StatefulSet
behavior to locate where to add an It block that expects creation to fail (or
that *sts.Spec.Replicas is not set) when the value is even.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 37140e16-8592-48e3-a693-5ce75bf95db7

📥 Commits

Reviewing files that changed from the base of the PR and between dd00278 and dd2cca4.

⛔ Files ignored due to path filters (1)
  • config/rbac/role.yaml is excluded by !**/*.yaml
📒 Files selected for processing (5)
  • docs/04-Modules/03-Ledger.md
  • docs/09-Configuration reference/01-Settings.md
  • internal/resources/ledgers/init.go
  • internal/resources/ledgers/v3.go
  • internal/tests/ledger_v3_controller_test.go
✅ Files skipped from review due to trivial changes (2)
  • docs/04-Modules/03-Ledger.md
  • docs/09-Configuration reference/01-Settings.md

Comment on lines +171 to +175
It("Should NOT create a Database object", func() {
	Consistently(func() error {
		return LoadResource("", core.GetObjectName(stack.Name, "ledger"), &v1beta1.Database{})
	}).ShouldNot(Succeed())
})

⚠️ Potential issue | 🟡 Minor

Wait for a positive v3 signal before asserting Database absence.

This can pass before the controller has finished its first reconcile. First wait for the StatefulSet or headless Service to exist, then keep asserting that the Database object stays absent for a short window.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tests/ledger_v3_controller_test.go` around lines 171 - 175, Modify
the "Should NOT create a Database object" test to first wait for a positive v3
signal (e.g., assert existence of the StatefulSet or the headless Service
created by the controller) before asserting Database absence: replace the
immediate Consistently call that uses LoadResource("",
core.GetObjectName(stack.Name, "ledger"), &v1beta1.Database{}) with a two-step
approach—1) Block until the StatefulSet or headless Service is observed (use the
test helper that GETs the StatefulSet/Service by name and expects it to
Eventually(Succeed)), then 2) run the Consistently loop that calls
LoadResource(..., &v1beta1.Database{}) and ShouldNot(Succeed()) to ensure the
Database remains absent for the short window; keep references to LoadResource,
core.GetObjectName(stack.Name, "ledger"), v1beta1.Database and the test "Should
NOT create a Database object" so the change is localized.

On scale-down, the preStop lifecycle hook:
1. Calls POST /_admin/deregister to remove the node from the Raft cluster
2. Cleans the WAL directory so future re-joins start as fresh learners

This ensures clean Raft membership management during StatefulSet scaling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
internal/resources/ledgers/v3.go (3)

323-342: ⚠️ Potential issue | 🟠 Major

Replace resource.MustParse for setting-derived PVC sizes.

sizeStr comes from Settings, so invalid values can panic the reconciler here. Use resource.ParseQuantity and return a validation error instead.

Safer quantity parsing
 	for _, s := range specs {
 		sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
 		if err != nil {
 			return nil, err
 		}
+		sizeQty, err := resource.ParseQuantity(sizeStr)
+		if err != nil {
+			return nil, fmt.Errorf("invalid %s value %q: %w", s.sizeKey, sizeStr, err)
+		}
@@
 				Resources: corev1.VolumeResourceRequirements{
 					Requests: corev1.ResourceList{
-						corev1.ResourceStorage: resource.MustParse(sizeStr),
+						corev1.ResourceStorage: sizeQty,
 					},
 				},
 			},
 		}
#!/bin/bash
set -euo pipefail

# Show setting-derived size flow and parse call site
rg -n 'GetStringOrDefault\(.*persistence.*size|MustParse\(|ParseQuantity\(' internal/resources/ledgers/v3.go -C3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 323 - 342, Replace the call to
resource.MustParse(sizeStr) inside the PVC creation (the code around sizeStr and
the corev1.PersistentVolumeClaimSpec in v3.go) with safe parsing: call
resource.ParseQuantity(sizeStr), handle the returned (q, err) and if err != nil
return a validation error (wrap or return err) instead of panicking; ensure the
parsed Quantity is used to populate the Requests map for corev1.ResourceStorage
so reconciler never panics on invalid setting-derived sizes.

269-273: ⚠️ Potential issue | 🟠 Major

Do not hardcode cluster.local in peer addresses.

Using fixed .svc.cluster.local breaks discovery on clusters with a custom DNS domain. Prefer namespace-local .svc addresses or make cluster domain configurable.

Safer address construction
-		fmt.Sprintf(`ADVERTISE_ADDR="${POD_NAME}.%s.${POD_NAMESPACE}.svc.cluster.local:%d"`, headlessSvc, v3PortRaft),
+		fmt.Sprintf(`ADVERTISE_ADDR="${POD_NAME}.%s.${POD_NAMESPACE}.svc:%d"`, headlessSvc, v3PortRaft),
 ...
-		fmt.Sprintf(`BOOTSTRAP_ADDR="ledger-0.%s.${POD_NAMESPACE}.svc.cluster.local:%d"`, headlessSvc, v3PortGRPC),
+		fmt.Sprintf(`BOOTSTRAP_ADDR="ledger-0.%s.${POD_NAMESPACE}.svc:%d"`, headlessSvc, v3PortGRPC),
#!/bin/bash
set -euo pipefail
rg -n 'cluster\.local' internal/resources/ledgers/v3.go -C2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 269 - 273, The peer address
strings currently hardcode ".svc.cluster.local" when building ADVERTISE_ADDR and
BOOTSTRAP_ADDR; change them to avoid the fixed cluster domain by either using
the namespace-local ".svc" suffix or reading a configurable CLUSTER_DOMAIN env
var. Update the fmt.Sprintf calls that build ADVERTISE_ADDR and BOOTSTRAP_ADDR
(the lines that reference headlessSvc, v3PortRaft and v3PortGRPC) to produce
"%s.${POD_NAMESPACE}.svc:%d" or "%s.${POD_NAMESPACE}.svc.%s:%d" driven by
CLUSTER_DOMAIN, and ensure any new env var is read early so the same variable is
used for both ADVERTISE_ADDR and BOOTSTRAP_ADDR.

111-127: ⚠️ Potential issue | 🟠 Major

Avoid reconciling volumeClaimTemplates after StatefulSet creation.

VolumeClaimTemplates is effectively immutable for normal StatefulSet updates. Rebuilding it from Settings in this mutate block can cause forbidden updates once the StatefulSet exists.

#!/bin/bash
set -euo pipefail

# Verify StatefulSet mutate path is always assigning VolumeClaimTemplates
rg -n 'CreateOrUpdate\[\*appsv1.StatefulSet\]|VolumeClaimTemplates|buildV3VolumeClaimTemplates' internal/resources/ledgers/v3.go -C3

# Verify persistence settings are consumed by v3 reconciler
rg -n 'module\.ledger\.v3\.persistence|buildV3VolumeClaimTemplates' internal/resources/ledgers/v3.go -C2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 111 - 127, The mutate block
currently unconditionally overwrites t.Spec.VolumeClaimTemplates causing
forbidden updates; modify the closure passed to
CreateOrUpdate[*appsv1.StatefulSet] so that you only assign VolumeClaimTemplates
when the StatefulSet is being created or has no existing templates (e.g., check
t.ObjectMeta.CreationTimestamp.IsZero() or len(t.Spec.VolumeClaimTemplates) ==
0) and otherwise leave t.Spec.VolumeClaimTemplates untouched; continue to use
buildV3VolumeClaimTemplates to construct the templates but only set them
conditionally inside the function literal for t *appsv1.StatefulSet.
internal/tests/ledger_v3_controller_test.go (1)

185-189: ⚠️ Potential issue | 🟡 Minor

Wait for a positive v3 signal before asserting Database absence.

This assertion can pass before reconciliation starts. First wait for StatefulSet/headless Service creation, then run Consistently on Database absence.

Suggested test stabilization
 It("Should NOT create a Database object", func() {
+	sts := &appsv1.StatefulSet{}
+	Eventually(func() error {
+		return LoadResource(stack.Name, "ledger", sts)
+	}).Should(Succeed())
+
 	Consistently(func() error {
 		return LoadResource("", core.GetObjectName(stack.Name, "ledger"), &v1beta1.Database{})
 	}).ShouldNot(Succeed())
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tests/ledger_v3_controller_test.go` around lines 185 - 189, The test
currently asserts Database absence immediately which can race with
reconciliation; modify the test ("Should NOT create a Database object") to first
wait for the positive v3 signal by polling for the created StatefulSet and
headless Service (e.g., use LoadResource/Expect/Eventually against
core.GetObjectName(stack.Name, "<statefulset-name>") and the headless Service)
and only after those resources exist run the Consistently check calling
LoadResource(..., &v1beta1.Database{}) to assert it does not appear; keep the
same resource-name construction via core.GetObjectName and the existing
Consistently block but gate it behind an initial Eventually/Wait for the
StatefulSet and headless Service to stabilize.
🧹 Nitpick comments (1)
internal/tests/ledger_v3_controller_test.go (1)

200-218: Add a negative-path test for even replica settings.

Current coverage proves custom odd replicas, but not enforcement failure for even values. Add a module.ledger.v3.replicas=4 case and assert reconcile does not produce the v3 StatefulSet.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/tests/ledger_v3_controller_test.go` around lines 200 - 218, Add a
negative-path test that ensures even replica values are rejected: create a new
Context similar to the existing "with custom replicas setting" but set
replicasSetting = settings.New(uuid.NewString(), "module.ledger.v3.replicas",
"4", stack.Name), call Create(replicasSetting) in JustBeforeEach and
Delete(replicasSetting) in AfterEach, then assert reconcile does not produce the
v3 StatefulSet by calling LoadResource(stack.Name, "ledger", sts) and expecting
it to return an error or for Eventually(func(g Gomega) error { return
LoadResource(stack.Name, "ledger", sts) }).ShouldNot(Succeed()) (or equivalent
negative assertion) so the test verifies no StatefulSet is created for an even
replica value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0ba146dd-79d7-439f-ad46-58146905853a

📥 Commits

Reviewing files that changed from the base of the PR and between dd2cca4 and 24a53f1.

📒 Files selected for processing (2)
  • internal/resources/ledgers/v3.go
  • internal/tests/ledger_v3_controller_test.go

Comment on lines +94 to +96
if replicas%2 == 0 {
	return fmt.Errorf("module.ledger.v3.replicas must be odd, got %d", replicas)
}

⚠️ Potential issue | 🟠 Major

Reject non-positive replica counts explicitly.

Odd-only validation allows invalid negative values (e.g., -1). Enforce replicas > 0 together with oddness to fail fast with a clear message.

Proposed guard
-	if replicas%2 == 0 {
-		return fmt.Errorf("module.ledger.v3.replicas must be odd, got %d", replicas)
+	if replicas <= 0 || replicas%2 == 0 {
+		return fmt.Errorf("module.ledger.v3.replicas must be a positive odd number, got %d", replicas)
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-if replicas%2 == 0 {
-	return fmt.Errorf("module.ledger.v3.replicas must be odd, got %d", replicas)
-}
+if replicas <= 0 || replicas%2 == 0 {
+	return fmt.Errorf("module.ledger.v3.replicas must be a positive odd number, got %d", replicas)
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 94 - 96, The current
validation only enforces oddness and allows non-positive values like -1; update
the guard around the replicas variable to first check replicas > 0 and return a
clear error (e.g., "module.ledger.v3.replicas must be positive, got %d") before
checking oddness, so the function handling replica validation (the block using
the replicas variable) fails fast with an explicit positive-count error and then
retains the existing oddness check.
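The combined guard is small enough to isolate and test directly. A minimal sketch, with an illustrative function name (the real check lives inline in the reconciler):

```go
package main

import "fmt"

// validateReplicas enforces both constraints from the review: Raft needs a
// positive, odd node count. Checking positivity explicitly also catches
// values like -1, which the oddness check alone would let through, since
// in Go -1 % 2 evaluates to -1, not 0.
func validateReplicas(replicas int) error {
	if replicas <= 0 || replicas%2 == 0 {
		return fmt.Errorf("module.ledger.v3.replicas must be a positive odd number, got %d", replicas)
	}
	return nil
}
```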

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gfyrag and others added 2 commits March 31, 2026 12:01
Add a `create <stack-name>` command to kubectl-stacks and a `just install-kubectl-stacks` recipe to build and install the plugin into $GOPATH/bin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (3)
internal/resources/ledgers/v3.go (3)

343-346: ⚠️ Potential issue | 🟠 Major

Use resource.ParseQuantity instead of resource.MustParse to avoid panics on invalid user input.

The sizeStr value comes from user-controlled settings. A typo like "10G1" will cause the reconciler to panic instead of returning a validation error.

Proposed fix
 	for _, s := range specs {
 		sizeStr, err := settings.GetStringOrDefault(ctx, stackName, s.defaultSize, strings.Split(s.sizeKey, ".")...)
 		if err != nil {
 			return nil, err
 		}
+		sizeQty, err := resource.ParseQuantity(sizeStr)
+		if err != nil {
+			return nil, fmt.Errorf("invalid %s value %q: %w", s.sizeKey, sizeStr, err)
+		}

 		storageClass, err := settings.GetStringOrEmpty(ctx, stackName, strings.Split(s.storageClassKey, ".")...)
 		if err != nil {
 			return nil, err
 		}

 		pvc := corev1.PersistentVolumeClaim{
 			// ...
 			Spec: corev1.PersistentVolumeClaimSpec{
 				// ...
 				Resources: corev1.VolumeResourceRequirements{
 					Requests: corev1.ResourceList{
-						corev1.ResourceStorage: resource.MustParse(sizeStr),
+						corev1.ResourceStorage: sizeQty,
 					},
 				},
 			},
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 343 - 346, The code uses
resource.MustParse(sizeStr) when setting corev1.ResourceStorage in the Resources
-> Requests block (corev1.VolumeResourceRequirements / corev1.ResourceList),
which can panic on invalid user input; replace MustParse with
resource.ParseQuantity(sizeStr), check the returned (quantity, err), and if err
!= nil return or surface a validation error from the reconciler (or the
surrounding function) instead of panicking so invalid sizes like "10G1" produce
a proper error path.

98-100: ⚠️ Potential issue | 🟠 Major

Reject non-positive replica counts explicitly.

The current check replicas%2 == 0 allows negative values (e.g., -1 % 2 = -1 in Go, which is not 0). Add an explicit positive check.

Proposed fix
-	if replicas%2 == 0 {
-		return fmt.Errorf("module.ledger.v3.replicas must be odd, got %d", replicas)
+	if replicas <= 0 || replicas%2 == 0 {
+		return fmt.Errorf("module.ledger.v3.replicas must be a positive odd number, got %d", replicas)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 98 - 100, The current check in
the ledger validation that only rejects even replica counts (using replicas%2 ==
0) misses negative or zero values; update the validation around the replicas
variable (in internal/resources/ledgers/v3.go where replicas is validated) to
first ensure replicas is positive (replicas > 0) and then ensure it is odd,
returning a clear fmt.Errorf for non-positive values (e.g.,
"module.ledger.v3.replicas must be positive, got %d") and keep the existing
oddness error (or combine checks into a single descriptive error message).

273-276: ⚠️ Potential issue | 🟠 Major

Hardcoded cluster.local DNS suffix may break on non-default cluster domains.

Kubernetes allows custom cluster domains via --cluster-domain. These addresses will fail peer discovery on such clusters. Consider using namespace-local names (without the full domain suffix) or making the domain configurable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/v3.go` around lines 273 - 276, The ADVERTISE_ADDR
and BOOTSTRAP_ADDR format strings hardcode "svc.cluster.local" which breaks
non-default cluster domains; update those lines (the fmt.Sprintf calls that
build ADVERTISE_ADDR and BOOTSTRAP_ADDR) to avoid the hardcoded suffix by either
using namespace-local DNS (e.g. use ".${POD_NAMESPACE}.svc:%d" instead of
".${POD_NAMESPACE}.svc.cluster.local:%d") or read a configurable cluster domain
env var (e.g. CLUSTER_DNS_SUFFIX with default "svc.cluster.local") and append
it, so the fmt.Sprintf calls reference the new suffix variable rather than the
fixed "svc.cluster.local".
🧹 Nitpick comments (1)
tools/kubectl-stacks/enable_module.go (1)

49-57: Derive module apiVersion from discovered CRD instead of hardcoding v1beta1.

Line 120 hardcodes v1beta1.GroupVersion, but the CRD discovery already retrieves group and version information. Currently all module CRDs use v1beta1, but this hardcoding can become stale if served versions change in the future.

Extract the group and version during discovery to construct the apiVersion dynamically. The CRD structure supports this—all module CRDs expose spec.group and spec.versions[].name with served/storage flags:

Refactor sketch
 type moduleCRD struct {
 	Kind   string
 	Plural string
+	APIVersion string
 }

 var crdList struct {
 	Items []struct {
 		Spec struct {
+			Group string `json:"group"`
 			Names struct {
 				Kind   string `json:"kind"`
 				Plural string `json:"plural"`
 			} `json:"names"`
+			Versions []struct {
+				Name    string `json:"name"`
+				Served  bool   `json:"served"`
+				Storage bool   `json:"storage"`
+			} `json:"versions"`
 		} `json:"spec"`
 	} `json:"items"`
 }

 for _, item := range crdList.Items {
+	ver := ""
+	for _, v := range item.Spec.Versions {
+		if v.Storage || (ver == "" && v.Served) {
+			ver = v.Name
+		}
+	}
 	modules = append(modules, moduleCRD{
 		Kind:   item.Spec.Names.Kind,
 		Plural: item.Spec.Names.Plural,
+		APIVersion: item.Spec.Group + "/" + ver,
 	})
 }

 // ...
 "apiVersion": mod.APIVersion,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/kubectl-stacks/enable_module.go` around lines 49 - 57, The code
currently hardcodes v1beta1.GroupVersion when building the module apiVersion;
instead, read the group and version discovered in the CRD discovery struct (the
anonymous crdList Items -> Spec fields) and construct the apiVersion dynamically
(e.g., "<group>/<version>") rather than using v1beta1.GroupVersion. Locate where
v1beta1.GroupVersion is referenced (the apiVersion construction around the
module creation at/near the use of v1beta1.GroupVersion) and replace it by
extracting spec.group and selecting the appropriate spec.versions[].name (use
the served and/or storage flags to prefer the served/storage version) from the
discovered CRD item, then build the apiVersion from those values and use that
string wherever the hardcoded v1beta1.GroupVersion was used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/04-Modules/03-Ledger.md`:
- Line 131: The config key documented at "modules.ledger.v3-mirror" is
inconsistent with other v3 keys which use the "module.ledger.v3.*" prefix;
update the docs (and any corresponding config examples) to use the singular
"module" prefix so the key becomes "module.ledger.v3-mirror" (or alternatively
change all other v3 keys to "modules" if code is authoritative), and ensure both
occurrences (the one at the reported line and the second occurrence at lines
~170) are adjusted to match the chosen consistent prefix.
- Around line 117-119: The fenced code block showing
"<v3-image-tag>:<ledger1>,<ledger2>..." must include a language specifier to
satisfy markdownlint MD040; update the triple-backtick fence to "```text" (or
"```plaintext") so the block becomes ```text followed by
<v3-image-tag>:<ledger1>,<ledger2>... and the closing ``` to ensure the code
block is annotated.

In `@internal/resources/ledgers/init.go`:
- Line 117: The setting key used when reading v3 mirror is inconsistent: the
call to settings.GetString(ctx, stack.Name, "modules", "ledger", "v3-mirror")
uses the "modules" prefix while other v3 settings use the "module" prefix (see
v3.go patterns); change the key to use the same prefix (e.g., "module",
"ledger", "v3-mirror") to align with the rest of the v3 settings, or if you
intend to keep a deliberate separation between feature toggles and core config,
add a comment explaining the distinction and update any related reads/writes to
match that convention (update usage around v3MirrorSetting and any tests/configs
that assume the key).

In `@Justfile`:
- Around line 102-103: The install-kubectl-stacks target may fail if
${GOPATH}/bin doesn't exist; update the install-kubectl-stacks recipe (the block
that does `cd tools/kubectl-stacks && go build -o {{env('GOPATH', `go env
GOPATH`)}}/bin/kubectl-stacks .`) to first create the bin directory (e.g., run
mkdir -p on the resolved {{env('GOPATH', `go env GOPATH`)}}/bin) before invoking
go build so the output path exists.

In `@tools/kubectl-stacks/create.go`:
- Around line 33-35: The client POST is using the wrong resource name case:
change the call to client.Post().Resource("Stacks").Body(stack) so that Resource
uses the CRD's exact plural lowercase name; replace "Stacks" with "stacks"
(locate the call in tools/kubectl-stacks/create.go where the POST is constructed
and update the Resource(...) argument).

---

Duplicate comments:
In `@internal/resources/ledgers/v3.go`:
- Around line 343-346: The code uses resource.MustParse(sizeStr) when setting
corev1.ResourceStorage in the Resources -> Requests block
(corev1.VolumeResourceRequirements / corev1.ResourceList), which can panic on
invalid user input; replace MustParse with resource.ParseQuantity(sizeStr),
check the returned (quantity, err), and if err != nil return or surface a
validation error from the reconciler (or the surrounding function) instead of
panicking so invalid sizes like "10G1" produce a proper error path.
- Around line 98-100: The current check in the ledger validation that only
rejects even replica counts (using replicas%2 == 0) misses negative or zero
values; update the validation around the replicas variable (in
internal/resources/ledgers/v3.go where replicas is validated) to first ensure
replicas is positive (replicas > 0) and then ensure it is odd, returning a clear
fmt.Errorf for non-positive values (e.g., "module.ledger.v3.replicas must be
positive, got %d") and keep the existing oddness error (or combine checks into a
single descriptive error message).
- Around line 273-276: The ADVERTISE_ADDR and BOOTSTRAP_ADDR format strings
hardcode "svc.cluster.local" which breaks non-default cluster domains; update
those lines (the fmt.Sprintf calls that build ADVERTISE_ADDR and BOOTSTRAP_ADDR)
to avoid the hardcoded suffix by either using namespace-local DNS (e.g. use
".${POD_NAMESPACE}.svc:%d" instead of ".${POD_NAMESPACE}.svc.cluster.local:%d")
or read a configurable cluster domain env var (e.g. CLUSTER_DNS_SUFFIX with
default "svc.cluster.local") and append it, so the fmt.Sprintf calls reference
the new suffix variable rather than the fixed "svc.cluster.local".
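The replica-count and storage-size findings above can be sketched together as one settings-validation helper. This is a stand-in, not the reconciler's code: the real fix should call `resource.ParseQuantity` from `k8s.io/apimachinery` and propagate its error; the regexp here is a simplified approximation so the sketch stays self-contained.

```go
package main

import (
	"fmt"
	"regexp"
)

// quantityRe loosely approximates Kubernetes quantity syntax; the actual
// reconciler should use resource.ParseQuantity and return its error
// instead of panicking via MustParse.
var quantityRe = regexp.MustCompile(`^[0-9]+(\.[0-9]+)?(Ki|Mi|Gi|Ti|k|M|G|T)?$`)

// validateV3Settings rejects non-positive and even replica counts (Raft
// quorum needs a positive odd node count) and malformed PVC sizes,
// surfacing validation errors rather than panicking on user input.
func validateV3Settings(replicas int, sizeStr string) error {
	if replicas <= 0 {
		return fmt.Errorf("module.ledger.v3.replicas must be positive, got %d", replicas)
	}
	if replicas%2 == 0 {
		return fmt.Errorf("module.ledger.v3.replicas must be odd, got %d", replicas)
	}
	if !quantityRe.MatchString(sizeStr) {
		return fmt.Errorf("invalid storage size %q", sizeStr)
	}
	return nil
}

func main() {
	fmt.Println(validateV3Settings(3, "10Gi")) // <nil>
	fmt.Println(validateV3Settings(0, "10Gi"))
	fmt.Println(validateV3Settings(3, "10G1"))
}
```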

---

Nitpick comments:
In `@tools/kubectl-stacks/enable_module.go`:
- Around line 49-57: The code currently hardcodes v1beta1.GroupVersion when
building the module apiVersion; instead, read the group and version discovered
in the CRD discovery struct (the anonymous crdList Items -> Spec fields) and
construct the apiVersion dynamically (e.g., "<group>/<version>") rather than
using v1beta1.GroupVersion. Locate where v1beta1.GroupVersion is referenced (the
apiVersion construction around the module creation at/near the use of
v1beta1.GroupVersion) and replace it by extracting spec.group and selecting the
appropriate spec.versions[].name (use the served and/or storage flags to prefer
the served/storage version) from the discovered CRD item, then build the
apiVersion from those values and use that string wherever the hardcoded
v1beta1.GroupVersion was used.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a469327b-f81d-4d97-a617-9bc29ac74400

📥 Commits

Reviewing files that changed from the base of the PR and between 24a53f1 and 8a63607.

⛔ Files ignored due to path filters (1)
  • helm/operator/templates/gen/rbac.authorization.k8s.io_v1_clusterrole_formance-manager-role.yaml is excluded by !**/gen/**, !**/*.yaml, !**/gen/**
📒 Files selected for processing (11)
  • .gitignore
  • Justfile
  • docs/04-Modules/03-Ledger.md
  • helm/crds/.helmignore
  • helm/operator/.helmignore
  • internal/resources/ledgers/init.go
  • internal/resources/ledgers/v3.go
  • tools/kubectl-stacks/apiextensions.go
  • tools/kubectl-stacks/create.go
  • tools/kubectl-stacks/enable_module.go
  • tools/kubectl-stacks/main.go
✅ Files skipped from review due to trivial changes (3)
  • .gitignore
  • helm/crds/.helmignore
  • helm/operator/.helmignore

Comment on lines +117 to +119
```
<v3-image-tag>:<ledger1>,<ledger2>,...
```

⚠️ Potential issue | 🟡 Minor

Add language specifier to fenced code block.

The fenced code block should have a language specified per markdownlint MD040. Use text or plaintext for format descriptions.

-```
+```text
 <v3-image-tag>:<ledger1>,<ledger2>,...
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 117-117: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/04-Modules/03-Ledger.md` around lines 117 - 119, The fenced code block
showing "<v3-image-tag>:<ledger1>,<ledger2>..." must include a language
specifier to satisfy markdownlint MD040; update the triple-backtick fence to
"```text" (or "```plaintext") so the block becomes ```text followed by
<v3-image-tag>:<ledger1>,<ledger2>... and the closing ``` to ensure the code
block is annotated.


// If v3 mirror is configured, deploy the v3 StatefulSet and provision mirror ledgers.
// The setting value is "image:ledger1,ledger2,..." (e.g. "v3.0.0-alpha.1:default,payments").
v3MirrorSetting, err := settings.GetString(ctx, stack.Name, "modules", "ledger", "v3-mirror")
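The `"image:ledger1,ledger2,..."` format described in the comment above could be parsed roughly as follows. This is a sketch of the documented format only, not the operator's actual `parseV3MirrorSetting` implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// parseV3Mirror splits an "image:ledger1,ledger2,..." setting value into
// the v3 image tag and the list of ledgers to mirror, rejecting values
// that lack either half.
func parseV3Mirror(value string) (image string, ledgers []string, err error) {
	image, rest, ok := strings.Cut(value, ":")
	if !ok || image == "" || rest == "" {
		return "", nil, fmt.Errorf("invalid v3-mirror value %q, expected image:ledger1,ledger2", value)
	}
	return image, strings.Split(rest, ","), nil
}

func main() {
	img, ledgers, err := parseV3Mirror("v3.0.0-alpha.1:default,payments")
	fmt.Println(img, ledgers, err)
}
```

Note that `strings.Cut` splits at the first `:`, so image tags containing a colon (e.g. a registry port) would need a different separator strategy.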

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check setting key prefixes in v3-related code
rg -n "module.*ledger.*v3" internal/resources/ledgers/ --type go

Repository: formancehq/operator

Length of output: 1183


🏁 Script executed:

# Search for all "modules" prefix usage in settings
rg -n '"modules"' internal/resources/ledgers/ --type go -B 2 -A 2

Repository: formancehq/operator

Length of output: 565


🏁 Script executed:

# Search for the parseV3MirrorSetting function to understand the expected format
rg -n "parseV3MirrorSetting" internal/resources/ledgers/init.go --type go -A 15

Repository: formancehq/operator

Length of output: 1019


🏁 Script executed:

# Search for broader usage of "modules" prefix in settings
rg -n 'settings\.Get.*"modules"' --type go

Repository: formancehq/operator

Length of output: 377


🏁 Script executed:

# Check if there are any other places where "modules" prefix is used (not just GetString)
rg -n '"modules"' --type go -C 1

Repository: formancehq/operator

Length of output: 919


🏁 Script executed:

# Check if there are more v3 settings in v3.go that might give context
rg -n 'GetString.*"module".*"v3' internal/resources/ledgers/v3.go --type go

Repository: formancehq/operator

Length of output: 266


🏁 Script executed:

# Look at the broader context around the v3-mirror setting to understand the intent
sed -n '110,160p' internal/resources/ledgers/init.go

Repository: formancehq/operator

Length of output: 1530


🏁 Script executed:

# Check if there's any documentation about the setting naming conventions
find . -type f \( -name "*.md" -o -name "*.txt" -o -name "*.rst" \) | xargs grep -l "modules\|module" 2>/dev/null | head -5

Repository: formancehq/operator

Length of output: 180


🏁 Script executed:

# Check the git history or comments around the v3-mirror setting introduction
git log --oneline -n 20 -- internal/resources/ledgers/init.go | head -10

Repository: formancehq/operator

Length of output: 89


🏁 Script executed:

# Check the ledger documentation for setting naming patterns
grep -n "modules\|module" docs/04-Modules/03-Ledger.md | head -20

Repository: formancehq/operator

Length of output: 1651


🏁 Script executed:

# Check if there's any context in the Stack documentation
grep -n "modules\|module" docs/04-Modules/01-Stack.md | head -20

Repository: formancehq/operator

Length of output: 576


Setting key prefix inconsistency: modules vs module.

The v3-mirror setting uses modules.ledger.v3-mirror (plural), while other v3 settings in v3.go use module.ledger.v3.* (singular). This inconsistency is documented in the official ledger module documentation, suggesting it may be an intentional design choice (feature toggles under modules, core configuration under module). However, aligning all v3 settings under a single prefix would improve consistency and reduce confusion for users configuring their stacks.

Consider standardizing the naming convention across all v3 settings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/resources/ledgers/init.go` at line 117, The setting key used when
reading v3 mirror is inconsistent: the call to settings.GetString(ctx,
stack.Name, "modules", "ledger", "v3-mirror") uses the "modules" prefix while
other v3 settings use the "module" prefix (see v3.go patterns); change the key
to use the same prefix (e.g., "module", "ledger", "v3-mirror") to align with the
rest of the v3 settings, or if you intend to keep a deliberate separation
between feature toggles and core config, add a comment explaining the
distinction and update any related reads/writes to match that convention (update
usage around v3MirrorSetting and any tests/configs that assume the key).

Comment on lines +102 to +103
install-kubectl-stacks:
  cd tools/kubectl-stacks && go build -o {{env('GOPATH', `go env GOPATH`)}}/bin/kubectl-stacks .

⚠️ Potential issue | 🟡 Minor

Ensure ${GOPATH}/bin exists before writing the binary.

Line 103 can fail on clean environments where ${GOPATH}/bin is missing.

Proposed fix
 install-kubectl-stacks:
-  cd tools/kubectl-stacks && go build -o {{env('GOPATH', `go env GOPATH`)}}/bin/kubectl-stacks .
+  mkdir -p {{env('GOPATH', `go env GOPATH`)}}/bin
+  cd tools/kubectl-stacks && go build -o {{env('GOPATH', `go env GOPATH`)}}/bin/kubectl-stacks .
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
install-kubectl-stacks:
  mkdir -p {{env('GOPATH', `go env GOPATH`)}}/bin
  cd tools/kubectl-stacks && go build -o {{env('GOPATH', `go env GOPATH`)}}/bin/kubectl-stacks .
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Justfile` around lines 102 - 103, The install-kubectl-stacks target may fail
if ${GOPATH}/bin doesn't exist; update the install-kubectl-stacks recipe (the
block that does `cd tools/kubectl-stacks && go build -o {{env('GOPATH', `go env
GOPATH`)}}/bin/kubectl-stacks .`) to first create the bin directory (e.g., run
mkdir -p on the resolved {{env('GOPATH', `go env GOPATH`)}}/bin) before invoking
go build so the output path exists.

Comment on lines +33 to +35
return client.Post().
Resource("Stacks").
Body(stack).

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify create command resource path usage
rg -n 'Resource\("Stacks"\)|Resource\("stacks"\)' tools/kubectl-stacks/create.go

# Verify Stack CRD plural name
rg -n -C2 'name:\s*stacks\.formance\.com|plural:\s*stacks' config/crd/bases/formance.com_stacks.yaml

Repository: formancehq/operator

Length of output: 333


Change Resource("Stacks") to Resource("stacks") to match the CRD plural name.

The Kubernetes API requires the resource name to match the CRD's plural form exactly, which is lowercase stacks as defined in the stack CRD. Using the incorrect case will result in API calls hitting a non-existent endpoint and failing.

Proposed fix
 return client.Post().
-		Resource("Stacks").
+		Resource("stacks").
 		Body(stack).
 		Do(cmd.Context()).
 		Error()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return client.Post().
	Resource("stacks").
	Body(stack).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/kubectl-stacks/create.go` around lines 33 - 35, The client POST is
using the wrong resource name case: change the call to
client.Post().Resource("Stacks").Body(stack) so that Resource uses the CRD's
exact plural lowercase name; replace "Stacks" with "stacks" (locate the call in
tools/kubectl-stacks/create.go where the POST is constructed and update the
Resource(...) argument).

gfyrag and others added 2 commits April 1, 2026 10:32
Tests were written for a standalone v3 mode that doesn't exist. The v3
deployment is triggered via the modules.ledger.v3-mirror setting as a
mirror sidecar, not as a version-based dispatch. Fix CLUSTER_ID
expectation, health endpoint, database creation, image name, and add
mirror provisioning job test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>