Production / ISO: Users access Bloud at http://bloud.local (port 80). Native NixOS Traefik service binds directly to port 80 via CAP_NET_BIND_SERVICE. No iptables redirect needed.
Local dev (NixOS dev-server): Access via http://localhost (port 80, Traefik). The dev-server NixOS config runs Traefik on port 80.
- Service worker is registered on the Traefik port
- Iframe content is served through Traefik
- Everything is same-origin
- NEVER access Vite directly on port 5173
The architecture is: Browser → port 80 → Traefik → Vite/Apps.
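As a quick sanity check, the rule can be expressed as pure string logic. This is a hypothetical helper, not part of the repo; the URL patterns are assumptions based on the rule above.

```shell
# Flag URLs that violate the "always go through Traefik" rule.
check_url() {
  case "$1" in
    *:5173*) echo "BAD: direct Vite access" ;;                 # never allowed
    http://localhost*|http://bloud.local*) echo "OK: via Traefik on port 80" ;;
    *) echo "unknown" ;;
  esac
}

check_url "http://localhost/embed/radarr/"    # OK: via Traefik on port 80
check_url "http://localhost:5173/src/main.ts" # BAD: direct Vite access
```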
THIS IS NON-NEGOTIABLE. Do not skip these steps.
Before proposing any fix or making claims about root causes:
- Gather actual evidence by running commands, adding logs, and observing output
- Explain what evidence was gathered and what it shows
- Walk through the reasoning step by step
- Only then propose changes, with clear justification tied to the evidence
- Do not propose changes based on assumptions or theories
- Do not claim to know the cause without evidence
- If asked "why is this needed?", have concrete evidence ready
- A plausible-sounding theory is NOT evidence
NEVER do this:
"The issue is probably X because Y could happen" → proposes code change
ALWAYS do this:
"I suspect X. Let me add logging to verify" → gathers data → shows output →
confirms/refutes theory with evidence → THEN proposes fix
Real example of what NOT to do:
- User reports 404 errors on /api/v3/* requests
- BAD: "The issue is the SW update clears the clientAppMap" → proposes fix
- GOOD: "Let me add debug logging to see what clientId and clientApp values are" → observes: before SW update clientApp='radarr', after update clientApp=null → "Evidence confirms the SW update clears the map" → proposes fix
When debugging:
- State what you're checking and why
- Run the command or add the logging
- Explain what the output means
- Then decide next steps
- `nixos/bloud.nix` - the primary module for local testing with rootless podman
- Apps live in `apps/<name>/`, each with:
  - `metadata.yaml` - app catalog info (name, description, integrations, etc.)
  - `module.nix` - NixOS module for the app
  - `configurator.go` - Go configurator for runtime integrations
- `nixos/lib/podman-service.nix` - creates systemd user services for podman containers
- `nixos/lib/authentik-blueprint.nix` - generates Authentik OAuth2 blueprints
- Services can be in `failed` state from previous runs; `inactive/dead` means not started, not necessarily broken
- Check `journalctl --user -u <service>` for actual errors
- Check service status: `systemctl --user list-units 'podman-*.service' --all`
- Check logs: `journalctl --user -u podman-<name>.service`
- Check container state: `podman ps -a`
- Check from the container's UID perspective: `podman unshare ls -la <path>`
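The checks above can be bundled into a dry-run helper that prints the commands for one app in order (hypothetical, not in the repo; pipe the output to `sh` to actually run them):

```shell
# Print the diagnostic commands for one app, in the order listed above.
debug_cmds() {
  printf '%s\n' \
    "systemctl --user list-units 'podman-$1*' --all" \
    "journalctl --user -u podman-$1.service" \
    "podman ps -a --filter name=$1" \
    "podman unshare ls -la <data-path>"
}

debug_cmds authentik
```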
- Stale data with wrong permissions from previous runs
- Services staying in failed state after cleanup (need manual restart or rebuild)
- UID mapping: host user maps to root inside container with rootless podman
With rootless podman, UIDs are remapped:
- Host UID 1000 (daniel) → Container UID 0 (root)
- Container UID 1000 → Host UID 100999 (from subuid range)
Problem: Containers running as non-root users (e.g., Authentik runs as UID 1000) can't write to directories owned by the host user.
Solution: Use --userns=keep-id which maps Host UID 1000 → Container UID 1000 (preserves UID).
Files created by containers may be owned by mapped UIDs that the host user can't delete.
Solution: Use podman unshare rm -rf <path> to delete from the container's UID namespace.
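The default mapping can be written out as arithmetic. This sketch assumes the host user's `/etc/subuid` entry starts at 100000 (consistent with the 100999 example above); it is illustrative, not repo code.

```shell
# UID mapping arithmetic under the default rootless-podman mapping.
subuid_base=100000
host_uid=1000   # the host user (daniel)

map_to_host() {  # container UID -> host UID
  if [ "$1" -eq 0 ]; then
    echo "$host_uid"                   # container root is the host user
  else
    echo $(( subuid_base + $1 - 1 ))   # all other UIDs come from the subuid range
  fi
}

map_to_host 0      # prints: 1000
map_to_host 1000   # prints: 100999
```

With `--userns=keep-id` this remapping is bypassed for the host user's own UID, which is why containers running as UID 1000 can then write to host-owned directories.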
- `After=` + `Wants=`: ordering only, doesn't wait for health
- `Requires=`: hard dependency, the service fails if the dependency fails
- For oneshot services with `RemainAfterExit=true`, dependent services wait for completion
The mkPodmanService helper supports:
- `waitFor` - list of `{container, command}` pairs to health-check before starting
- `extraAfter`/`extraRequires` - additional systemd dependencies
Example:
mkPodmanService {
name = "my-app";
waitFor = [
{ container = "postgres"; command = "pg_isready -U user"; }
{ container = "redis"; command = "redis-cli ping"; }
];
extraAfter = [ "my-init.service" ];
extraRequires = [ "my-init.service" ];
}

Design Principle: Each Bloud host runs a maximum of one instance of each core infrastructure service:
- 1 PostgreSQL instance per host - All apps requiring PostgreSQL share this single instance
- 1 Redis instance per host - All apps requiring Redis share this single instance (currently used by Authentik)
- 1 Restic instance per host - Single backup service for all app data (not yet implemented)
Benefits:
- Resource efficiency: Lower RAM and CPU usage vs. per-app instances
- Simplified operations: One service to monitor, backup, and maintain
- Better performance: Shared connection pooling and caching
- Data consistency: Single source of truth
Implementation:
- Apps connect via environment variables to shared services
- NixOS modules ensure only one instance is created per host
- Service dependencies ensure apps wait for shared infrastructure
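To make the wiring concrete, here is how an app might assemble its connection string from the shared PostgreSQL instance. The variable names and values are assumptions for illustration, not the repo's actual contract:

```shell
# Hypothetical env vars injected for an app that uses the shared PostgreSQL.
POSTGRES_HOST=127.0.0.1
POSTGRES_PORT=5432
APP_NAME=authentik   # each app gets its own database inside the shared instance

DATABASE_URL="postgres://${APP_NAME}@${POSTGRES_HOST}:${POSTGRES_PORT}/${APP_NAME}"
echo "$DATABASE_URL"   # prints: postgres://authentik@127.0.0.1:5432/authentik
```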
CRITICAL CONSTRAINT: No app-specific routes at root level.
All embedded apps MUST be served under /embed/{appName}/ paths. URL rewriting via service worker handles apps that use absolute paths.
See docs/embedded-app-routing.md for full details.
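The rewrite the service worker performs can be sketched as pure string logic. The real implementation lives in the service worker (see docs/embedded-app-routing.md); this shell rendition is illustrative only:

```shell
# Rewrite an app's absolute path onto its /embed/ prefix.
rewrite() {
  app="$1"; path="$2"
  case "$path" in
    /embed/*) echo "$path" ;;               # already prefixed, leave untouched
    /*)       echo "/embed/${app}${path}" ;; # absolute path: prepend the prefix
    *)        echo "$path" ;;               # relative paths resolve fine as-is
  esac
}

rewrite radarr /api/v3/movie        # prints: /embed/radarr/api/v3/movie
rewrite radarr /embed/radarr/x.js   # prints: /embed/radarr/x.js
```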
Go and npm are built outside the Nix sandbox using their native toolchains. Nix only packages the pre-built artifacts into the ISO.
Why: Nix's sandbox blocks network access during builds. buildGoModule and buildNpmPackage work around this with fixed-output derivations that require pre-declared hashes (vendorHash, npmDepsHash). These hashes break on every dependency change. Since Go builds are already reproducible (pinned by go.sum + Go version) and npm builds by package-lock.json, building inside the sandbox adds no meaningful reproducibility — only fragility.
How it works:
- CI builds the Go binary and frontend with native toolchains (see `.github/workflows/build-iso.yml`)
- Artifacts are placed in `build/host-agent` and `build/frontend/`
- `nixos/packages/host-agent.nix` packages them into a Nix derivation (just file copying, no compilation)
- If the artifacts don't exist, a stub derivation is used so `nix flake check` still passes
Local ISO builds require building the artifacts first:
mkdir -p build
cd services/host-agent && CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o ../../build/host-agent ./cmd/host-agent
cd ../.. && npm ci && npm run build --workspace=services/host-agent/web
cp -r services/host-agent/web/build build/frontend
git add -f build/ # Nix flakes only see git-tracked files
nix build .#packages.x86_64-linux.iso

Local dev is unaffected: ./bloud start uses go-watch/Vite directly and never touches Nix package builds.
Four issues arise when BLOUD_FLAKE_PATH points to a bundled store path (e.g. /nix/store/<hash>-bloud-host-agent-0.1.0/share/bloud):
1. Re-exec step (_NIXOS_REBUILD_REEXEC=1)
nixos-rebuild builds `$flake#...nixos-rebuild` and re-execs from it before switching. Setting `_NIXOS_REBUILD_REEXEC=1` skips this optimization step (it is safe to skip). The variable must be passed inline via `sudo env`, since sudo strips environment variables.
2. Nix treats store paths as already-built (path: prefix)
nix build /nix/store/hash/subdir#attr treats the whole URI as a store path reference, returning the STORE ROOT (/nix/store/hash) directly without evaluating the flake at all. This causes switch-to-configuration to be looked up in the host-agent package (which doesn't have it).
Fix: Use path:/nix/store/.../share/bloud — the path: URI scheme forces Nix to evaluate the flake.nix properly. See flakeURI() in services/host-agent/internal/nixgen/rebuild.go.
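The fix boils down to a prefix check. This is a shell rendition of the `flakeURI()` idea from rebuild.go, for illustration only (the real function is Go):

```shell
# Force proper flake evaluation for flake refs that are bare store paths.
flake_uri() {
  case "$1" in
    /nix/store/*) echo "path:$1" ;;   # path: makes Nix evaluate flake.nix
    *)            echo "$1" ;;        # other refs pass through unchanged
  esac
}

flake_uri /nix/store/abc123-bloud-host-agent-0.1.0/share/bloud
# prints: path:/nix/store/abc123-bloud-host-agent-0.1.0/share/bloud
flake_uri .   # prints: .
```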
3. App module.nix files not in store
bloud.nix imports ../apps/*/module.nix to load app NixOS modules. The original package only copied metadata.yaml and icon.png from each app. Without module.nix, bloud.apps.* options are undefined, causing eval failure.
Fix: nixos/packages/host-agent.nix now copies module.nix alongside metadata.
4. Host-agent package defaults to stub from store
When packages/host-agent.nix is evaluated from the store, ../../build doesn't exist, so hasPrebuilt=false and the stub is used. The stub fails to build, blocking nix build system.build.toplevel.
Fix: Detect when running from a deployed store path (binary exists 4 dirs up at ../../../../bin/host-agent) and use builtins.storePath to reference the already-deployed package without rebuilding.
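The detection reduces to a single existence check relative to the evaluated directory. The helper name is hypothetical; the four-directories-up layout is taken from the description above:

```shell
# True when evaluated from a deployed store path: the already-built binary
# sits four directories up from the packages dir.
is_deployed() {
  [ -x "$1/../../../../bin/host-agent" ]
}
```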
systemd services run with a stripped PATH that excludes /run/wrappers/bin (sudo) and /run/current-system/sw/bin (nixos-rebuild, systemctl, etc.). Always use absolute paths in any code that runs inside a systemd service:
- `sudo` → `/run/wrappers/bin/sudo`
- `nixos-rebuild` → `/run/current-system/sw/bin/nixos-rebuild`
- `systemctl` → `/run/current-system/sw/bin/systemctl`
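A defensive pattern is to resolve the absolute path up front. This helper is hypothetical (not in the repo): it prefers the locations that systemd's stripped PATH omits and falls back to a normal PATH lookup for interactive shells:

```shell
# Resolve a binary to an absolute path, trying the NixOS system dirs first.
resolve_bin() {
  for dir in /run/wrappers/bin /run/current-system/sw/bin; do
    if [ -x "$dir/$1" ]; then
      echo "$dir/$1"
      return 0
    fi
  done
  command -v "$1"   # fallback: ordinary PATH lookup
}

resolve_bin systemctl
```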
The host-agent API (localhost:3000) uses session cookie auth. Requests from 127.0.0.1 (loopback) bypass auth automatically — shell access to the machine implies CLI trust. This is how ./bloud install works: it SSHes into the VM and curls localhost:3000 directly.
External requests (through Traefik or from the browser) still require a valid session cookie.
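The trust decision can be sketched as a check on the remote address. The real check is Go inside the host-agent; the function and labels here are made up for illustration:

```shell
# Decide how a request is authenticated based on its source address.
auth_mode() {
  case "$1" in
    127.*|::1) echo "trusted (loopback)" ;;        # shell access implies CLI trust
    *)         echo "session cookie required" ;;   # everything else needs auth
  esac
}

auth_mode 127.0.0.1   # prints: trusted (loopback)
auth_mode 10.0.0.42   # prints: session cookie required
```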
The ./bloud CLI has two modes, detected automatically:
- Native NixOS mode (default) — runs dev services directly on a NixOS machine with hot reload
- Proxmox mode (when `BLOUD_PVE_HOST` is set) — deploys the ISO to a Proxmox host for integration testing
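The detection logic amounts to one environment check. This sketch assumes `BLOUD_PVE_HOST` is the only input to the decision, as described above:

```shell
# Pick the CLI mode from the environment.
detect_mode() {
  if [ -n "${BLOUD_PVE_HOST:-}" ]; then
    echo proxmox
  else
    echo native
  fi
}

unset BLOUD_PVE_HOST
detect_mode                       # prints: native
BLOUD_PVE_HOST=root@10.0.0.165
detect_mode                       # prints: proxmox
```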
Requires a NixOS machine (physical or VM) with the dev-server flake configuration applied.
npm run setup # Installs deps + builds ./bloud CLI
./bloud setup    # Checks prerequisites and applies NixOS configuration

Native NixOS mode (development):
./bloud start # Start dev environment
./bloud stop # Stop dev services
./bloud status # Show dev environment status
./bloud logs # Show logs from dev services
./bloud attach # Attach to tmux session (Ctrl-B D to detach)
./bloud shell [cmd] # Run a command (or open a shell)
./bloud rebuild  # Rebuild NixOS configuration

Proxmox mode (ISO integration testing, requires BLOUD_PVE_HOST):
./bloud start [iso] # Deploy ISO → create VM → boot → check (VM stays running)
./bloud start --skip-deploy # Reuse existing VM, re-run checks
./bloud stop # Stop VM
./bloud destroy # Destroy VM
./bloud status # Show VM and service status
./bloud logs # Stream VM journalctl
./bloud shell [cmd] # SSH into VM
./bloud checks # Run health checks against running VM
./bloud install <app> # Install app via API
./bloud uninstall <app> # Uninstall app via API

Set BLOUD_PVE_HOST in your .env file or environment, then:
./bloud start # Test latest GitHub release (VM stays running after checks)
./bloud start ./bloud.iso # Test a local ISO
./bloud shell # SSH into the running VM
./bloud logs # Stream journalctl output
./bloud checks # Re-run health checks against a running VM
./bloud install <app> # Install an app on the running VM
./bloud destroy         # Tear down the VM when done

BLOUD_PVE_HOST can be set in a .env file at the project root — the CLI loads it automatically:
BLOUD_PVE_HOST=root@10.0.0.165
If you modify .nix files (like adding new apps):
./bloud rebuild # Apply NixOS changes