Absolute CUBEJS_SCHEMA_PATH silently loads zero data model files (FileRepository uses path.join instead of path.resolve) #10746

@samjewell

Describe the bug

Setting CUBEJS_SCHEMA_PATH (or schemaPath) to an absolute path silently loads zero data model files. /cubejs-api/v1/meta returns {"cubes":[]} and there are no errors, warnings, or hints in the cube logs.

The cause is in FileRepository:

https://github.com/cube-js/cube/blob/master/packages/cubejs-backend-shared/src/FileRepository.ts#L22-L35

public localPath(): string {
  return path.join(process.cwd(), this.repositoryPath);
}

protected async getFiles(dir: string, fileList: string[] = []): Promise<string[]> {
  ...
  const fullPath = path.join(this.localPath(), dir);
  await fs.ensureDir(fullPath);
  files = await fs.readdir(fullPath);

path.join (unlike path.resolve) does not treat an absolute argument as a reset point; it concatenates the segments and normalizes the result. Combined with the fs.ensureDir call added in #8909, an absolute repositoryPath is silently rewritten to live under process.cwd(), and the resulting empty directory is created on disk. No error is thrown.

> path.join('/cube/conf', '/data/model')
'/cube/conf/data/model'      // what cube actually uses
> path.resolve('/cube/conf', '/data/model')
'/data/model'                // what most users expect

Before #8909 (Nov 2024), this same configuration would have thrown an error reading: Model files not found. Please make sure the "/data/model" directory exists and contains model files. That at least told the user something was wrong. Now it fails completely silently.
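The mechanism can be reproduced without Cube at all. The following is a minimal sketch, assuming a POSIX filesystem: the sandbox directory names are hypothetical stand-ins for /cube/conf and /data/model, and Node's built-in mkdirSync with recursive: true stands in for fs-extra's ensureDir.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-ins: `cwd` plays the role of the image's WORKDIR,
// `schemaRoot` plays the role of an absolute CUBEJS_SCHEMA_PATH.
const sandbox = fs.mkdtempSync(path.join(os.tmpdir(), "schema-path-"));
const cwd = path.join(sandbox, "conf");
const schemaRoot = path.join(sandbox, "data", "model");

fs.mkdirSync(cwd, { recursive: true });
fs.mkdirSync(schemaRoot, { recursive: true });
fs.writeFileSync(path.join(schemaRoot, "orders.yml"), "cubes: []\n");

// Same shape as localPath(): the absolute path is appended to cwd.
const joined = path.join(cwd, schemaRoot);

// Stand-in for fs.ensureDir: silently creates the wrong (empty) directory.
fs.mkdirSync(joined, { recursive: true });

const filesViaJoin = fs.readdirSync(joined);
const filesViaResolve = fs.readdirSync(path.resolve(cwd, schemaRoot));

console.log(joined);          // nested under cwd, not the real schema root
console.log(filesViaJoin);    // empty: nothing was ever written here
console.log(filesViaResolve); // contains orders.yml
```

The read through the joined path succeeds and returns nothing, which is why no error ever reaches the logs.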

To Reproduce

Minimal repro with the official image — no cluster, no orchestrator, no real DB needed:

mkdir -p /tmp/repro/data/model
cat > /tmp/repro/data/model/orders.yml <<'YML'
cubes:
  - name: orders
    sql: "SELECT 1 AS id"
    measures:
      - { name: count, type: count }
    dimensions:
      - { name: id, sql: id, type: number, primary_key: true }
YML

docker run --rm -p 4000:4000 \
  -v /tmp/repro/data:/data \
  -e CUBEJS_DEV_MODE=true \
  -e CUBEJS_DB_TYPE=postgres \
  -e CUBEJS_API_SECRET=dev \
  -e CUBEJS_SCHEMA_PATH=/data/model \
  cubejs/cube:v1.6.39

In another terminal:

$ curl -s http://localhost:4000/cubejs-api/v1/meta
{"cubes":[]}

$ docker exec <container> ls /cube/conf/data/model
# (empty — created by fs.ensureDir, NOT the directory we mounted)

$ docker exec <container> ls /data/model
orders.yml                  # the file is right here, just never read

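Swap CUBEJS_SCHEMA_PATH=/data/model for a relative path that resolves to the mounted files under /cube/conf (for example, mount with -v /tmp/repro/data:/cube/conf/data and set CUBEJS_SCHEMA_PATH=data/model) and the cube loads correctly.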

Expected behavior

Either of:

  1. Absolute paths in CUBEJS_SCHEMA_PATH / schemaPath are honoured. The one-line fix is to use path.resolve instead of path.join in FileRepository#localPath. This is backwards compatible: process.cwd() is always absolute, so path.resolve(cwd, relative) points at the same directory as path.join(cwd, relative) for relative inputs.
  2. If absolute paths are intentionally unsupported, throw a clear configuration error at startup, and update the docs for schema_path / CUBEJS_SCHEMA_PATH to say so. Today the docs are silent on path resolution semantics.
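Both options can be sketched in a few lines. This is an illustration only, not a patch against FileRepository: the function names resolveSchemaRoot and requireRelativeSchemaPath and the error wording are hypothetical, and path.posix is used so the output matches the Linux container regardless of where the snippet runs.

```typescript
import * as path from "node:path";

// Option 1 (hypothetical helper): honour absolute paths.
// path.resolve restarts at an absolute argument and otherwise
// resolves the relative path against cwd.
function resolveSchemaRoot(cwd: string, repositoryPath: string): string {
  return path.posix.resolve(cwd, repositoryPath);
}

// Option 2 (hypothetical helper): reject absolute paths up front with an
// explicit configuration error instead of silently rewriting them.
function requireRelativeSchemaPath(repositoryPath: string): string {
  if (path.posix.isAbsolute(repositoryPath)) {
    throw new Error(
      `schemaPath must be relative to the working directory, got "${repositoryPath}"`
    );
  }
  return repositoryPath;
}

console.log(resolveSchemaRoot("/cube/conf", "/data/model")); // /data/model
console.log(resolveSchemaRoot("/cube/conf", "model"));       // /cube/conf/model
```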

Independently, FileRepository#getFiles should not call fs.ensureDir on the configured schema root when reading. Materialising an empty schema directory on disk, only to read it back as empty, is what makes the current behaviour so confusing. A missing schema directory that the user explicitly configured is almost always a bug worth surfacing rather than papering over.
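A read path that never creates the directory might look like the sketch below. It is not Cube's actual getFiles: listModelFiles is a hypothetical helper built on Node's fs.promises, and the error text reuses the pre-#8909 message quoted earlier.

```typescript
import { promises as fs, Dirent } from "node:fs";
import * as path from "node:path";

// Hypothetical helper: recursively lists files under schemaRoot without
// creating anything, and fails loudly if a directory is missing.
async function listModelFiles(schemaRoot: string, dir = ""): Promise<string[]> {
  const fullPath = path.join(schemaRoot, dir);

  let entries: Dirent[];
  try {
    entries = await fs.readdir(fullPath, { withFileTypes: true });
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") {
      throw new Error(
        `Model files not found. Please make sure the "${fullPath}" directory exists and contains model files.`
      );
    }
    throw err; // permissions and other I/O errors pass through unchanged
  }

  const files: string[] = [];
  for (const entry of entries) {
    const relative = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...(await listModelFiles(schemaRoot, relative)));
    } else {
      files.push(relative);
    }
  }
  return files;
}
```

With this shape, a misconfigured schema root produces an error naming the path that was actually read, rather than an empty /meta response.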

Version:

cubejs/cube:v1.6.39 (and same code on master as of writing).

The relevant path.join has been there since the file was introduced; the fs.ensureDir that converts the failure mode from "loud error" to "silent empty" was added in #8909 (merged Nov 2024, first shipped in v1.1.3).

Additional context

Real-world surface for this: anyone mounting their data model from a Kubernetes/Docker volume that doesn't happen to land under the image's WORKDIR=/cube/conf. In our case it's a git-sync sidecar mounting a model repo at /git/current/model/<project>; the failure mode (empty /meta with zero log output) cost a couple of hours to diagnose.

Happy to send a PR for the path.resolve change if it'd be welcome.
