Conversation
a85750a to
6ad2bb3
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new SQLite (“TideBase”) build target plus a draft schema/spec and validation tests to ensure the SQLite output preserves the existing station dataset.
Changes:
- Introduces a SQLite build script that generates a
.tidebasedatabase from existing station JSON data. - Adds a TideBase schema/spec plus example SQL queries.
- Adds Vitest-based validation tests and a CI job to build + validate the SQLite artifact.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/sqlite/test/sqlite.test.ts | Adds database validation and example-query execution tests. |
| packages/sqlite/spec.md | Defines the TideBase SQLite specification (draft). |
| packages/sqlite/schema.sql | Adds the SQLite schema used by the build. |
| packages/sqlite/package.json | Defines sqlite workspace scripts/deps for building and testing. |
| packages/sqlite/examples/subordinate-stations.sql | Adds an example query for subordinate stations. |
| packages/sqlite/examples/station-datums.sql | Adds an example query for station datums. |
| packages/sqlite/examples/station-constituents.sql | Adds an example query for station constituents. |
| packages/sqlite/examples/prediction-data.sql | Adds an example query returning prediction inputs for a year. |
| packages/sqlite/examples/find-nearest.sql | Adds an example “nearest stations” query. |
| packages/sqlite/examples/find-commercial.sql | Adds example queries for commercially-usable licenses. |
| packages/sqlite/examples/find-by-id.sql | Adds an example station lookup query by station_id. |
| packages/sqlite/examples/find-by-country.sql | Adds example queries filtering/counting by country/continent. |
| packages/sqlite/examples/find-by-bounding-box.sql | Adds an example bounding-box spatial query. |
| packages/sqlite/build.ts | Implements the SQLite database generator (stations, constituents, offsets, datums, astro tables). |
| packages/sqlite/build | Adds a shell wrapper to create dist/ and run the build. |
| packages/sqlite/README.md | Documents TideBase usage and contribution/testing steps. |
| .github/workflows/ci.yml | Adds a CI job to build and test the sqlite package. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| let db: DatabaseSync; | ||
|
|
||
| beforeAll(() => { |
There was a problem hiding this comment.
Opening the SQLite file in beforeAll will throw if the file is missing/corrupt, which prevents the earlier “Database file exists and has reasonable size” assertion from running and producing a more actionable failure. Consider moving the DB open into a hook that runs after a successful existence/size check (or do the existence check inside the hook before instantiating DatabaseSync).
| beforeAll(() => { | |
| beforeAll(() => { | |
| // Ensure the database file exists and is reasonably sized before opening. | |
| expect(existsSync(dbPath)).toBe(true); | |
| const stats = statSync(dbPath); | |
| expect(stats.size).toBeGreaterThan(1_000_000); // At least 1MB |
| let db: DatabaseSync; | ||
|
|
||
| beforeAll(() => { | ||
| db = new DatabaseSync(dbPath, { readOnly: true }); | ||
| }); |
There was a problem hiding this comment.
The DatabaseSync handle is never closed, which can keep file descriptors open and cause Vitest to report open handles in some environments. Add an afterAll(() => db.close()) (or equivalent) once all tests complete.
| }, | ||
| "devDependencies": { | ||
| "@neaps/tide-database": "file:../..", | ||
| "@neaps/tide-predictor": "*" |
There was a problem hiding this comment.
Using "*" for @neaps/tide-predictor makes installs non-deterministic and can break the build/tests when new versions are published. Prefer a pinned semver range or (if this is a monorepo/workspace dependency) workspace:*.
| "@neaps/tide-predictor": "*" | |
| "@neaps/tide-predictor": "workspace:*" |
| -- Find all stations in a country | ||
| SELECT s.station_id, s.name, s.type, s.latitude, s.longitude | ||
| FROM stations s | ||
| WHERE s.country = 'United States' |
There was a problem hiding this comment.
The TideBase spec defines stations.country as an ISO 3166-1 alpha-2 country code, but this example filters by a full country name (“United States”), which is inconsistent and likely to yield no results if the DB follows the spec. Update the example to use the expected code (e.g. US) or adjust the spec/data definition to match the intended stored values.
| WHERE s.country = 'United States' | |
| WHERE s.country = 'US' |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
Reading review of TideBase Specification Version 1.0 (Draft), tide-database/packages/sqlite/spec.md General comment 1: A lot of fields are missing constraints; e.g., ensuring that amplitudes are positive. Some are constrained only in the descriptive text. General comment 2: The spec seems to be of two minds regarding whether the database "contains everything needed for tide prediction" as is stated in the abstract. Stations may require constituent definitions and prediction models that are not in the database. metadata mandatory keys station_count and constituent_count: Why should queryable counts be in metadata, let alone as mandatory metadata? constituents: "SHOULD contain all constituents referenced by any station in the database" see general comment 2. Different prediction models can yield significantly different results for what is nominally the same constituent. It's unclear whether or how the database supports different prediction models. Do you create different variants of the "same" constituent with different names as happened with TCD, or does the station record specify which prediction model applies to the harmonic constants for that station? sources.url: Mandatory URL precludes the use of old data that do not have a presence on the web today. stations.type: A point of terminological confusion that infected XTide is being repeated. Referring to https://tidesandcurrents.noaa.gov/glossary, a subordinate station can be two different things: (1) a station from which a relatively short series of observations is reduced by comparison with simultaneous observations from a station with a relatively long series of observations, or (2) a station from which predictions are to be obtained by means of differences and ratios applied to the full predictions at a reference station. Type (1) subordinate stations actually have harmonic constants; thus, it is not true that harmonic constants imply reference station as XTide documentation assumed. If only a single prediction model is in use (which is unclear, see general comment 2), the types that need to be distinguished are harmonic vs. offsets and tide vs. current. stations.continent and region: Avoid unconstrained text if a standard enumeration can be included by reference. Geography/cartography should be delegated as much as is practical. stations.license: Specifying SPDX (not optional) excludes data sources that do not choose from that list and data that have mixed licenses. stations.commercial_use: Redundant with license and a false dichotomy. The conditions can be arbitrarily complex. equilibrium_arguments and node_factors "MAY be absent for constituents that lack a computable astronomical formula" see general comment 2. equilibrium_arguments.value: Text says (0, 360) should be [0, 360). Tide prediction: Need to be clear about dimensional quantities vs. the numerical values of quantities expressed in specified units. Future considerations: If you expect to handle currents someday, it is worth considering the requirements now. Supporting major and minor axes for currents requires a different primary key for harmonic constants. (Harmbase2 and TCD also have this problem.) |
This adds a new SQLite build, along with a proposed spec for the sqlite schema called "TideBase". A single
.tidebasefile contains everything needed for tide prediction:WHEREon lat/lon columns)README | Spec
Fixes #18