ENG-3566: Refactor engine creation to use SQLAlchemy creator pattern #8148
Switch all 6 database engines from URI-based connection creation to the creator pattern. The creator callable is invoked by the pool on every new connection, resolving credentials at connect time rather than engine construction time. This is the structural seam that enables dynamic credential rotation via AWS Secrets Manager in follow-up work.
Changes:
- Add raw_password/raw_readonly_password properties to DatabaseSettings for unescaped passwords (needed by psycopg2/asyncpg direct connect)
- New engine_creators.py with make_sync_creator/make_async_creator factories and credential helpers
- Add creator parameter to get_db_engine() in session.py
- Switch ctl_session.py async/sync engines to creator pattern
- Switch session_management.py sync engines to creator pattern
- Switch tasks/__init__.py task engine to creator pattern
- Update design doc to remove lazy factory requirement
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
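As a rough illustration of the idea in this commit message, the sketch below shows a creator closure that captures static connection options once and looks up credentials on every call. It is not the code from engine_creators.py: `CREDENTIALS`, `get_credentials`, `make_creator`, and `fake_connect` are hypothetical names, and `fake_connect` stands in for a real driver call such as `psycopg2.connect`.

```python
from typing import Callable, Dict

# Hypothetical stand-in for static config or a future secret provider.
CREDENTIALS: Dict[str, str] = {"user": "app", "password": "first-secret"}


def get_credentials() -> Dict[str, str]:
    """Return whatever credentials are current at call time."""
    return dict(CREDENTIALS)


def make_creator(
    connect: Callable[..., object], **static_kwargs: object
) -> Callable[[], object]:
    """Build a zero-argument creator.

    Static options (host, keepalives, ...) are captured once in the closure;
    credentials are resolved again on every invocation.
    """

    def creator() -> object:
        creds = get_credentials()
        return connect(
            user=creds["user"], password=creds["password"], **static_kwargs
        )

    return creator


def fake_connect(**kwargs: object) -> Dict[str, object]:
    """Stand-in for a driver connect function; it only records its arguments."""
    return kwargs


creator = make_creator(fake_connect, host="db.internal", keepalives=1)
first = creator()
CREDENTIALS["password"] = "rotated-secret"  # simulate a credential rotation
second = creator()

# With SQLAlchemy, a callable like this would be handed to the engine, e.g.
#   create_engine("postgresql+psycopg2://", creator=creator)
# so the pool calls it whenever it needs a new connection.
```

Because the closure holds a reference to the lookup function rather than to the password itself, the second call picks up the rotated value without rebuilding anything.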
Codecov Report
❌ Your patch check has failed because the patch coverage (93.68%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## main #8148 +/- ##
==========================================
+ Coverage 85.40% 85.45% +0.04%
==========================================
Files 649 650 +1
Lines 42283 42363 +80
Branches 4960 4971 +11
==========================================
+ Hits 36112 36201 +89
+ Misses 5063 5054 -9
Partials 1108 1108
25 tests covering credential helpers, raw_password round-trip, asyncpg param conversion, SSL context building, and end-to-end sync/async engine creation via creator pattern.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix: async_params.pop("ssl") when ssl_context exists, preventing
kw.update(async_params) from replacing the SSLContext with a raw string
- Add ValueError when creator and keepalives params are both passed to
get_db_engine()
- Add SSL success-path test with self-signed cert
- Add get_db_engine creator= path and keepalives error tests
- Remove unused psycopg2 import from test file
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
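The SSL fix in the first bullet can be pictured with a small sketch. It assumes a dict of asyncpg-style parameters that may still carry a raw `ssl` string alongside a prebuilt `SSLContext`; `build_connect_kwargs` is a hypothetical helper written for illustration, not the function in this PR.

```python
import ssl
from typing import Any, Dict, Optional


def build_connect_kwargs(
    async_params: Dict[str, Any], ssl_context: Optional[ssl.SSLContext]
) -> Dict[str, Any]:
    """Merge connection params without letting a raw "ssl" string
    overwrite an SSLContext that was already built."""
    params = dict(async_params)  # copy so the caller's dict is untouched
    kw: Dict[str, Any] = {}
    if ssl_context is not None:
        kw["ssl"] = ssl_context
        # Without this pop, kw.update(params) below would replace the
        # SSLContext with the raw string value.
        params.pop("ssl", None)
    kw.update(params)
    return kw


ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
with_context = build_connect_kwargs({"ssl": "require", "timeout": 10}, ctx)
without_context = build_connect_kwargs({"ssl": "require"}, None)
```

When no context exists, the raw value passes through unchanged, so only the conflicting case is affected.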
…assword Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adamsachs
left a comment
nice work, this looks good to me! nice job carving out a clean abstraction/hook for this.
a couple of nits and then my claude review does have one callout that i think is worth looking into, will post that as a quick followup
SQLAlchemy engine ``creator`` callables for dynamic credential resolution.

The ``creator`` pattern passes a callable to ``create_engine`` /
``create_async_engine`` instead of a connection URI. SQLAlchemy calls the
creator every time the pool needs a new connection, so credentials are
resolved at **connection time** rather than engine construction time.

Today the credential helpers read from static config (``CONFIG.database``).
A future secret-provider integration will swap them to call
``provider.get_secret()``; the rest of the engine code stays the same.
nice documentation here.
nit: maybe worth mentioning that they must stay lightweight? (or at least i assume so 😄)
    def raw_password(self) -> str:
        """Return password unescaped for direct driver use (psycopg2/asyncpg)."""
        return unquote_plus(self.password)

    @property
    def raw_readonly_password(self) -> Optional[str]:
        """Return readonly password unescaped for direct driver use."""
        if self.readonly_password:
            return unquote_plus(self.readonly_password)
        return None
interesting, claude is calling something out here and i think it's valid? in that we should probably at least call this out as a change to how we expect encoded passwords to be handled...
will post directly from my claude review
[Claude]: escape_password runs quote_plus unconditionally, and raw_password reverses it with unquote_plus. For passwords containing literal @, #, %, etc., this round-trips correctly; the parametrized tests confirm that.
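The round trip described above can be checked with the standard library alone. This is a minimal sketch using `urllib.parse` directly rather than the project's `escape_password` / `raw_password`; the sample passwords are made up for illustration.

```python
from urllib.parse import quote_plus, unquote_plus

# Made-up raw passwords containing characters that need escaping.
samples = ["p@ss", "a#b", "100%", "with space", "plus+sign"]

# Escape each password, then reverse the escaping.
round_trips = {raw: unquote_plus(quote_plus(raw)) for raw in samples}

# "@" becomes "%40" in the escaped (stored) form.
escaped_at = quote_plus("p@ss")
```

Every entry in `round_trips` maps a raw password back to itself, which is the property the two helpers rely on.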
But the previous URI-based code path treated password as the already-escaped value to be embedded in a URI. If a deployment has historically set FIDES__DATABASE__PASSWORD=foo%40bar (i.e., the user manually URL-encoded @ as %40), the old URI path used foo%40bar in the URI, which psycopg2/asyncpg decode to foo@bar at parse time — correct.
With the new creator path, that same value flows through:
- escape_password runs quote_plus("foo%40bar") → "foo%2540bar" (the % itself gets escaped).
- raw_password runs unquote_plus("foo%2540bar") → "foo%40bar" (the literal string).
- psycopg2.connect(password="foo%40bar") sends foo%40bar as the password, not foo@bar.
This is a silent behavior change for any user who was already pre-encoding their password in env vars or config files. It will only manifest as auth failures in production.
Options:
- Document this clearly as a breaking change for pre-encoded passwords, OR
- Make escape_password idempotent (e.g., detect already-encoded values), OR
- Add a release note + migration step.
At minimum, add a test that documents the round-trip behavior for a password literally containing %XX sequences so the contract is explicit.
There was a problem hiding this comment.
This isn't a behavior change. Both the old URI path and the new creator path produce the same result for all cases, including the pre-encoded foo%40bar scenario.
escape_password has always run quote_plus unconditionally on the raw config value. PostgresDsn.build() does not re-encode, so the old URI path was:
- escape_password("foo%40bar") → "foo%2540bar" (stored)
- PostgresDsn.build(password="foo%2540bar") → embeds as-is in URI
- psycopg2 parses URI, URL-decodes → "foo%40bar" sent to Postgres
The new creator path:
- Same escape_password → "foo%2540bar" (stored)
- raw_password → unquote_plus("foo%2540bar") → "foo%40bar"
- psycopg2.connect(password="foo%40bar") → "foo%40bar" sent to Postgres
Same result in both paths. The pre-encoding case was already "broken" (or rather: passwords are expected to be raw, not pre-encoded). Will add a test documenting this contract.
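The two walk-throughs above can be reproduced with `urllib.parse`. This is a simulation under stated assumptions, not the project's code: the URI is built with an f-string instead of PostgresDsn.build, the driver's URI parsing is approximated by `urlsplit` plus a single `unquote`, and the user, host, and database names are placeholders.

```python
from urllib.parse import quote_plus, unquote, unquote_plus, urlsplit

configured = "foo%40bar"  # pre-encoded value as set in the environment
stored = quote_plus(configured)  # what unconditional escaping stores

# Old path (simulated): the stored value is embedded as-is in a URI and the
# driver percent-decodes it once while parsing.
uri = f"postgresql://user:{stored}@localhost:5432/app"
old_path_password = unquote(urlsplit(uri).password or "")

# New path: the stored value is unescaped and passed directly to connect().
new_path_password = unquote_plus(stored)
```

Both variables end up holding the literal string foo%40bar, matching the claim that a pre-encoded password was already treated as a literal before this change.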
…t, mutual exclusion guard
- Add note to engine_creators.py docstring that creators must stay lightweight
- Add test documenting that pre-encoded passwords are treated as literals
- Add ValueError when creator is passed with database_uri or config
- Add tests for creator + database_uri/config mutual exclusion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
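A mutual exclusion guard of the kind listed in this commit might look roughly like the sketch below. The function name `check_engine_args`, its parameters, and the error message are assumptions made for illustration; the real get_db_engine() signature in session.py is not shown in this conversation.

```python
from typing import Any, Callable, Dict, Optional


def check_engine_args(
    *,
    database_uri: Optional[str] = None,
    config: Optional[Any] = None,
    creator: Optional[Callable[[], Any]] = None,
    keepalives: Optional[Dict[str, int]] = None,
) -> str:
    """Hypothetical validation: a creator owns the whole connection setup,
    so it cannot be combined with URI, config, or keepalive arguments."""
    if creator is not None:
        conflicting = [
            name
            for name, value in (
                ("database_uri", database_uri),
                ("config", config),
                ("keepalives", keepalives),
            )
            if value is not None
        ]
        if conflicting:
            raise ValueError(
                "creator cannot be combined with: " + ", ".join(conflicting)
            )
        return "creator"
    return "uri"


mode = check_engine_args(creator=lambda: None)

try:
    check_engine_args(creator=lambda: None, database_uri="postgresql://example")
    conflict_rejected = False
except ValueError:
    conflict_rejected = True
```

Failing fast here avoids a silent case where keepalive or URI settings are passed but never applied because the creator builds the connection itself.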
Part of ENG-3566
Description Of Changes
Refactors all 6 database engines to use SQLAlchemy's creator pattern instead of baking credentials into connection URIs. The creator callable is invoked by the pool on every new connection, resolving credentials at connect time rather than engine construction time. This is the structural seam required for dynamic credential rotation via AWS Secrets Manager (design doc: #8016).
Today the creators read from static config (CONFIG.database). In follow-up work, the credential helpers will be swapped to call provider.get_secret(); no further engine changes are needed.
Also updates the design doc to remove the lazy factory requirement, since the creator closure captures a provider reference (not credentials), making it independent of when the engine is constructed.
Code Changes
- database_settings.py: Add raw_password/raw_readonly_password properties that reverse quote_plus escaping for direct psycopg2.connect()/asyncpg.connect() use
- engine_creators.py (new): make_sync_creator/make_async_creator factories that capture per-engine config (keepalives, SSL) in closures while resolving credentials per-connection. Includes credential helpers and asyncpg param/SSL conversion
- session.py: Add optional creator parameter to get_db_engine(). When provided, uses dialect-only URL and skips connect_args (creator handles them)
- ctl_session.py: Switch all 3 engines (async, readonly async, sync) from URI-based to creator-based. SSL context and asyncpg param handling moved into engine_creators.py
- session_management.py: Switch API and readonly sync engines to use make_sync_creator with keepalive connect_args
- tasks/__init__.py: Switch Celery task engine to use make_sync_creator with keepalive connect_args
- dynamic-database-credentials.md: Remove lazy factory requirement from design doc
Steps to Confirm
- GET /health/database and confirm all pools are healthy
- … AsyncAdapt_asyncpg_connection)
Pre-Merge Checklist
- CHANGELOG.md updated