Add fix script for rewriting STA ScheduledStopPoint IDs by leonardehrenfried · Pull Request #114 · MMTIS/badger

leonardehrenfried · 2026-05-20T11:09:50Z

This script rewrites some IDs in STA's EPIP feed.

skinkie · 2026-05-20T13:24:41Z



-def _update_refs(obj: Any, id_map: dict[str, str]) -> bool:
+def _update_refs(obj: Any) -> bool:


Is your intention here to iterate over all references?

Yes, everything that can have a reference to the ScheduledStopPoint.

I agree that it's super generic and handles lots of cases that I don't have in my data set.

I think you can omit most of the code by using def only_references(deserialized: Tid, serializer: Serializer) -> Generator[tuple[type[EntityStructure], str, str], None, None]:, which does the recursive traversal.
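
To illustrate the idea of letting one generator do the recursion, here is a minimal sketch on plain dataclasses. It is not badger's only_references and does not use its Tid or Serializer types; Ref, Call, Journey, iter_refs and update_refs are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Iterator


@dataclass
class Ref:
    """Hypothetical stand-in for a NeTEx *Ref element."""
    ref: str
    name_of_ref_class: str


@dataclass
class Call:
    scheduled_stop_point_ref: Ref


@dataclass
class Journey:
    id: str
    calls: list[Call] = field(default_factory=list)


def iter_refs(obj: Any) -> Iterator[Ref]:
    """Yield every Ref reachable from obj, depth first."""
    if isinstance(obj, Ref):
        yield obj
    elif is_dataclass(obj) and not isinstance(obj, type):
        # Recurse into every field of a dataclass instance.
        for f in fields(obj):
            yield from iter_refs(getattr(obj, f.name))
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            yield from iter_refs(item)


def update_refs(obj: Any, id_map: dict[str, str]) -> bool:
    """Rewrite ScheduledStopPoint references in place; report whether anything changed."""
    changed = False
    for ref in iter_refs(obj):
        if ref.name_of_ref_class == "ScheduledStopPoint" and ref.ref in id_map:
            ref.ref = id_map[ref.ref]
            changed = True
    return changed
```

With the traversal isolated in the generator, the fix itself shrinks to the small loop in update_refs.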

skinkie · 2026-05-20T15:48:26Z

Can you add to this pull request also a test for loading and fixing a file?

leonardehrenfried · 2026-05-20T16:02:18Z

The current feed this operates on is 500 MB. Do you know of a tool for shrinking NeTEx feeds down to a single journey?

skinkie · 2026-05-20T16:11:32Z

> The current feed this operates on is 500 MB. Do you know of a tool for shrinking NeTEx feeds down to a single journey?

See conv.filter_db_to_db :)

leonardehrenfried · 2026-05-21T05:15:14Z

Actually, if the code is good, I would prefer to merge this now. I will send a follow-up with a test.

skinkie · 2026-05-21T06:54:55Z

Nope, I want to see how it behaves, and prevent regressions.

leonardehrenfried · 2026-05-21T07:02:34Z

Can you point towards an example that I should emulate?

All I can see are tests that appear to be reading from places like /mnt/storage/compressed/ret-epip.lmdb and then never assert anything.

skinkie · 2026-05-21T07:32:31Z

I want the code to be running. At this point I don't care about assertions; I care that the code path is touched. So a small subset of 10 stops in a file, loaded into mdbx, fixed, and exported to XML would be good enough.

leonardehrenfried · 2026-05-21T11:16:20Z

I used the following filter to get a single ServiceJourney:

uv run python -m conv.filter_db_to_db sta.lmdb ServiceJourney it:apb:ServiceJourney:86345-Pizzin-33-1-41880:345D: sta-reduced.lmdb

It produced this: https://p.ip.fi/zbul

Should I commit this to the repo?

leonardehrenfried · 2026-06-08T10:47:47Z

@skinkie Can you look at the test?

skinkie · 2026-06-25T13:34:06Z

+def fix_ssp_ids(database: Path) -> None:
+    with MdbxStorage(database, readonly=False) as db:
+        with db.env.rw_transaction() as txn:
+            # TODO: delete the old ScheduledStopPoint objects (no delete API available yet)


For deleting, the following steps must be carried out in this order:

1. The id of the object itself must be renamed.
2. All internal references must be updated, hence at least ScheduledStopPointRef, TimingPointRef nameOfRefClass="ScheduledStopPoint", and ObjectRef (NoticeAssignment); rewriting should cause the referencing to be updated.
3. The old relationship between objects must be deleted.
4. The key with the old object must be deleted.
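
As a rough illustration of why this ordering matters, the four steps can be sketched against two plain dictionaries standing in for the object store and the reference index. This is not the MdbxStorage API; rename_in_place, objects and referenced_by are hypothetical names, and the sketch ignores transactions and self-references.

```python
def rename_in_place(
    objects: dict[str, dict],
    referenced_by: dict[str, set[str]],
    old_id: str,
    new_id: str,
) -> None:
    """Rename one object and repair everything that points at it."""
    if old_id == new_id:
        return

    # Step 1: rename the id of the object itself and store it under the new key.
    obj = objects[old_id]
    obj["id"] = new_id
    objects[new_id] = obj

    # Step 2: update every object that references the old id.
    referrers = referenced_by.get(old_id, set())
    for referrer_id in referrers:
        referrer = objects[referrer_id]
        referrer["refs"] = [new_id if r == old_id else r for r in referrer["refs"]]
    referenced_by[new_id] = set(referrers)

    # Step 3: delete the old relationship.
    referenced_by.pop(old_id, None)

    # Step 4: delete the key that still holds the old object.
    del objects[old_id]
```

Deleting the old key before step 2 would leave referrers pointing at an id that no longer resolves, which is one reason the in-place route is considered invasive.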

We have avoided such operations: we fill a new database with the content rather than attempt such invasive operations in place.

So you never really delete, but simply filter them out when copying to a new database?

There are two facets here. We have always transferred from database to database when doing any transformation, so from NeTEx to NeTEx. The (inline) fix operations work well at attribute level, such as projecting all coordinates from a national grid to WGS84.

What you are doing here would match something like the EPIP conversion. Do all the transformations, write the output into the second database, and copy_map everything that remains stable. https://github.com/MMTIS/badger/blob/binary_relation_serializer/conv/epip_db_to_db.py#L181

The effect is that anything related to referential relationships is never updated, only created.

So in effect, the code to achieve this is virtually the same, but the source is copied and transformed, then written to the target.

The second facet is that we have always overwritten the key. That no longer holds when the id is changed, because the key changes with it.
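
A minimal sketch of the database-to-database approach, again with plain dictionaries instead of the real storage layer. It is not the code in conv/epip_db_to_db.py, and transform_db_to_db is a hypothetical name; objects are assumed to carry an "id" and a list of "refs".

```python
def transform_db_to_db(source: dict[str, dict], id_map: dict[str, str]) -> dict[str, dict]:
    """Build a new target store; the source is never modified."""
    target: dict[str, dict] = {}
    for key, obj in source.items():
        new_id = id_map.get(key, key)
        new_refs = [id_map.get(r, r) for r in obj["refs"]]
        if new_id == key and new_refs == obj["refs"]:
            # Stable object: copy it over unchanged.
            target[key] = obj
        else:
            # Transformed object: written under its (possibly new) key.
            target[new_id] = {**obj, "id": new_id, "refs": new_refs}
    return target
```

Because the target starts empty, the old key is simply never written, so no delete operation and no relationship cleanup is needed.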

leonardehrenfried added 2 commits May 20, 2026 13:09

Add fix script for rewriting STA ScheduledStopPoint IDs

5ca31c5

Use generators everywhere

81bc47a

leonardehrenfried mentioned this pull request May 20, 2026

Resolve Mentz line versions, fix South Tyrolian SSP ids #113

Draft

skinkie reviewed May 20, 2026


Use recursive_attributes

fbf72e2

leonardehrenfried force-pushed the sta-ssp-id branch from e874a07 to fbf72e2 Compare May 20, 2026 13:45

skinkie reviewed May 20, 2026


Comment thread fix/rewrite_sta_ssp_ids.py Outdated

Inline method

070c3b4

Add test script

2eda928

skinkie reviewed Jun 25, 2026



