# [Proposal] Add a delegate method that handles db conflict for records being inserted from the cloud #360
## Conversation
Hi @hishnash, thanks for sharing this proposal. I've also been looking into customization of upserts recently, though from a slightly different angle. In my case, I'm after customizable conflict resolution rather than schema changes (see #272). I did consider hooking into the delegate as well, but decided against it in the end, as I found it's better expressed via a more tailored API. That said, I wanted to comment on your scenario, because I think that even with this hook you would still run into issues around cross-version compatibility. While newer versions of the app could be made to interpret both the old and the new schema variants, older versions would continue to assume the legacy shape. This can easily lead to errors (e.g. if the original enum-based […]

I'm afraid that, in a distributed sync environment, you'll generally need some form of overlap where both representations are handled if older app versions are expected to keep working alongside newer ones. There may be better ways of structuring this, but I don't think removing fields is an appropriate measure here, which also aligns with SQLiteData's guidance around disallowed migrations.
Hi @hishnash, for this kind of proposal it would help to see a concrete example of how it could be used.
Unfortunately the reality is that DB schemas must be backwards compatible, and there really is no way around it. It's the only way to allow old and new schemas to communicate with each other. The delegate action you are proposing may work for new schemas interpreting data from old schemas, but it does nothing to help old schemas interpret data from new schemas. And it can't possibly, because apps with the old schema were shipped long before we knew the demands of the new schema. This does mean that over time your schema may become less than ideal, because you are forced to make compromises to keep backwards compatibility. Sometimes that can be mitigated with various techniques, but it is also just the reality of having a schema defined on device as well as remotely on iCloud.
I mentioned this in our conversation over Slack, but this is not how SQLiteData is supposed to work, and if you are seeing this behavior it would be a bug. We have test coverage on this exact scenario, and I just added more, and so I feel somewhat confident that it works. Each version of the app keeps around a full […]

But if you definitely saw this data loss then we will need a lot more info on how to reproduce it. There could definitely be a bug in our code, and we would rather fix that than consider all new ways to provide compatibility between different schemas.
I did respond in the Slack channel but will post it here as well. I wonder if our issue stems from the fact that we originally created some of our records as fixtures within DB migrations (thus outside the runtime of the sync engine), so the sync engine never created a […]

As to providing backwards compatibility for users that have 2 devices, where one is on the new version and the other is on the old version: we do not intend to support that, and have already triggered prompts on users' devices telling them they must update the device that is on the older version. As in the example above, if one were to keep the […]

I would love it if it were possible to set up some SQLite triggers that are scoped to only run for sync insert/update/deletes, but I don't think SQLite supports scoping triggers that way, so any other updates that happen in the background at the same time as the sync is running would end up triggering them. Also I am a little concerned that such triggers might make the mentioned […]
Yes this would not be a good thing to do in general. You should not make any writes to the database before the sync engine is setup.
That is an example you provided yourself in the original description of this PR, so I assumed it was relevant to you. Also, is it really the case that if you release the app twice just a few days apart you are going to force people who download the first app to upgrade to the newest one? I can only imagine that is going to be a very frustrating experience to force all users to be on the absolute newest version of the app.
Yep, that is a good solution, and exactly what we suggest. And you can distinguish writes to the database from your app versus writes made by the sync engine using […]
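As an aside, the idea of scoping a trigger to local writes only can be illustrated in plain SQLite. The sketch below uses Python's `sqlite3` module purely for demonstration; the `_sync_state` flag table, schema, and names are hypothetical and are not SQLiteData's actual mechanism:

```python
import sqlite3

# Hypothetical sketch: scope a trigger to "app writes" only by checking a
# flag table in the trigger's WHEN clause. A sync engine would set the flag
# before applying remote changes and clear it afterwards.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE record(id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE change_log(record_id INTEGER);
    CREATE TABLE _sync_state(is_syncing INTEGER NOT NULL);
    INSERT INTO _sync_state VALUES (0);

    -- Fires only for local writes: suppressed while is_syncing = 1.
    CREATE TRIGGER log_local_updates AFTER UPDATE ON record
    WHEN (SELECT is_syncing FROM _sync_state) = 0
    BEGIN
        INSERT INTO change_log VALUES (NEW.id);
    END;
""")

db.execute("INSERT INTO record VALUES (1, 'a')")
db.execute("UPDATE record SET value = 'b' WHERE id = 1")   # local write: logged

db.execute("UPDATE _sync_state SET is_syncing = 1")        # "sync engine" write
db.execute("UPDATE record SET value = 'c' WHERE id = 1")   # not logged
db.execute("UPDATE _sync_state SET is_syncing = 0")

print(db.execute("SELECT COUNT(*) FROM change_log").fetchone()[0])  # → 1
```

The caveat raised above still applies: nothing stops an unrelated background write from racing with the flag, which is why a library-level distinction is preferable to a hand-rolled one.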
So we can't do data migrations (or fixture population) within a traditional DB migration flow? We would need to create some other follow-up hook that runs as a side effect of having just run a schema migration. E.g. if schema migration 0001 just ran, then after the sync engine starts, run schema migration 0001's related data migration. Or would there be a way we could just track which records have been updated/inserted/deleted during the data migration and then tap the sync engine on the shoulder when it starts up (in effect calling the same methods in the sync engine that the db triggers would have called had the changes happened while it was running)?

The reason I mention multiple versions is that when the data migration runs (whenever that is) we can't be certain all changes have been synced up from other devices, let alone synced down to the user's device that has the new version. So the user on the old device might well get a prompt telling them to upgrade (this only happens if there is a schema change, of course, so not every update), but while that prompt is on screen that user's device might well still have pending changes it has not yet streamed up to iCloud. So even though the schema has effectively changed, updates (and inserts) with the old schema will still be in flight on the wire.

I am a little concerned that, as the app evolves, if we depend on all migrations using DB triggers we are going to end up with a very messy schema that is very difficult to reason about.
Any changes made to your database before the […]

This is just the complication of having each device be in control of the schema of the database, and so you have to be careful with these things. You can't just approach the problems exactly as you would in a more traditional database setup prior to synchronization.
You can always keep track of the rows you inserted into the DB during the data migration and then, after the sync engine is set up, do the following:

```swift
let idsCreatedDuringMigration = […]
try await database.write { db in
  try Record
    .find(idsCreatedDuringMigration)
    .update { $0.id = $0.id }
    .execute(db)
}
```

That will cause the sync engine to see the records change (even though nothing really changed), and will enqueue changes into […]
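The reason this "touch" update works can be demonstrated in plain SQLite: UPDATE triggers fire for every row the statement touches, even when the new values equal the old ones. The sketch below uses Python's `sqlite3` for illustration; the schema and trigger are hypothetical stand-ins, not SQLiteData's internals:

```python
import sqlite3

# Illustrative schema: a change-tracking trigger like the ones a sync engine
# might install. Rows inserted before the trigger existed were never tracked.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE record(id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE pending_changes(record_id INTEGER);
    CREATE TRIGGER track AFTER UPDATE ON record
    BEGIN
        INSERT INTO pending_changes VALUES (NEW.id);
    END;
""")

# Rows created "before the sync engine was running": the INSERTs fired no trigger.
db.executemany("INSERT INTO record VALUES (?, ?)", [(1, "a"), (2, "b")])

ids_created_during_migration = [1, 2]
db.executemany(
    "UPDATE record SET id = id WHERE id = ?",   # no-op "touch" update
    [(i,) for i in ids_created_during_migration],
)

print(db.execute("SELECT COUNT(*) FROM pending_changes").fetchone()[0])  # → 2
```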
What do you mean "all migrations to use DB trigger"? You should be designing your schema carefully enough so that these kinds of problems are rare. This should not be happening all of the time. And again, this is just the complication of having each device be in control of the schema. The same is true of SwiftData, and honestly the problem is much more pernicious over there.

And to go back to the original purpose of this PR, you still need to explain exactly how this change will be useful. As far as I can tell, it does not help when older devices receive data from newer schemas. Are you sure this actually solves anything? Can you show a concrete example of how it would be used?
Correct, however I don't think that is something one can deal with anyway, at least not if you want to let that older version then make subsequent edits. The issue is that the older version does not know about the new schema, nor does it know about the constraints that might exist upon it in logic, let alone db constraints.

## What I have done in the past

In projects in the past where I have dealt with the ability for database schemas to diverge (due to child db shards being on-prem at remote off-grid locations that later need to sync in changes when they come online), the most robust solution that avoids conflicts has been to allow only one-directional data flow. A new version of the DB can accept changes from an old version (and apply needed data migrations before writing them into the db), but a new version of the schema cannot be accepted by an older version, as this leads to the described loops where local validation logic on the older version is unable to maintain the logical constraints that the newer version requires.

## In the app we are working on right now

In our situation we would impose this by syncing schema version information, and if a client is on an older schema version than the most recent version seen on any record, we would place the client into a read-only state with a prompt informing the user to update the app. This avoids the possibility of the older version making changes that break things, and keeps the schema clean. Our users will be very upset if data is lost or corrupted, and it is not data they can easily recreate, as it would require them to recall exactly what they did and when.

The above PR might well not be the best way to provide a data migration prior to insert.
Maybe what would be better is to have the method that creates the query instead create a dictionary of values (along with mutation data, schema version, etc.) and then pass that (if configured) to a delegate or other method that can mutate that data before the insert is evaluated.

Or create a proxy table that has all the fields (as optional) of all schema versions (so you can insert records), and then at the end of the transaction copy these to the real tables, with the option to replace that copy operation with a custom operation. (This is what we did in past projects to allow us to stream in changes from out-of-date remote db clients that were async and thus not up to date with the schema: in effect we wrote a custom insert-into-real-tables-from-proxy-table data migration.) What was very nice about this is that it was easy to test: we could just load different versions of data into the proxy table and test the SQL that did the select/upsert/delete into the real tables, and let our streaming from remote still do bulk inserts, so we could handle very large blocks of data in one go rapidly.

However, this lib currently (at least) just does upsert/delete on a per-record basis, so maybe it is simpler to go with the option where a handler method in Swift is called that mutates the values to match. (Of course this is more limited from a data migration perspective, as it is per-row, whereas a SQL-level operation can do things like merge rows when inserting into the real tables.)
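A minimal sketch of the proxy-table idea, using Python's `sqlite3` for illustration. The table and column names echo the `PostalAddress`/`AddressType` example from the proposal but are hypothetical; the point is that rows in either schema shape can land in the staging table, and a custom copy step migrates them into the real schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE address_type(id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE postal_address(
        id INTEGER PRIMARY KEY,
        address_type_id INTEGER NOT NULL REFERENCES address_type(id)
    );
    -- Proxy table: the union of all schema versions, everything optional.
    CREATE TABLE postal_address_proxy(
        id INTEGER,
        address_type TEXT,        -- old schema: enum value as text
        address_type_id INTEGER   -- new schema: foreign key
    );
""")

# Stream in one old-shape row and one new-shape row.
db.execute("INSERT INTO address_type VALUES (1, 'home')")
db.execute("INSERT INTO postal_address_proxy VALUES (10, 'work', NULL)")  # old shape
db.execute("INSERT INTO postal_address_proxy VALUES (11, NULL, 1)")       # new shape

# Custom copy step: create any missing AddressType rows, then resolve FKs.
db.execute("""
    INSERT OR IGNORE INTO address_type(name)
    SELECT address_type FROM postal_address_proxy WHERE address_type IS NOT NULL
""")
db.execute("""
    INSERT INTO postal_address(id, address_type_id)
    SELECT p.id,
           COALESCE(p.address_type_id,
                    (SELECT t.id FROM address_type t WHERE t.name = p.address_type))
    FROM postal_address_proxy p
""")

rows = db.execute(
    "SELECT id, address_type_id FROM postal_address ORDER BY id"
).fetchall()
print(rows)  # → [(10, 2), (11, 1)]
```

As noted, the copy step is a single set-based SQL statement, which is what made bulk ingestion and testing against mixed-version fixtures convenient.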
While it is nice to assume we can keep our db schema backwards compatible, the reality is this leads to issues.

For example, we had a `PostalAddress` table that has an enum field for `addressType`, and later wanted to let users extend this. The logical step would be to create a new table `AddressType`, run a db migration that creates a row in that table for each enum case, and update `PostalAddress` to have a column that points to this `AddressType` table, with a data migration to set the corresponding FK value into this field on `PostalAddress`.

To ensure our local schema is clean, we do not want to keep an old `addressType` enum field alongside a new optional `addressTypeID` and then ensure every reference in the app that looks up one also checks the other... so we removed the old `addressType` field after doing the data migration that created an `AddressType` row for each referenced address type and set the corresponding `addressTypeID` value.

But now imagine a user that had been using the app on the old version, removed the app from their device and then installed the app again (or, more commonly, got a new phone and thus got the new version of the app but without any local data). As they open the app, the data migration runs against the local (empty) database. As they keep the app open, the user's real data starts to sync in from iCloud; however, this data has the old `addressType` enum values attached to it rather than the `UUID` of the new `AddressType` table rows. The user's `PostalAddress` rows will not sync, as the insert request will be missing the `addressTypeID` that needs to be non-optional.

I propose adding a
`handleUpsertFromServerRecord` method to the delegate that is called if there is a `DatabaseError` when trying to `upsert` a local value. Within this method one can then write any custom logic (such as, in this case, creating an `AddressType` if needed and including that `addressTypeID` in the insert request).

The other use case where this can be important is if the user has 2 devices and they have only updated on one device but make db record changes on the other: the update event that comes over the wire may end up not including a value for the new column and thus setting it to null. This is an issue even if the new column is optional, as the row itself might already have a value set but the update that comes through ends up nuking it. Being able to run whatever data migration logic we would apply locally to these insert requests (if a DB constraint is triggered) would be a great way to avoid such issues.
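To make the proposed behavior concrete, here is a hedged sketch in Python's `sqlite3` rather than SQLiteData's actual API. `upsert_from_server` stands in for the sync engine's upsert path, and `repair` plays the role of the proposed delegate method, which is only consulted when the insert fails with a constraint error; all names are hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE address_type(id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE postal_address(
        id INTEGER PRIMARY KEY,
        address_type_id INTEGER NOT NULL REFERENCES address_type(id)
    );
""")

def repair(values):
    # Old-schema record: it has the enum string but no FK. Create the
    # AddressType row if needed and fill in the missing FK value.
    db.execute("INSERT OR IGNORE INTO address_type(name) VALUES (?)",
               (values["addressType"],))
    (type_id,) = db.execute("SELECT id FROM address_type WHERE name = ?",
                            (values["addressType"],)).fetchone()
    return {"id": values["id"], "addressTypeID": type_id}

def upsert_from_server(values):
    try:
        db.execute("INSERT INTO postal_address VALUES (?, ?)",
                   (values["id"], values.get("addressTypeID")))
    except sqlite3.IntegrityError:
        # Only on a constraint failure is the "delegate" asked to migrate
        # the values, then the insert is retried.
        fixed = repair(values)
        db.execute("INSERT INTO postal_address VALUES (?, ?)",
                   (fixed["id"], fixed["addressTypeID"]))

# Record synced down from a device still on the old schema:
upsert_from_server({"id": 1, "addressType": "home"})
print(db.execute("SELECT * FROM postal_address").fetchall())  # → [(1, 1)]
```

Note that the happy path pays no extra cost here: the repair logic runs only when the constraint actually fires, which is the scoping argument made under "Alternatives considered" below.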
Alternatives considered:
Have a method that is called before `insert`/`update`/`delete` for each DB, letting the delegate modify the `CKRecord` (or even suppress an update or delete). This is tempting, but it would increase the cost of syncing for all users all the time, whereas the implemented solution only kicks in when a db constraint is triggered and is thus much more tightly scoped.

TODO:
- `CKRecord`