Upsert that changes datetime field can create duplicates across different indexes #571

@bountx

As in the title - an upsert that changes an item's datetime can create duplicates:

  1. POST /collections/{id}/bulk_items with body: { "method": "upsert", "items": {...} }

  2. When exist_ok=True (UPSERT):

  • No duplicate check is performed
  • The target index is determined by the new datetime
  • If the item already exists in an old index (with a different datetime), it stays there
  • The item now exists in TWO indexes

A similar situation happens for PUT /items/{id} (update_item).
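
To make the failure mode concrete, here is a minimal in-memory sketch of the steps above. It does not use the project's real code: the `indexes` dict, the `index_for` routing rule (one index per collection and month) and `naive_upsert` are all hypothetical stand-ins, chosen only to show how routing by the new datetime without a lookup leaves a stale copy behind.

```python
from collections import defaultdict

# Hypothetical stand-in for datetime-partitioned indexes:
# index name -> {(collection_id, item_id): item}
indexes = defaultdict(dict)


def index_for(item):
    """Assumed routing rule: one index per collection and month of the datetime."""
    return f"{item['collection']}_{item['datetime'][:7]}"


def naive_upsert(item):
    """Write to the index chosen by the NEW datetime, with no lookup elsewhere."""
    indexes[index_for(item)][(item["collection"], item["id"])] = item


def locations(collection_id, item_id):
    """Return the names of every index that currently holds the item."""
    key = (collection_id, item_id)
    return sorted(name for name, docs in indexes.items() if key in docs)


# First write lands in the January index.
naive_upsert({"id": "a", "collection": "c1", "datetime": "2024-01-15T00:00:00Z"})
# Upsert with a changed datetime lands in the March index; January is untouched.
naive_upsert({"id": "a", "collection": "c1", "datetime": "2024-03-02T00:00:00Z"})

print(locations("c1", "a"))  # ['c1_2024-01', 'c1_2024-03']
```

The same item key ends up in two indexes, which is the duplicate described in this issue.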

Proposed solutions

A) Validate the conflict and return a 409 error if the item currently lives in a different index than the one it would land in after the update
B) Create a smart upsert/put that checks the indexes for the unique key (item_id, collection_id) and, if it is present, does a DELETE on the duplicate and inserts the new one in the right index

A is short and safe but breaks how upsert works (an upsert that would normally call update would now throw an error - but we don't create a duplicate)

B is complicated and risky - instead of a standard insert/update we would now have to call DELETE in the process, risking a deletion on an upsert call
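
A rough sketch of both options, again against plain dicts rather than the real backend. Every name here (`ItemConflict`, `target_index`, `find_index`, `upsert_strict`, `upsert_moving`) is hypothetical, and the month-based routing rule is an assumption made for illustration.

```python
class ItemConflict(Exception):
    """Stand-in for an HTTP 409 response."""


def target_index(item):
    """Assumed routing rule: one index per collection and month of the datetime."""
    return f"{item['collection']}_{item['datetime'][:7]}"


def find_index(indexes, key):
    """Return the name of the index holding key, or None if it is absent."""
    for name, docs in indexes.items():
        if key in docs:
            return name
    return None


def upsert_strict(indexes, item):
    """Option A: refuse the write if the item lives in a different index."""
    key = (item["collection"], item["id"])
    current, target = find_index(indexes, key), target_index(item)
    if current is not None and current != target:
        raise ItemConflict(f"{key} is in {current}, update would go to {target}")
    indexes.setdefault(target, {})[key] = item


def upsert_moving(indexes, item):
    """Option B: write to the target index, then delete the stale copy."""
    key = (item["collection"], item["id"])
    current, target = find_index(indexes, key), target_index(item)
    indexes.setdefault(target, {})[key] = item
    if current is not None and current != target:
        del indexes[current][key]
```

In this sketch option B writes before it deletes, so a failure between the two steps would leave a duplicate rather than lose the item. Whether that ordering is practical against the real indexes is an open question; both options also pay for an extra lookup across indexes on every write.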
