Systematically compare/investigate performance on Zarrs #54

@yarikoptic

@satra did a quick-and-dirty test using a zarr viewer, comparing access directly via the S3 URL against our testbed dandi.centerforopenneuroscience.org instance (on a zarr in 000108 under /dandisets/, not the manifest-based one, since no manifest for it existed yet). No quantitative measurements are available, but:

  • it took notably longer to load from our instance than from S3 when zooming in
  • overall browsing of the dav instance got significantly slower/less responsive

We need to come up with

  • a sample script which implements basic IO (e.g. reading slice X) on a given zarr
  • timing of a single access run
  • timing of N parallel accesses to different or the same slices (to assess scalability as well)
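As a starting point, here is a minimal sketch of such a timing harness. It is only an illustration under assumptions: `read_slice` is a hypothetical callable supplied by the caller (for a real run it would wrap whatever zarr read is being tested against a given URL), and `time_call`, `time_parallel`, and `summarize` are names made up here. The demo at the bottom uses a sleep as a stand-in for I/O, so its numbers say nothing about any real backend.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def time_call(read_slice, key):
    """Return wall-clock seconds taken by one read_slice(key) call."""
    start = time.perf_counter()
    read_slice(key)
    return time.perf_counter() - start


def time_parallel(read_slice, keys, workers):
    """Call read_slice once per key on a pool of `workers` threads.

    Returns (total wall-clock seconds, list of per-call seconds).
    Pass the same key repeatedly to test concurrent access to one slice.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_call = list(pool.map(lambda k: time_call(read_slice, k), keys))
    return time.perf_counter() - start, per_call


def summarize(per_call):
    """Collapse per-call timings into a few summary numbers."""
    return {
        "n": len(per_call),
        "median": statistics.median(per_call),
        "max": max(per_call),
    }


if __name__ == "__main__":
    # Stand-in for real I/O (assumption): replace with a function that
    # reads the slice identified by `key` from the zarr under test.
    def fake_read(key):
        time.sleep(0.01)

    print("single:", time_call(fake_read, 0))
    for workers in (1, 2, 4, 8):
        total, per_call = time_parallel(fake_read, list(range(16)), workers)
        print(workers, "workers:", round(total, 3), summarize(per_call))
```

Running the same harness with the same keys against each endpoint would give comparable numbers; threads are used here because the work is expected to be network-bound.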

Compare

  • direct S3
  • direct dandidav (no apache reverse proxy setup)
  • dandi.centerforopenneuroscience.org

to identify how much each component contributes and whether we could improve any aspect (e.g. parallel access). I hope that it is not all simply due to redirects.
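To check how much redirects contribute, one option is to follow them by hand and time each hop separately. The sketch below does that with the standard library only. It is an assumption-laden illustration: `trace_hops` and `_NoRedirect` are hypothetical names, the URL to pass in (for example, one chunk of the zarr on each of the three endpoints) is up to the caller, and each hop reads the full response body, so it suits chunk-sized objects rather than large files.

```python
import time
import urllib.error
import urllib.request
from urllib.parse import urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be timed."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def trace_hops(url, max_hops=10):
    """Follow redirects manually; return a list of (url, status, seconds)."""
    opener = urllib.request.build_opener(_NoRedirect)
    hops = []
    for _ in range(max_hops):
        start = time.perf_counter()
        try:
            with opener.open(url) as resp:
                resp.read()  # include body transfer in the timing
                status, location = resp.status, None
        except urllib.error.HTTPError as err:
            err.read()
            status, location = err.code, err.headers.get("Location")
            err.close()
        hops.append((url, status, time.perf_counter() - start))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Summing the time of the 3xx hops and comparing it with the final hop would show what share of a request is spent on redirection, as opposed to the data transfer itself.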

Later we need to create a nice "versioned zarr" with a few versions to use for benchmarks, and also benchmark the /zarr endpoints.


Labels: high priority, performance
