This directory contains a set of hands-on tutorials to kick-start the development and evaluation of learned sparse retrieval systems with the lsr-benchmark.
You can run the tutorials in Github Codespaces or Google Colab.
As the tutorials cover content of different stages of the lsr-benchmark, we recommend that you start with basic tutorials:
- tutorial-pre-computed-resources.ipynb. Provides an overview of the pre-computed resources of the
lsr-benchmarkthat you can re-use for your experiments and how to load and evaluate them. - tutorial-retrieval-engines.ipynb. Shows the basic interface that retrieval engines in the lsr-benchmark should fulfill and how you can evaluate your own implementations against existing runs.
- tutorial-access-raw-data.ipynb Shows how to access the raw underlying data before embedding.
- tutorial-embedding-models.ipynb. (Attention: In Progress.) Shows the basic interface that embedding systems in the lsr-benchmark should fulfill and how you can evaluate your own embeddings against existing ones.