Fantastic project here, and congrats @leg2015 and team on the paper.
One note: GBIF monthly snapshots are available as partitioned Parquet files from https://registry.opendata.aws/gbif/, and reading them directly can be faster than querying GBIF's own API.
For example, in Python:
import ibis
gbif = ibis.read_parquet("s3://gbif-open-data-us-east-1/occurrence/2024-10-01/occurrence.parquet/**")
Or in R:
library(duckdbfs)
gbif <- open_dataset("s3://gbif-open-data-us-east-1/occurrence/2024-10-01/occurrence.parquet/**")
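To point either snippet at a different monthly snapshot, only the date segment of the path should need to change. Here is a minimal sketch of a helper that builds the path; `gbif_snapshot_uri` is a hypothetical name, and it assumes every snapshot follows the same layout as the 2024-10-01 path above (first-of-month label, `gbif-open-data-<region>` bucket), so check the registry page for what is actually published before relying on it.

```python
from datetime import date


def gbif_snapshot_uri(snapshot: date, region: str = "us-east-1") -> str:
    """Build the S3 glob for a GBIF occurrence snapshot (hypothetical helper)."""
    # Assumption: snapshots are labelled with the first day of the month,
    # matching the 2024-10-01 path used in the examples above.
    label = snapshot.replace(day=1).isoformat()
    # Assumption: the bucket name follows the gbif-open-data-<region> pattern.
    return f"s3://gbif-open-data-{region}/occurrence/{label}/occurrence.parquet/**"


# Any day in October 2024 maps to the October snapshot path.
print(gbif_snapshot_uri(date(2024, 10, 15)))
```

The returned string can be passed to `ibis.read_parquet(...)` in place of the hard-coded path.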