Hi,
I got very bad scores using the autofaiss build_index method. Apart from the possibility that the problem comes from build_index not saving the fname that corresponds to each index entry, I also noticed something very strange: the index is not self-consistent. That is, if I query the index with the same embeddings I used to build it, I don't always get the same point back (yet the reconstruction error is zero when building the index).
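import os
from os.path import join

import numpy as np
from autofaiss import build_index

data_path = 'images'
features_dir = 'features'

# List the directory once so that `files` and `full` share the same order.
files = [f for f in os.listdir(data_path)]
full = [join(data_path, f) for f in files]

# build_index returns a tuple; the first element is the index itself.
index = build_index(embeddings=np.concatenate([np.load(f) for f in full], axis=0), nb_cores=12, save_on_disk=False)[0]

# Count the queries whose nearest neighbour is not the file they came from.
how_many = 0
for f in os.listdir(features_dir):
    query_vector = np.load(join(features_dir, f))
    distances, indices = index.search(query_vector, 1)
    # atleast_1d keeps indices[0] valid when the query is a single vector.
    distances, indices = np.atleast_1d(np.squeeze(distances)), np.atleast_1d(np.squeeze(indices))
    how_many += int(files[indices[0]] != f)
print(how_many)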
For my data this prints around 178 (out of 1,000,000 points).
Is this a bug, or do I need different input parameters to make this work properly?
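A minimal sketch of the same self-consistency check done with exact brute-force search in NumPy, as a baseline to compare the index against. The function name `self_match_misses`, the `batch_size` parameter, and the random data are hypothetical and not part of autofaiss or faiss. With exact search, a point can only fail to retrieve itself when another row is at distance zero (an exact duplicate), so this separates that case from effects of the index itself; it does not say which one applies to the data above.

```python
import numpy as np


def self_match_misses(embeddings, batch_size=256):
    """Count rows whose exact L2 nearest neighbour is not the row itself."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    n = len(embeddings)
    misses = 0
    for start in range(0, n, batch_size):
        batch = embeddings[start:start + batch_size]
        # Exact squared L2 distances from every row of the batch to every row.
        diffs = batch[:, None, :] - embeddings[None, :, :]
        dists = (diffs ** 2).sum(axis=-1)
        # argmin returns the first index among ties, so exact duplicates
        # resolve to the earliest copy.
        nearest = dists.argmin(axis=1)
        expected = np.arange(start, start + len(batch))
        misses += int((nearest != expected).sum())
    return misses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(200, 16))
    print(self_match_misses(points))
    # Append copies of the first five rows to simulate duplicate embeddings.
    with_duplicates = np.vstack([points, points[:5]])
    print(self_match_misses(with_duplicates))
```

The brute-force loop needs memory proportional to batch_size × n × d per batch, so it is only practical on a subsample of a million-point dataset.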
I use faiss==1.7.4 and autofaiss==2.15.8.
Thank you,
ysig