NeighborhoodGraph shares std::mt19937 among different threads #428

@lych4o

Describe the bug
In https://github.com/microsoft/SPTAG/blob/main/AnnService/inc/Core/Common/NeighborhoodGraph.h#L320

#pragma omp parallel for schedule(dynamic)
for (int i = 0; i < m_iTPTNumber; i++)
{
    Sleep(i * 100); std::srand(clock());
    for (SizeType j = 0; j < m_iGraphSize; j++) TptreeDataIndices[i][j] = j;
    std::shuffle(TptreeDataIndices[i].begin(), TptreeDataIndices[i].end(), rg);
    PartitionByTptree<T>(index, TptreeDataIndices[i], 0, m_iGraphSize - 1, TptreeLeafNodes[i]);
    SPTAGLIB_LOG(Helper::LogLevel::LL_Info, "Finish Getting Leaves for Tree %d\n", i);
}

rg is a global std::mt19937 that is shared between the OpenMP threads. std::mt19937 is not thread-safe, so calling it concurrently is a data race that can corrupt the engine's internal state. As a result, rg may produce an index that exceeds the size of TptreeDataIndices[i]. std::shuffle() uses that index to swap elements, which is where the out-of-range access happens.
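One possible way to avoid the shared state is to give each loop iteration its own engine. The following is only a minimal sketch under stated assumptions, not the SPTAG implementation: ShuffleIndicesPerTree, baseSeed and localRg are hypothetical names, and seeding each engine from a base seed plus the tree number is an assumed scheme chosen for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not part of SPTAG): builds one shuffled index list per tree.
// Every iteration constructs its own engine, so no std::mt19937 state is shared
// between threads.
std::vector<std::vector<int>> ShuffleIndicesPerTree(int treeCount, int graphSize, std::uint32_t baseSeed)
{
    std::vector<std::vector<int>> indices(treeCount, std::vector<int>(graphSize));

#pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < treeCount; i++)
    {
        // Assumed seeding scheme: mix the base seed with the tree number so that
        // each tree gets a different, reproducible random stream.
        std::seed_seq seq{baseSeed, static_cast<std::uint32_t>(i)};
        std::mt19937 localRg(seq);

        // Fill with 0..graphSize-1, then shuffle using the iteration-local engine.
        std::iota(indices[i].begin(), indices[i].end(), 0);
        std::shuffle(indices[i].begin(), indices[i].end(), localRg);
    }
    return indices;
}
```

In the loop quoted above, the equivalent change would be to pass a loop-local engine to std::shuffle instead of rg. A thread_local engine or a critical section around the shuffle would be other options. The local engine is used here because it needs no locking and keeps the result independent of how iterations are scheduled.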

To Reproduce
I don't know an exact way to reproduce this. I found it while trying to build an index on the SIFT100M dataset.
