Request Description
1. Motivation & Model Examples
As discussed in PR #33633 with @mitruska and @nshchego, the current v11::TopK and earlier operations have implementation-defined NaN ordering behavior. Because different frontend frameworks handle NaNs differently (e.g., NumPy treats them as smallest, PyTorch treats them as largest), there is a need for a deterministic, configurable approach to NaN handling in OpenVINO.
Model Examples Benefiting from this:
- Multimodal AI Models (CLIP, Vision Transformers): Embeddings can occasionally produce NaN values due to numerical instabilities in FP16/BF16 projections. If TopK propagates these NaNs unpredictably, it corrupts downstream similarity searches.
- RAG (Retrieval-Augmented Generation) Pipelines: When retrieving the top K relevant document chunks, a single rogue NaN similarity score can currently push valid, highly-relevant documents out of the TopK results, breaking the retrieval chain entirely.
2. Proof of Concept (POC)
Following @mitruska's recommendation to prepare a POC to define constraints and benefits, I have built a comprehensive standalone C++ implementation here:
POC Repository & README: Lagmator22/TopK-NaN-OpenVINO
The POC demonstrates a proposed nan_mode attribute with three explicit modes:
- NAN_AS_SMALLEST (Matches NumPy behavior: NaNs are pushed to the bottom of descending sorts)
- NAN_AS_LARGEST (Matches PyTorch behavior: NaNs are pushed to the top of descending sorts)
- NONE (Undefined/Fastest path: Preserves strict backward compatibility with current OpenVINO performance footprints)
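To make the three modes concrete, here is a minimal sketch of how a descending TopK could order NaNs under each mode. It is not taken from the POC repository or from OpenVINO: the names `NanMode`, `comes_first`, and `topk_descending` are hypothetical, and breaking ties by lower index is an assumption made only to keep the output deterministic.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical enum mirroring the proposed nan_mode attribute (not OpenVINO API).
enum class NanMode { NONE, NAN_AS_SMALLEST, NAN_AS_LARGEST };

// Returns true when `a` should be placed before `b` in a descending sort.
inline bool comes_first(float a, float b, NanMode mode) {
    // NONE: plain comparison with no NaN checks. With NaN inputs this is not
    // a strict weak ordering, so the resulting order is unspecified.
    if (mode == NanMode::NONE) return a > b;

    const bool a_nan = std::isnan(a);
    const bool b_nan = std::isnan(b);
    if (a_nan || b_nan) {
        if (a_nan && b_nan) return false;  // NaNs are equivalent to each other
        // Exactly one operand is NaN: it goes first only in NAN_AS_LARGEST.
        return mode == NanMode::NAN_AS_LARGEST ? a_nan : b_nan;
    }
    return a > b;
}

using Item = std::pair<float, std::size_t>;  // (value, original index)

// Hypothetical helper: returns the k largest items in descending order.
std::vector<Item> topk_descending(const std::vector<float>& data,
                                  std::size_t k, NanMode mode) {
    std::vector<Item> items;
    items.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) items.emplace_back(data[i], i);

    k = std::min(k, items.size());
    std::partial_sort(
        items.begin(), items.begin() + static_cast<std::ptrdiff_t>(k), items.end(),
        [mode](const Item& lhs, const Item& rhs) {
            if (comes_first(lhs.first, rhs.first, mode)) return true;
            if (comes_first(rhs.first, lhs.first, mode)) return false;
            return lhs.second < rhs.second;  // assumed tie-break: lower index first
        });
    items.resize(k);
    return items;
}
```

For the input `{1.0f, NAN, 3.0f, 2.0f}` with `k = 2`, this sketch selects indices 2 and 3 under `NAN_AS_SMALLEST` and indices 1 and 2 under `NAN_AS_LARGEST`. Under `NONE`, the result is only well defined when the input contains no NaNs.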
3. Constraints & Benchmarking
To ensure no performance regressions for existing users, the POC includes micro-benchmarks comparing the NONE path against the NAN_AS_SMALLEST/NAN_AS_LARGEST paths.
- When nan_mode = NONE is selected, the sorting completely bypasses the NaN checks, ensuring identical performance to the current v11::TopK.
- When explicit handling is requested, the overhead is minimal and safely isolated.
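One way such a bypass could be structured is to select the comparator once, outside the sorting loop, so the NONE path instantiates the sort with a plain `std::greater` and never calls `std::isnan`. The sketch below illustrates that idea under assumed names (`NanMode`, `NanAwareGreater`, `topk_values`, `topk_dispatch`). It is not the POC's code and makes no claim about measured performance.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical enum mirroring the proposed nan_mode attribute (not OpenVINO API).
enum class NanMode { NONE, NAN_AS_SMALLEST, NAN_AS_LARGEST };

// NaN-aware descending comparator. The NaN policy is a compile-time parameter,
// so each explicit mode gets its own specialised sort instantiation.
template <bool NanFirst>
struct NanAwareGreater {
    bool operator()(float a, float b) const {
        const bool a_nan = std::isnan(a);
        const bool b_nan = std::isnan(b);
        if (a_nan || b_nan) {
            // Two NaNs compare as equivalent; otherwise apply the policy.
            return NanFirst ? (a_nan && !b_nan) : (!a_nan && b_nan);
        }
        return a > b;
    }
};

// Sorts a copy of the data and returns the first k values under `comp`.
template <typename Compare>
std::vector<float> topk_values(std::vector<float> data, std::size_t k, Compare comp) {
    k = std::min(k, data.size());
    std::partial_sort(data.begin(),
                      data.begin() + static_cast<std::ptrdiff_t>(k),
                      data.end(), comp);
    data.resize(k);
    return data;
}

// The mode is inspected exactly once; the NONE path contains no NaN checks.
std::vector<float> topk_dispatch(const std::vector<float>& data,
                                 std::size_t k, NanMode mode) {
    switch (mode) {
    case NanMode::NAN_AS_SMALLEST:
        return topk_values(data, k, NanAwareGreater<false>{});
    case NanMode::NAN_AS_LARGEST:
        return topk_values(data, k, NanAwareGreater<true>{});
    case NanMode::NONE:
        break;
    }
    return topk_values(data, k, std::greater<float>{});
}
```

With this layout, the cost of NaN handling is confined to the two explicit modes, which is the property a micro-benchmark of the NONE path against the other two would be checking.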
4. Next Steps
Based on these findings, I am proposing the introduction of v17::TopK with this nan_mode attribute. I would appreciate it if the maintainers could review the POC implementation and let me know whether this architectural direction aligns with the project's vision.
(Note: I plan to write the upstream PR for v17::TopK as part of my ongoing open-source contributions to OpenVINO).
CC: @mitruska @nshchego @kblaszczak-intel @praasz
Feature Use Case
No response
Issue submission checklist
- The feature request or improvement must be related to OpenVINO