Restore NPUW_DQ fallback for older drivers #33621

Merged
Maxim-Doronin merged 2 commits into openvinotoolkit:master from Maxim-Doronin:md/restore_npuw_dq_fallback
Jan 16, 2026

Conversation

@Maxim-Doronin

Details:

Tickets:

  • E#198339

@Maxim-Doronin Maxim-Doronin requested review from a team as code owners January 15, 2026 14:52
@github-actions github-actions bot added category: NPU OpenVINO NPU plugin category: NPUW NPUW plugin labels Jan 15, 2026

@AsyaPronina AsyaPronina left a comment

Thanks a lot! Great catch!!

@dmatveev dmatveev added this to the 2026.0 milestone Jan 15, 2026
Comment on lines +1207 to +1210
// Specify NPUW DQ if Compiler DQ is not enabled
if (!npudesc.has_value() || !npudesc->compiler_dq) {
    config.emplace("NPUW_DQ", "YES");
}

@dmatveev dmatveev Jan 15, 2026

This change certainly brings back the missed behavior, but after reviewing the history thoroughly I am not quite sure if the OLD behavior was correct.

The OLD behavior was first introduced here: #28343

The logic we bring back is: "use NPUW_DQ if the compiler DQ is not present". But, if I remember correctly, NPUW_DQ is the compiler DQ. They come together, so one can't substitute for the other; they come as a pair. The idea here is that, to make the compiler DQ work, we sometimes need to transform a model in a certain way. If the compiler DQ isn't available, as in older drivers, we need to transform the model even more (the FULL NPUW-side DQ).

UPD: NPUW_DQ_FULL is on by default, so enabling NPUW_DQ here gives us NPUW_DQ_FULL automatically. It is obscure but seems to work (see below).

Mnemonics in the property description confirm this:

Looking at the default values - https://github.com/openvinotoolkit/openvino/blob/2025.4.0/src/plugins/intel_npu/src/al/include/intel_npu/config/npuw.hpp#L111

  • NPUW_DQ is false (probably a rudiment)
  • NPUW_DQ_FULL is true

Now looking into the configuration building:

This logic seems to be good for the moment. Remember, this is the baseline common configuration that is used as a basis for the prefill and generate stages.

But later, when we refine the PREFILL model config, we do something obscure: https://github.com/openvinotoolkit/openvino/blob/2025.4.0/src/plugins/intel_npu/src/plugin/npuw/llm_compiled_model.cpp#L1171 - strangely enough this change is introduced by the same original commit 57025dc

UPD2: The obscurity is deciphered above.

The old logic (in red) seems to make more sense than the new one (in green):

[Image: diff of the old logic (red) and the new logic (green)]

Previously (red), we set DQ_FULL to NO (to avoid the full transformation) if and only if the compiler supported DQ, which made sense. Now (green) this condition is reversed, but when compiler DQ is not present, we set NPUW_DQ (instead of NPUW_DQ_FULL, which is supposed to handle this case). That's clearly a miss. That also includes NPUW_DQ_FULL, as that one wasn't disabled.

Same thing happened for the GENERATE model - we didn't find the capability but we still set _DQ (not _DQ_FULL that is supposed to be there).

Initially, we only had NPUW_DQ, which did the full transformation. Later, when compiler-side DQ came in, we provided the past behavior under NPUW_DQ_FULL and used NPUW_DQ to do the compiler-friendly transformation (only impacting group-quantized models). I believe the combination of this rename and some "refactoring" in the original commit caused the confusion (UPD2).

TL;DR: the old behavior is restored, but the old behavior is sus

UPD: More archeology

  1. NPUW_DQ was the die hard one in the beginning: NPUW: Introduce DQ #26362 (did the full transformation to the model)
  2. NPUW_DQ_FULL was introduced later as an early return in the die-hard NPUW_DQ path: [NPUW] Introduce DQ_FULL property #27678

DQ and DQ_FULL are not inverses of each other. DQ_FULL only takes effect if it is ON while DQ is also ON.
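
The relationship between the two flags, as described in this thread, can be summarized as a small decision function. This is only a sketch under stated assumptions, not the plugin's code: `DqMode` and `effective_dq_mode` are hypothetical names, and the mapping assumes that NPUW_DQ gates the whole DQ path while NPUW_DQ_FULL only selects the full transformation inside it.

```cpp
// Hypothetical names for illustration; not part of the NPUW sources.
enum class DqMode {
    None,              // no DQ transformation is applied to the model
    CompilerFriendly,  // lighter transformation that relies on compiler-side DQ
    Full               // full NPUW-side DQ transformation
};

// Assumption taken from the discussion: NPUW_DQ gates the whole path, and
// NPUW_DQ_FULL only matters once NPUW_DQ is ON.
constexpr DqMode effective_dq_mode(bool npuw_dq, bool npuw_dq_full) {
    if (!npuw_dq) {
        return DqMode::None;  // NPUW_DQ_FULL is ignored in this case
    }
    return npuw_dq_full ? DqMode::Full : DqMode::CompilerFriendly;
}

// Defaults quoted in the thread (NPUW_DQ = false, NPUW_DQ_FULL = true)
// result in no DQ transformation at all.
static_assert(effective_dq_mode(false, true) == DqMode::None,
              "defaults apply no DQ");
// Turning on NPUW_DQ alone picks up the default NPUW_DQ_FULL = true.
static_assert(effective_dq_mode(true, true) == DqMode::Full,
              "NPUW_DQ alone yields the full transformation");
```

Under these assumptions, setting only NPUW_DQ on a driver without compiler DQ lands in the `Full` case, which is why the fallback gives the intended result even though it never mentions NPUW_DQ_FULL.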

So here comes UPD2;

Comment on lines 1160 to 1162
if (npudesc.has_value() && npudesc->compiler_dq) {
    config.emplace("NPUW_DQ", "YES");
    config.emplace("NPUW_DQ_FULL", "NO");

Later thought:

So, if we DON'T hit this condition (say, we DON'T have compiler DQ), we stay with the default values:

  1. NPUW_DQ false
  2. NPUW_DQ_FULL true

With the way these options are handled, no DQ transformations will be applied to the model - https://github.com/openvinotoolkit/openvino/blob/2025.4.0/src/plugins/intel_npu/src/plugin/npuw/partitioning/partitioning.cpp#L2193

If we leave it this way and don't do any later refinements, DCOFF will kick in as it did (the issue with lower NPU performance and higher CPU load).

...and later we enable NPUW_DQ to get NPUW_DQ_FULL enabled along with it, so the past behavior seems to be correct, and the fix seems to be correct too.
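
To make that conclusion concrete, here is a rough, self-contained sketch of how the two reviewed snippets combine with the defaults. It is not the code from `llm_compiled_model.cpp`: `NpuDesc`, `Config`, `build_dq_config`, and `effective` are hypothetical names, the two conditions are composed into one function only for illustration, and the defaults are the ones quoted in this thread.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the NPU descriptor; only the field that appears
// in the reviewed snippets is modeled.
struct NpuDesc {
    bool compiler_dq = false;
};

using Config = std::map<std::string, std::string>;

// Sketch of the two conditions under review, composed in one function
// (an assumption; in the plugin they sit in different places).
Config build_dq_config(const std::optional<NpuDesc>& npudesc) {
    Config config;
    // Compiler DQ is available: skip the full NPUW-side transformation.
    if (npudesc.has_value() && npudesc->compiler_dq) {
        config.emplace("NPUW_DQ", "YES");
        config.emplace("NPUW_DQ_FULL", "NO");
    }
    // Restored fallback: no compiler DQ, so enable NPUW_DQ and leave
    // NPUW_DQ_FULL at its default.
    if (!npudesc.has_value() || !npudesc->compiler_dq) {
        config.emplace("NPUW_DQ", "YES");
    }
    return config;
}

// Looks up a key, falling back to the defaults quoted in the thread
// (NPUW_DQ = NO, NPUW_DQ_FULL = YES).
std::string effective(const Config& config, const std::string& key) {
    static const Config defaults = {{"NPUW_DQ", "NO"}, {"NPUW_DQ_FULL", "YES"}};
    const auto it = config.find(key);
    return it != config.end() ? it->second : defaults.at(key);
}

// Example: an older driver without compiler DQ ends up with both flags ON.
inline void example_older_driver() {
    const Config cfg = build_dq_config(NpuDesc{false});
    assert(effective(cfg, "NPUW_DQ") == "YES");
    assert(effective(cfg, "NPUW_DQ_FULL") == "YES");
}
```

Without the fallback branch, the no-compiler-DQ case would leave the map empty and resolve to NPUW_DQ = NO, which is the "no DQ transformations" situation described earlier in this thread.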

@Maxim-Doronin Maxim-Doronin Jan 16, 2026

Thanks for the comments! Follow-up task has been created: E#199512

@Maxim-Doronin Maxim-Doronin added this pull request to the merge queue Jan 16, 2026
Merged via the queue into openvinotoolkit:master with commit 1cc90c2 Jan 16, 2026
182 checks passed
@Maxim-Doronin Maxim-Doronin deleted the md/restore_npuw_dq_fallback branch January 16, 2026 12:51
Naseer-010 pushed a commit to Naseer-010/openvino that referenced this pull request Feb 18, 2026
### Details:
- Restoring the logic added in openvinotoolkit#28343 that was removed by mistake in openvinotoolkit#30554

### Tickets:
 - E#198339
