Enable sdpa kv cache uint4 #34918

Open
byungilm wants to merge 4 commits into openvinotoolkit:master from byungilm:enable_sdpa_kv-cache_uint4

@byungilm
Contributor

Details:

  • item1
  • ...

Tickets:

AI Assistance:

  • *AI assistance used: yes
  • *If yes, summarize how AI was used and what human validation was performed (build/tests/manual checks).
    Resolved an invalid indexing issue through static code analysis.
    Cleaned up the code by merging the separate int4 KV-cache kernel code into the legacy sdpa_opt kernel.

Signed-off-by: Min, Byungil <byungil.min@intel.com>
Signed-off-by: Min, Byungil <byungil.min@intel.com>
Signed-off-by: Min, Byungil <byungil.min@intel.com>
Signed-off-by: Min, Byungil <byungil.min@intel.com>
@byungilm byungilm requested review from a team as code owners March 25, 2026 09:52
@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Mar 25, 2026

uint query_offset = head_idx_index + sglid;
unroll_for (uint seq_idx = 0; seq_idx < TARGET_SEQ_LEN_BLOCK_SIZE; seq_idx++) {
query_offset + = seq_idx * K_HEAD_SIZE;
Contributor

NIT: write += rather than + =

2 participants