Open
255 commits
69d4796
[FIX] Remove staged artifacts
PiotrKrzem Feb 27, 2025
d3b31fc
Merge branch 'master' into feature/paged_reference
PiotrKrzem Feb 28, 2025
b8c9742
[FIX] Add RoPE to testing suite
PiotrKrzem Feb 28, 2025
687d6aa
[FIX] Add missing test cases:
PiotrKrzem Feb 28, 2025
0986e6d
[FIX] Test build
PiotrKrzem Feb 28, 2025
dfe28b0
[FIX] Single op graph
PiotrKrzem Feb 28, 2025
c582c82
[FIX] Visitor test
PiotrKrzem Feb 28, 2025
abbea22
[FIX] Use reference_tests::Tensor in tests
PiotrKrzem Feb 28, 2025
6ccf30d
[FIX] test case name:
PiotrKrzem Feb 28, 2025
a9ebf4f
[FIX] Separate extension, update dependencies
PiotrKrzem Feb 28, 2025
7c2c449
[FIX] Compilation errrors
PiotrKrzem Feb 28, 2025
fc38fcb
[FIX] Refactor for unit testing
PiotrKrzem Mar 4, 2025
3e41603
[FIX] Re-add 40 tests with computed RoPE
PiotrKrzem Mar 4, 2025
4cbb5fd
[FIX] Remove from ops16, apply review comments
PiotrKrzem Mar 4, 2025
ad45c38
[FIX] Build errors from refactor
PiotrKrzem Mar 4, 2025
b0d4c0c
[FIX] Inline funcs to supress warning
PiotrKrzem Mar 4, 2025
ddd129e
[FIX] Clang
PiotrKrzem Mar 4, 2025
5c21ba1
[FIX] Comparison dtype error
PiotrKrzem Mar 4, 2025
1c12a5d
[FIX] Add unit funcs to named namespace
PiotrKrzem Mar 4, 2025
8091215
[FIX] Rename namespace
PiotrKrzem Mar 4, 2025
e1ff55d
[FIX] clang
PiotrKrzem Mar 4, 2025
7bf188a
[FIX] template func lookup
PiotrKrzem Mar 4, 2025
7f0acd1
Merge branch 'master' into feature/paged_reference
PiotrKrzem Mar 4, 2025
6a22c16
[FIX] Add func to headers to avoid unused warn
PiotrKrzem Mar 5, 2025
2d460e0
[FIX] GPU build namespace err
PiotrKrzem Mar 5, 2025
4aa34e5
Merge branch 'master' into feature/paged_reference
mlukasze Mar 5, 2025
1626784
[FIX] Explicitly call ref func
PiotrKrzem Mar 5, 2025
c90519c
[FIX] Clang
PiotrKrzem Mar 5, 2025
80cb9a9
[FIX] Param list err
PiotrKrzem Mar 5, 2025
08c7a1f
[FIX] Params err pt2
PiotrKrzem Mar 5, 2025
64a0fd2
[FIX] Review comments exc internal namespace
PiotrKrzem Mar 5, 2025
8225f28
[FIX] Remove internal namespace
PiotrKrzem Mar 5, 2025
d6ee9ef
[FIX] Remove PagedAttn from internal opset
PiotrKrzem Mar 5, 2025
cf3e05a
[FIX] Remove from internal namespace pt2
PiotrKrzem Mar 5, 2025
933c596
[FIX} Cleanup of v16 and remaining artifacts
PiotrKrzem Mar 5, 2025
9d6c9a1
Merge branch 'master' into feature/paged_reference
mlukasze Mar 6, 2025
145fbd3
Merge branch 'master' into feature/paged_reference
PiotrKrzem Mar 6, 2025
4d1cf62
[FIX] Tests
PiotrKrzem Mar 6, 2025
b679e8f
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Mar 6, 2025
4087530
Update src/core/reference/include/openvino/reference/paged_attention.hpp
mmikolajcz Mar 6, 2025
4184529
Merge branch 'openvinotoolkit:master' into feature/paged_reference
PiotrKrzem Mar 9, 2025
97033ac
Update src/core/src/op/paged_attention.cpp
PiotrKrzem Mar 9, 2025
b6015fd
Update src/core/src/op/paged_attention.cpp
PiotrKrzem Mar 9, 2025
8709d3a
[FIX] Minor fixes to tests and ref
PiotrKrzem Mar 9, 2025
2818126
[FIX] Build testing suite
PiotrKrzem Mar 9, 2025
c70fe1d
[FIX] Multiplies initial value
PiotrKrzem Mar 9, 2025
a437d1d
Fix some issues with reference test cases. Some issues are still ther…
mmikolajcz Mar 10, 2025
0b5aeaf
Initial draft of functional shared single layer tests for PagedAttention
mmikolajcz Mar 20, 2025
b018c5a
Improve PagedAttention test structure and test case naming
mmikolajcz Mar 21, 2025
1624be2
Apply changes to reference impl
mmikolajcz Mar 21, 2025
0e649d4
Apply requested changes
mmikolajcz Mar 21, 2025
eb85af4
Add scale func tests
mmikolajcz Mar 21, 2025
4dbb868
Merge branch 'master' into feature/paged_reference
PiotrKrzem Mar 25, 2025
0de5177
[FIX] k,v heads, fixed alibi formula, cache copy, minor review fixes
PiotrKrzem Mar 25, 2025
fa43f64
[FIX] Struct for params, ultimate code purify
PiotrKrzem Mar 30, 2025
a627646
[FIX] int32 build errors
PiotrKrzem Mar 31, 2025
b132ee2
[FIX] size_t iont32_t mismatch fix
PiotrKrzem Mar 31, 2025
f64ee26
[FIX] Key int32_t error
PiotrKrzem Mar 31, 2025
f948bfd
[FIX] Tests compilation vec2str
PiotrKrzem Mar 31, 2025
2a360e3
Split k and v head size
mmikolajcz Apr 15, 2025
03dcf55
[ADD] Cache manager simulation, eviction, 2 new inputs, 2 new outputs…
PiotrKrzem May 5, 2025
fdf61ca
Merge branch 'master' into feature/paged_reference
PiotrKrzem May 5, 2025
d33e037
[FIX] Build bugfix
PiotrKrzem May 5, 2025
9375a33
[FIX] sliding_window unused
PiotrKrzem May 5, 2025
6484555
[FIX] 5th output build errors
PiotrKrzem May 5, 2025
d25e073
[FIX] Type prop tests with new inputs
PiotrKrzem May 5, 2025
fb44591
[FIX] Compatibility rank checks, minor code fixes, tests classes fixes
PiotrKrzem May 12, 2025
38fcda4
Merge branch 'master' into feature/paged_reference
PiotrKrzem May 12, 2025
b94f552
[FIX] Type prop tests after merge
PiotrKrzem May 12, 2025
16825a3
[FIX] Namespace error
PiotrKrzem May 12, 2025
9094308
[FIX] Namespace error
PiotrKrzem May 12, 2025
ae2c84c
[FIX] Clang
PiotrKrzem May 12, 2025
14f9c31
[REVERT] Revert 5 outputs, review comments
PiotrKrzem May 30, 2025
100321f
[ADD] Debug prints, new requested test cases
PiotrKrzem May 30, 2025
dc31bb9
[FIX] Commented tests for clarity:
PiotrKrzem Jun 18, 2025
fcfd3e3
Merge branch 'master' into feature/paged_reference
PiotrKrzem Jul 21, 2025
c8b8010
[WIP] Add cache manager on apr with genai
PiotrKrzem Jul 21, 2025
1cea130
Merge branch 'master' into feature/paged_reference
mlukasze Jul 23, 2025
f682490
[FIX] Compilation errors
PiotrKrzem Jul 30, 2025
5af811d
[FIX] Cache eviction with working block logic
PiotrKrzem Aug 1, 2025
49b0316
[ADD] Inserter of cache into models, replace all key_cache. and value…
PiotrKrzem Aug 5, 2025
ec0d3d6
[FIX] Rewire and improve compiled model and sync infer request for ca…
PiotrKrzem Aug 7, 2025
b1a04bb
[FIX] Clean code, minor logic fixes
PiotrKrzem Aug 8, 2025
657d401
[FIX] CompiledModel dependency
PiotrKrzem Aug 8, 2025
0998633
[FIX] Build errors
PiotrKrzem Aug 8, 2025
adb2d81
[FIX] Clang
PiotrKrzem Aug 10, 2025
ba6de21
[FIX] Remove relocation artifact
PiotrKrzem Aug 11, 2025
c26b31a
Merge branch 'master' into feature/paged_reference
PiotrKrzem Aug 11, 2025
5a4d8db
[FIX] set_out
PiotrKrzem Aug 11, 2025
7dbfc79
Merge branch 'master' into feature/paged_reference
PiotrKrzem Aug 11, 2025
36752fe
[FIX] Shape inference
PiotrKrzem Aug 13, 2025
826f550
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Aug 18, 2025
10a68e2
Merge branch 'master' into feature/paged_reference
PiotrKrzem Aug 18, 2025
5d30421
Merge branch 'master' into feature/paged_reference
PiotrKrzem Aug 26, 2025
6123418
[FIX] Name to iName to index insertion of cache
PiotrKrzem Aug 26, 2025
155373b
[ADD/FIX] Tests for CM, fix building errors, clang
PiotrKrzem Aug 27, 2025
e2a7678
[FIX] Gods of Cmake please let this work
PiotrKrzem Aug 28, 2025
52607db
[FIX] Clang
PiotrKrzem Aug 28, 2025
5fab303
[FIX] Android build error
PiotrKrzem Aug 28, 2025
4fe3c45
[FIX] Android build pt2
PiotrKrzem Aug 28, 2025
c02e65d
[FIX] Cmake pt 2
PiotrKrzem Aug 28, 2025
b8961af
Merge branch 'master' into feature/paged_reference
PiotrKrzem Aug 29, 2025
4875de5
[FIX] Android clang error pt3
PiotrKrzem Sep 1, 2025
e75886c
git pushMerge branch 'feature/paged_reference' of https://github.com/…
PiotrKrzem Sep 1, 2025
bdaad83
Merge branch 'master' into feature/paged_reference
PiotrKrzem Sep 2, 2025
5cfef8c
Merge branch 'master' into feature/paged_reference
PiotrKrzem Sep 3, 2025
a0d1257
Merge branch 'master' into feature/paged_reference
PiotrKrzem Sep 9, 2025
fda736b
Merge branch 'master' into feature/paged_reference
mlukasze Sep 16, 2025
1a5ec8d
[WIP][ADD] CacheManager version 2 with globally managed single memory…
PiotrKrzem Sep 26, 2025
b2135ef
Merge branch 'master' into feature/paged_reference
PiotrKrzem Sep 29, 2025
0fd4677
[FIX] Remove redundane cache
PiotrKrzem Sep 30, 2025
7552cce
[FIX] Move reference cache to core
PiotrKrzem Sep 30, 2025
f40f1f0
[FIX] PagedCache build errors pt1
PiotrKrzem Sep 30, 2025
0578a3d
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Sep 30, 2025
41e6a39
Merge branch 'master' into feature/paged_reference
PiotrKrzem Sep 30, 2025
c2db68a
Merge branch 'master' into feature/paged_reference
mlukasze Oct 3, 2025
086f358
[FIX] Use node as the key ID, fix memory access error, fix build erro…
PiotrKrzem Oct 4, 2025
cc14b54
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Oct 4, 2025
33fa48d
[FIX] Build fixes pt 3
PiotrKrzem Oct 4, 2025
e06a603
[FIX] Review comments, remove Ref tests for CPU tests, fix liner errors
PiotrKrzem Oct 7, 2025
3abb9f5
[ADD] Ref vs CPU test
PiotrKrzem Oct 7, 2025
7f0eff3
Merge branch 'master' into feature/paged_reference
PiotrKrzem Oct 8, 2025
23fc025
[FIX] Linker errors from circular dependencies, cache uninitialized e…
PiotrKrzem Oct 10, 2025
2ef1e13
git pushMerge branch 'feature/paged_reference' of https://github.com/…
PiotrKrzem Oct 10, 2025
4eea05a
[FIX] Compile node error
PiotrKrzem Oct 10, 2025
f66f366
[FIX] Build error with new ID
PiotrKrzem Oct 12, 2025
e141585
[FIX] Link PCM to OV_API
PiotrKrzem Oct 12, 2025
894881c
[FIX] Shape infer with new inputs
PiotrKrzem Oct 13, 2025
b9c0938
[FIX] Arguemnts list error
PiotrKrzem Oct 13, 2025
3d87b42
[ADD] Debug message for C++ shape infer
PiotrKrzem Oct 13, 2025
d032535
[FIX] Macos whitespace
PiotrKrzem Oct 14, 2025
50374ba
[ADD] Debug flags for PA inputs
PiotrKrzem Oct 14, 2025
af5d423
[FIX] Debug prints
PiotrKrzem Oct 14, 2025
5a6aa18
[FIX] More debug prints
PiotrKrzem Oct 14, 2025
08a5e27
[FIX] Even more debug prints
PiotrKrzem Oct 14, 2025
bd00bbe
Merge branch 'master' into feature/paged_reference
PiotrKrzem Oct 14, 2025
e27262c
[FIX] 21 inputs shape infer error
PiotrKrzem Oct 14, 2025
dc87a3d
[FIX] Tensor accessor
PiotrKrzem Oct 14, 2025
c6c4183
[FIX] Remove check for static shape for past lens
PiotrKrzem Oct 14, 2025
287b160
[FIX] Allow 2-5 rank cache, limit to 4 rank for ref
PiotrKrzem Oct 14, 2025
4f4d435
Merge branch 'master' into feature/paged_reference
PiotrKrzem Oct 20, 2025
94e1653
Update attach_cache_manager_to_paged_attention.cpp
PiotrKrzem Oct 28, 2025
8f7faf4
Update attach_cache_manager_to_paged_attention.hpp
PiotrKrzem Oct 28, 2025
5bdd66c
Update paged_attention.hpp
PiotrKrzem Oct 28, 2025
7b3fd7d
Update paged_attention.hpp
PiotrKrzem Oct 28, 2025
4b4e302
Update paged_cache_manager.cpp
PiotrKrzem Oct 28, 2025
537ea9b
Update paged_attention.hpp
PiotrKrzem Oct 28, 2025
41d69c2
Merge branch 'master' into feature/paged_reference
PiotrKrzem Oct 28, 2025
e4a143c
Merge branch 'master' into feature/paged_reference
PiotrKrzem Oct 28, 2025
1e380f8
Merge branch 'master' into feature/paged_reference
PiotrKrzem Oct 30, 2025
fb4b9f7
[FIX] Force undo changes, fix without namespace
PiotrKrzem Nov 3, 2025
c884ca2
[FIX] Clang
PiotrKrzem Nov 3, 2025
144d549
Update simplify_shape_of_sub_graph.hpp
PiotrKrzem Nov 3, 2025
89df626
Merge branch 'master' into feature/paged_reference
PiotrKrzem Nov 3, 2025
12d874c
[DEBUG] Temporary revert of changes to check conditional compilation CI
PiotrKrzem Nov 4, 2025
3258736
Merge branch 'master' into feature/paged_reference
PiotrKrzem Nov 4, 2025
c51c607
[FIX] Double down by style aligning to other common opt
PiotrKrzem Nov 5, 2025
beb5c11
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Nov 5, 2025
6675ee3
[FIX] Namespace change fix for CM
PiotrKrzem Nov 5, 2025
5aafcbe
[FIX] Style
PiotrKrzem Nov 5, 2025
03d2cfe
[FIX] Ref uses util PCM
PiotrKrzem Nov 5, 2025
1a95889
Try fix CC build 1
praasz Nov 20, 2025
6f92adb
Fix CC build 2
praasz Nov 20, 2025
8e6cbaa
Try fix CC build 3
praasz Nov 20, 2025
0325f02
[WIP][FIX] Review suggestions pt1
PiotrKrzem Nov 24, 2025
a905c13
[WIP][FIX] Review suggestions pt2
PiotrKrzem Nov 24, 2025
3a17578
[WIP][FIX] Review suggestions pt3
PiotrKrzem Nov 24, 2025
762f8e8
[WIP][FIX] Clang
PiotrKrzem Nov 24, 2025
c4cf062
[WIP][FIX] Convert fix, ov alignedbuffer introduction
PiotrKrzem Dec 1, 2025
04dea11
[WIP][FIX] Clang
PiotrKrzem Dec 1, 2025
f382256
Merge branch 'master' into feature/paged_reference
PiotrKrzem Dec 2, 2025
5a785a0
Merge branch 'master' into feature/paged_reference
PiotrKrzem Dec 3, 2025
f8c21be
Merge branch 'master' into feature/paged_reference
mlukasze Dec 3, 2025
f69e0e2
Merge branch 'master' into feature/paged_reference
PiotrKrzem Dec 3, 2025
e1f1d46
Update src/tests/functional/base_func_tests/src/base/utils/generate_i…
PiotrKrzem Dec 4, 2025
275824d
[FIX][WIP] Resolve remaining majority of issues, blocked by relocatio…
PiotrKrzem Dec 4, 2025
f40b062
[FIX] Style
PiotrKrzem Dec 4, 2025
df9940d
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Dec 4, 2025
198b0fb
[FIX][WIP] Opaque ptr, conversions, minor fixes
PiotrKrzem Dec 9, 2025
2962170
[FIX] Clang
PiotrKrzem Dec 9, 2025
5cd5b22
Merge branch 'master' into feature/paged_reference
PiotrKrzem Dec 9, 2025
b64dda2
Merge branch 'master' into feature/paged_reference
PiotrKrzem Dec 18, 2025
b978c8c
[WIP][FIX] Build
PiotrKrzem Dec 18, 2025
6face9e
[FIX] Clang
PiotrKrzem Dec 18, 2025
f833c4b
Merge branch 'master' into feature/paged_reference
PiotrKrzem Dec 18, 2025
652f72c
Merge branch 'master' into feature/paged_reference
PiotrKrzem Jan 15, 2026
9fc5ec3
[FIX] CPU Ref tests XAttn
PiotrKrzem Jan 15, 2026
97cb81b
[FIX] Shape inference of 3rd shape, CPU tests
PiotrKrzem Jan 20, 2026
b5d173e
Merge branch 'master' into feature/paged_reference
PiotrKrzem Jan 28, 2026
f6f20ae
[FIX] C4273 build error
PiotrKrzem Jan 28, 2026
942696c
[FIX] Clang
PiotrKrzem Jan 28, 2026
d9fa322
[FIX] Clang pt2
PiotrKrzem Jan 28, 2026
0732b5e
[FIX] Clang3
PiotrKrzem Jan 28, 2026
e15e1f2
[FIX][WIP] CPU reference tests
PiotrKrzem Jan 30, 2026
152fb68
[FIX][WIP] Fix for cache management pt2 for CPU func tests
PiotrKrzem Feb 3, 2026
9e2576c
[FIX][WIP] Disable SDPA transformation for Ref comparison
PiotrKrzem Feb 3, 2026
89f8ec1
[FIX][WIP] Fix shape building and simplify code for PA CPU tests
PiotrKrzem Feb 3, 2026
ac05f85
[FIX] Shape inference critical error for dynamic evictable sizes
PiotrKrzem Feb 4, 2026
6481a3a
Merge branch 'master' into feature/paged_reference
PiotrKrzem Feb 4, 2026
1320ec6
[FIX] Rewrite tests for clear comparison
PiotrKrzem Feb 4, 2026
c680446
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Feb 4, 2026
c89e8ea
[FIX] Revert old test class to master
PiotrKrzem Feb 4, 2026
956751c
[FIX] C4273 fix for Windows build pt2
PiotrKrzem Feb 4, 2026
0a21789
[FIX] Provide void handle definiton inclass
PiotrKrzem Feb 4, 2026
f24bdda
[FIX] Clang, test fixture includes
PiotrKrzem Feb 4, 2026
85ada87
[FIX] Neutralize quantization transformation for KV cache
PiotrKrzem Feb 4, 2026
5385ca9
[FIX] Test only statis data types:
PiotrKrzem Feb 4, 2026
2e94de1
Merge branch 'openvinotoolkit:master' into feature/paged_reference
PiotrKrzem Feb 4, 2026
341bf78
[FIX] Dimension inference for CPU fixture and cache manager
PiotrKrzem Feb 12, 2026
e409ff5
[WIP][WORKAROUND] Set same dims for all dims to avoid layout mismatches
PiotrKrzem Feb 12, 2026
08f935e
Merge branch 'master' into feature/paged_reference
PiotrKrzem Feb 12, 2026
cbe78a6
[WIP][WORKAROUND] Add second test from CPU with same dims
PiotrKrzem Feb 12, 2026
2c8d002
Merge branch 'master' into feature/paged_reference
PiotrKrzem Feb 20, 2026
94984e7
[FIX] Shape CPU bugs
PiotrKrzem Feb 20, 2026
e017d15
[TMP]
PiotrKrzem Feb 24, 2026
1269348
[FIX] Disabled inputs have 0-dim
PiotrKrzem Feb 24, 2026
7494e3f
[FIX] CPU failure with isSupportedOp
PiotrKrzem Mar 3, 2026
970cdde
[FIX] Disable tests on non x64 devices
PiotrKrzem Mar 3, 2026
6d5739e
[FIX] Copyrights
PiotrKrzem Mar 3, 2026
7a782d2
Merge branch 'master' into feature/paged_reference
PiotrKrzem Mar 4, 2026
62e2bc4
[FIX] Incorrect file extension in AARCH64 linker
PiotrKrzem Mar 4, 2026
6dcd9b9
Update src/tests/functional/plugin/shared/include/shared_test_classes…
PiotrKrzem Mar 4, 2026
f2e7127
[FIX] Data type, potentially dangerous change
PiotrKrzem Mar 4, 2026
9e42e23
Merge branch 'feature/paged_reference' of https://github.com/PiotrKrz…
PiotrKrzem Mar 4, 2026
71ac7af
[FIX] Very messy fix for reference shape infer, test of CIs
PiotrKrzem Mar 4, 2026
f78c938
[ADD] Advanced test cases, reorganize messy solution from before
PiotrKrzem Mar 4, 2026
bef3953
[FIX] 21-input working version
PiotrKrzem Mar 4, 2026
5b2f530
[FIX] Enable adaptive RKV using reference RKV op
PiotrKrzem Mar 4, 2026
6bbbdf3
[FIX] Cleanup of prints, comments, consistency changes, minor style a…
PiotrKrzem Mar 4, 2026
8e7704d
[FIX] Minor KV adapt error
PiotrKrzem Mar 4, 2026
3bf2155
[FIX] Remove debug prints
PiotrKrzem Mar 4, 2026
a73ba1c
[FIX] Clang
PiotrKrzem Mar 4, 2026
9030ac0
Merge branch 'master' into feature/paged_reference
mlukasze Mar 5, 2026
076474a
[FIX] Unused variable left from cleanup
PiotrKrzem Mar 5, 2026
52f18e5
Update src/plugins/template/backend/ops/paged_attention.cpp
PiotrKrzem Mar 5, 2026
e14bb5d
[FIX] Copilot review suggestions
PiotrKrzem Mar 5, 2026
d302718
[FIX] Copilot review, score aggreg window mini fix, template registra…
PiotrKrzem Mar 5, 2026
d506010
[ADD] Tests for the edge case of 0 aggregation
PiotrKrzem Mar 5, 2026
7191a26
[FIX] Re-add CacheManager Score & adaptive RKV policies
PiotrKrzem Mar 5, 2026
aa81ffa
[FIX] Minor adjustments and notes for future reference
PiotrKrzem Mar 5, 2026
23a313c
[FIX] Build for CM tests
PiotrKrzem Mar 5, 2026
b88e744
[FIX] Unused variable
PiotrKrzem Mar 5, 2026
2b9991f
[FIX] ARKV tests, comments cleanup from past outdated methodologies
PiotrKrzem Mar 6, 2026
eb84fb0
[FIX] Relocate the pooling and adaptive rkv parameters, add tests wit…
PiotrKrzem Mar 6, 2026
783fd33
[FIX] Clang
PiotrKrzem Mar 6, 2026
9a13bee
[FIX] Separate CPU and Template plugin, PR review fixes
PiotrKrzem Mar 23, 2026
d5fcbd5
[FIX] Build error nr 2
PiotrKrzem Mar 23, 2026
5238c55
[FIX] Sync with 26 input version
PiotrKrzem Mar 30, 2026
902ad88
Merge branch 'master' into feature/paged_reference
PiotrKrzem Mar 30, 2026
5ec9f67
Merge branch 'master' into feature/paged_reference
PiotrKrzem Mar 30, 2026
42 changes: 41 additions & 1 deletion src/core/dev_api/openvino/op/paged_attention.hpp
@@ -1,6 +1,7 @@
// Copyright (C) 2018-2026 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include "openvino/op/op.hpp"
@@ -10,16 +11,55 @@ namespace op {

// This is an experimental operation that is implemented in the plugins.
// Do not use in user applications, backward compatibility is not guaranteed in future releases.

/// \brief PagedAttentionExtension operation implements paged attention for memory-efficient sequence processing
///
/// \ingroup ov_ops_cpp_api
///
/// This operation computes attention using a paged memory model, allowing efficient handling of long sequences
class OPENVINO_API PagedAttentionExtension : public ov::op::Op {
public:
OPENVINO_OP("PagedAttentionExtension");

PagedAttentionExtension() = default;

PagedAttentionExtension(const ov::OutputVector& args);
/// \brief Constructs a PagedAttentionExtension operation
///
/// \param args Input arguments vector containing:
/// (B_token = total tokens in the call, B_seq = number of sequences,
/// H = query heads, Hk = key/value heads, S = head size)
///
///  0  query                                            [B_token, H * S]           required
///  1  key                                              [B_token, Hk * S]          required
///  2  value                                            [B_token, Hk * S]          required
///  3  key_cache                                        [num_blocks, Hk, Bs, S]    required
///  4  value_cache                                      [num_blocks, Hk, Bs, S]    required
///  5  past_lens                                        [B_seq], i32               required
///  6  subsequence_begins                               [B_seq + 1], i32           required
///  7  block_indices                                    [total_blocks], i32        required
///  8  block_indices_begins                             [B_seq + 1], i32           required
///  9  scale                                            [] scalar                  required
/// 10  sliding_window                                   [] scalar, i32             required (0 = unlimited)
/// 11  alibi_slopes                                     [H] or empty               required (empty = disabled)
/// 12  max_context_len                                  [] scalar, i32             required (0 = unlimited)
/// 13  score_aggregation_window                         [] scalar or [B_seq], i32  required (0 = disabled)
/// 14  rotated_block_indices                            [Nrot] or empty, i32       required (empty = disabled)
/// 15  rotation_deltas                                  [Nrot] or [Nrot, Bs], i32  required (empty = disabled)
/// 16  rotation_trig_lut                                [C, S] or [C*S]            required (empty = disabled)
/// 17  xattention_threshold                             [] or [B_seq]              required (empty = disabled)
/// 18  xattention_block_size                            [] scalar, i32             required (0 = disabled)
/// 19  xattention_stride                                [] scalar, i32             required
/// 20  sinks                                            [1, H, 1, 1] or empty      required (empty = disabled)
/// 21  adaptive_rkv_start_size                          [] scalar, i32             required (0 = no protection zone)
/// 22  adaptive_rkv_evictable_sizes                     [B_seq], i32               optional
/// 23  adaptive_rkv_diversity_block_set_indices         [num_adaptive_rkv_blocks]  optional
/// 24  adaptive_rkv_diversity_block_set_indices_begins  [B_seq + 1], i32           optional
explicit PagedAttentionExtension(const ov::OutputVector& args);

void validate_and_infer_types() override;
std::shared_ptr<ov::Node> clone_with_new_inputs(const ov::OutputVector& new_args) const override;

const ov::element::Type get_out_type(int index) const;
Contributor

Why is this required?
Shouldn't getters/setters be reserved for attributes?
Is this not the same as node::get_output_element_type()?

Contributor

That's a valid question. Why was it resolved without a response?

void set_out_type(int index, const ov::element::Type& output_type);

protected:
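
The index inputs 5, 6 and 8 in the constructor comment are documented only by shape. The following is a minimal sketch of how they could be read together. It assumes that `subsequence_begins` and `block_indices_begins` are prefix offsets (starting at 0 and ending at `B_token` and `total_blocks` respectively), which the header does not state explicitly. `SequenceLayout`, `is_prefix_offsets` and `describe_sequences` are hypothetical helper names and are not part of OpenVINO.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-sequence summary derived from inputs 5, 6 and 8.
struct SequenceLayout {
    std::int32_t new_tokens;   // tokens of this sequence in the current call
    std::int32_t past_tokens;  // tokens already held in the KV cache (past_lens[i])
    std::int32_t num_blocks;   // cache blocks listed for this sequence
};

// Assumption: an offsets input has count + 1 entries, starts at 0,
// never decreases, and ends at `total`.
inline bool is_prefix_offsets(const std::vector<std::int32_t>& offsets,
                              std::size_t count,
                              std::size_t total) {
    if (offsets.size() != count + 1 || offsets.front() != 0) {
        return false;
    }
    for (std::size_t i = 0; i < count; ++i) {
        if (offsets[i] > offsets[i + 1]) {
            return false;
        }
    }
    return static_cast<std::size_t>(offsets.back()) == total;
}

// Fills `out` with one entry per sequence and returns true when the inputs are
// consistent with the documented shapes; returns false and leaves `out`
// unchanged otherwise.
inline bool describe_sequences(const std::vector<std::int32_t>& past_lens,
                               const std::vector<std::int32_t>& subsequence_begins,
                               const std::vector<std::int32_t>& block_indices_begins,
                               std::size_t b_token,
                               std::size_t total_blocks,
                               std::vector<SequenceLayout>& out) {
    const std::size_t b_seq = past_lens.size();
    if (!is_prefix_offsets(subsequence_begins, b_seq, b_token) ||
        !is_prefix_offsets(block_indices_begins, b_seq, total_blocks)) {
        return false;
    }
    out.clear();
    for (std::size_t i = 0; i < b_seq; ++i) {
        SequenceLayout layout;
        layout.new_tokens = subsequence_begins[i + 1] - subsequence_begins[i];
        layout.past_tokens = past_lens[i];
        layout.num_blocks = block_indices_begins[i + 1] - block_indices_begins[i];
        out.push_back(layout);
    }
    return true;
}
```

For example, with `past_lens = {4, 0}`, `subsequence_begins = {0, 1, 4}` and `block_indices_begins = {0, 2, 3}`, the first sequence contributes one new token on top of four cached ones and lists two blocks, while the second contributes three new tokens with nothing cached and lists one block.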