gemm x86 support out_elemtype, multiheadattention and sdpa x86 support bf16 storage, skip mha bf16 tests by nihui · Pull Request #6623 · Tencent/ncnn

nihui · 2026-03-30T08:33:44Z

No description provided.

codecov-commenter · 2026-03-30T08:37:04Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.10%. Comparing base (18a7ad1) to head (c87a5e1).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6623      +/-   ##
==========================================
+ Coverage   93.45%   94.10%   +0.65%     
==========================================
  Files         874      667     -207     
  Lines      280098   238244   -41854     
==========================================
- Hits       261758   224199   -37559     
+ Misses      18340    14045    -4295
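The percentages in the diff above follow from the hit and line counts. A minimal sketch of that arithmetic, assuming coverage is simply hits divided by lines and rounded to two decimals (the rounding rule and the helper name `coverage_percent` are assumptions, not taken from Codecov):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: line coverage in percent, rounded to two decimals.
// Assumes every tracked line is either a hit or a miss (no partial lines).
double coverage_percent(long hits, long lines)
{
    assert(lines > 0);
    return std::round(10000.0 * static_cast<double>(hits) / static_cast<double>(lines)) / 100.0;
}
```

With the figures from the table, `coverage_percent(261758, 280098)` reproduces the base value and `coverage_percent(224199, 238244)` the head value.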


tencent-adm · 2026-03-30T08:43:03Z

Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copilot

Pull request overview

This PR extends x86 compute paths to better support bf16 storage and Gemm output element type selection, and updates the test suite accordingly (including temporarily skipping MultiHeadAttention bf16 variants).

Changes:

Add output_elemtype handling to the x86 bf16 Gemm implementation so bf16 inputs can produce fp32 outputs.
Enable bf16 storage support flags for x86 MultiHeadAttention and SDPA, adjusting internal execution to accommodate bf16 storage.
Add a new Gemm test (test_gemm_5.cpp) and update test utilities to skip MultiHeadAttention bf16 testing.
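To make the first item concrete, here is a minimal sketch of a GEMM that reads bf16 inputs, accumulates in fp32, and writes either bf16 or fp32 depending on a requested output type. It is not the code from this PR: `float_to_bf16`, `bf16_to_float`, `OutType`, `GemmOutput`, and `gemm_bf16` are hypothetical names, the layout is assumed to be plain row-major, and the bf16 conversion truncates for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helpers: bf16 is treated as the upper 16 bits of an IEEE-754 float.
// Truncation is used for simplicity; a real kernel may round instead.
static inline uint16_t float_to_bf16(float v)
{
    uint32_t bits;
    std::memcpy(&bits, &v, sizeof(bits));
    return static_cast<uint16_t>(bits >> 16);
}

static inline float bf16_to_float(uint16_t v)
{
    uint32_t bits = static_cast<uint32_t>(v) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Hypothetical selector standing in for an output element type option.
enum class OutType { SameAsInput, Fp32 };

struct GemmOutput
{
    std::vector<uint16_t> bf16; // filled when the output type is SameAsInput
    std::vector<float> fp32;    // filled when the output type is Fp32
};

// C = A (M x K) * B (K x N), row-major, bf16 inputs, fp32 accumulation.
GemmOutput gemm_bf16(const std::vector<uint16_t>& A, const std::vector<uint16_t>& B,
                     int M, int N, int K, OutType out)
{
    assert(A.size() == static_cast<size_t>(M) * K);
    assert(B.size() == static_cast<size_t>(K) * N);

    GemmOutput C;
    const size_t count = static_cast<size_t>(M) * N;
    if (out == OutType::Fp32)
        C.fp32.resize(count);
    else
        C.bf16.resize(count);

    for (int i = 0; i < M; i++)
    {
        for (int j = 0; j < N; j++)
        {
            float sum = 0.f;
            for (int k = 0; k < K; k++)
                sum += bf16_to_float(A[static_cast<size_t>(i) * K + k]) * bf16_to_float(B[static_cast<size_t>(k) * N + j]);

            const size_t idx = static_cast<size_t>(i) * N + j;
            if (out == OutType::Fp32)
                C.fp32[idx] = sum; // keep the full fp32 accumulator
            else
                C.bf16[idx] = float_to_bf16(sum); // narrow back to bf16 storage
        }
    }
    return C;
}
```

The only difference between the two modes is the final store, so requesting fp32 output avoids the narrowing step that would otherwise discard the low 16 bits of each accumulated value.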

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/testutil.cpp | Skips MultiHeadAttention bf16 tests; adds missing `delete op` on Vulkan skip paths (but early-return cleanup still incomplete). |
| tests/test_gemm_5.cpp | New Gemm test covering `output_elemtype=fp32` across shapes/transposes. |
| src/layer/x86/sdpa_x86.cpp | Enables bf16 storage and updates intermediate/output allocations and memcpy sizes to respect bf16 elemsize. |
| src/layer/x86/multiheadattention_x86.cpp | Enables bf16 storage; forces certain sublayers to fp32 and adds a bf16→fp32 cast for V before qkv gemm. |
| src/layer/x86/gemm_x86.cpp | Threads `output_elemtype` through bf16 Gemm path and allocates/stores fp32 when requested. |
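The sdpa_x86.cpp entry mentions allocations and memcpy sizes that respect the element size. A minimal sketch of that idea follows, assuming a toy tensor type that stores raw bytes alongside a per-element byte size. `Blob` and `copy_row` are hypothetical and are not ncnn types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical minimal tensor: raw bytes plus a per-element byte size
// (2 for bf16 storage, 4 for fp32).
struct Blob
{
    int w;
    int h;
    size_t elemsize;
    std::vector<unsigned char> data;

    Blob(int w_, int h_, size_t elemsize_)
        : w(w_), h(h_), elemsize(elemsize_), data(static_cast<size_t>(w_) * h_ * elemsize_)
    {
    }

    // Bytes per row depend on elemsize, not on sizeof(float).
    size_t row_bytes() const { return static_cast<size_t>(w) * elemsize; }

    unsigned char* row(int y) { return data.data() + static_cast<size_t>(y) * row_bytes(); }
    const unsigned char* row(int y) const { return data.data() + static_cast<size_t>(y) * row_bytes(); }
};

// Copy one row between blobs of the same width and element size.
// Hard-coding w * sizeof(float) here would overrun a bf16 buffer.
void copy_row(const Blob& src, int sy, Blob& dst, int dy)
{
    assert(src.w == dst.w && src.elemsize == dst.elemsize);
    assert(sy >= 0 && sy < src.h && dy >= 0 && dy < dst.h);
    std::memcpy(dst.row(dy), src.row(sy), src.row_bytes());
}
```

Deriving both the allocation size and the copy length from `elemsize` keeps the same code path valid for 2-byte and 4-byte storage.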


tests/testutil.cpp

src/layer/x86/multiheadattention_x86.cpp

tests/testutil.cpp

Copilot

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 10 comments.


tests/test_gemm_5.cpp

src/layer/x86/sdpa_x86.cpp

tests/testutil.cpp

nihui and others added 10 commits March 30, 2026 02:20

wip

14562a7

wip

c8c5ccc

wip

a25bf86

wip

290c5b7

wip

44499b0

apply code-format changes

e18e437

test gemm out_elemtype

219cde4

test gemm out_elemtype

22cdbde

skip

9d907e8

fix leak

56ffccb

github-actions bot added test x86 labels Mar 30, 2026

f

133af57

opt

665848e

nihui requested a review from Copilot March 30, 2026 08:48

Copilot started reviewing on behalf of nihui March 30, 2026 08:49

Copilot AI reviewed Mar 30, 2026

tests/testutil.cpp (resolved)

src/layer/x86/multiheadattention_x86.cpp (outdated, resolved)

tests/testutil.cpp (resolved)

tests/testutil.cpp (resolved)

tests/testutil.cpp (resolved)

nihui added 3 commits March 30, 2026 09:03

w

44f532b

f

776f36c

f

ae8052a

nihui requested a review from Copilot March 30, 2026 09:17

Copilot started reviewing on behalf of nihui March 30, 2026 09:17

Copilot AI reviewed Mar 30, 2026


nihui added 3 commits March 30, 2026 11:42

cc

444341c

cc

7eeeddb

cc

c87a5e1

nihui wants to merge 18 commits into Tencent:master from nihui:sdpa-x86-bf16s

4 participants
