[LLVMCPU] Implement conv heuristics in getVectorPreProcStrategy#24004
josephbak wants to merge 1 commit into iree-org:main
Signed-off-by: Joseph Bak <joseph.bak31415@gmail.com>
hanhanW
left a comment
Thanks for the patch. Did you do any benchmarking for the change? What is the outcome of this PR?
    assert(!getLoweringConfig(convOp) && "expected lowering_config is not set");

    // Masking is not yet wired for convs (no pipelineConfig branch exists).
    // TODO: wire Masking support for convs.
I don't follow this. Masking support should be available. It is not flipped because we haven't looked at conv perf on CPU for a while. What do you mean by this comment?
On the other hand, we should not silently fall back here. It is better to have the logic in getVectorPreProcStrategy, so we won't spread logic everywhere.
Thanks for the review!
Benchmarks: Ran iree-benchmark-module on M-series AArch64 NEON (local-task) — inconclusive so far on tested shapes. Hunting odd-dim convs (KH=3) next; will add numbers soon.
Masking: Got it — moving conv strategy logic to getVectorPreProcStrategy, removing setConvRootConfig fallback. Update coming shortly.
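To make the odd-dimension case concrete, here is a minimal C++ sketch of how an exact-divisor requirement can shrink a tile size. It is not IREE's tiling code: `pickTileSize` and its parameters are hypothetical names, and the "largest divisor not exceeding the cap" rule is an assumption chosen only to illustrate the effect of allowing or disallowing incomplete tiles.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not an IREE API): choose a tile size for one loop
// dimension, given the dimension size and a cap such as the vector width.
int64_t pickTileSize(int64_t dimSize, int64_t maxTileSize,
                     bool allowIncompleteTile) {
  int64_t upper = std::min(dimSize, maxTileSize);
  if (allowIncompleteTile) {
    // The last tile may be partial, so the cap can be used directly.
    return upper;
  }
  // Otherwise only exact divisors are acceptable: walk down from the cap
  // until one divides the dimension evenly.
  for (int64_t tile = upper; tile > 1; --tile) {
    if (dimSize % tile == 0) {
      return tile;
    }
  }
  return 1; // No divisor other than 1 fits under the cap.
}
```

As an illustration, an input height of 9 with KH=3 and stride 1 gives an output height of 7. Under this sketch, `pickTileSize(7, 4, false)` degrades to 1, while `pickTileSize(7, 4, true)` keeps 4 and leaves a remainder of 3 for a peeled or masked tail.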
Summary
Named conv ops previously returned VectorPreProcStrategy::None unconditionally in getVectorPreProcStrategy, bypassing all target-aware heuristics. This caused two problems:
- allowIncompleteTile stayed false, forcing the tiler to find exact divisors and producing suboptimal tile sizes for odd kernel shapes (e.g. KH=3, KW=3).
Remove the early exit so named convs fall through to the existing target-aware heuristics. In setConvRootConfig, clamp Masking to None (not yet wired for convs) and connect vecPreProcStrategy to distConfig.allowIncompleteTile, mirroring the matmul path in setContractionRootConfig.
Testing
Tested locally on AArch64 NEON. x86 (no AVX-512) and RISC-V are expected to follow the same Peeling path but rely on CI for validation. SVE (Masking) is explicitly clamped to None pending future wiring.
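The target-to-strategy behavior described in this revision of the PR can be sketched roughly as follows. This is a simplified model, not the actual IREE code: the `Target` enum and the three helper functions are hypothetical, the enum lists only the strategies named in this PR, and the rule that incomplete tiles are allowed whenever the strategy is not None is an assumption about what "connecting" the strategy to `distConfig.allowIncompleteTile` means.

```cpp
// Strategies named in this PR; the real enum may differ.
enum class VectorPreProcStrategy { Peeling, Masking, None };

// Hypothetical stand-in for the target features discussed above.
enum class Target { AArch64Neon, AArch64Sve, X86NoAvx512, RiscV };

// Assumed outcome of the target-aware heuristics for each target.
VectorPreProcStrategy selectTargetStrategy(Target target) {
  switch (target) {
  case Target::AArch64Sve:
    return VectorPreProcStrategy::Masking;
  case Target::AArch64Neon:
  case Target::X86NoAvx512:
  case Target::RiscV:
    return VectorPreProcStrategy::Peeling;
  }
  return VectorPreProcStrategy::None;
}

// The conv-specific clamp described in the Summary: Masking becomes None.
VectorPreProcStrategy clampForConv(VectorPreProcStrategy strategy) {
  return strategy == VectorPreProcStrategy::Masking
             ? VectorPreProcStrategy::None
             : strategy;
}

// Assumption: incomplete tiles are allowed whenever some pre-processing
// strategy is active.
bool allowIncompleteTile(VectorPreProcStrategy strategy) {
  return strategy != VectorPreProcStrategy::None;
}
```

In this model, NEON, x86 without AVX-512, and RISC-V all end up with Peeling and incomplete tiles enabled, while SVE is clamped to None and keeps the exact-divisor behavior. That clamp is the silent fallback the review asks to move into getVectorPreProcStrategy.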
Notes
select_aarch64_lowering_strategy.mlir due to pipeline format mismatch (CPUDoubleTilingExpert vs #iree_cpu.pipeline<DoubleTilingExpert>) unrelated to this fix.
setConvRootConfig.
Assisted-by: Claude (Anthropic)