[CPU][ARM] Int8 conv swish fq fusion by Passavee-Losripat · Pull Request #34931 · openvinotoolkit/openvino

Passavee-Losripat · 2026-03-25T16:53:24Z

- Re-enables ConvolutionTransformation on ARM by removing CPU_DISABLE_PASS_ARM
  in transformation_pipeline.cpp.
- Extends the ConvMulAddFQBlock pattern matcher to optionally match an activation node
  between Add and FakeQuantize, so that it recognizes
  Conv -> Mul -> Add -> Activation -> FQ in addition to the existing
  Conv -> Mul -> Add -> FQ. Currently supports Swish and Relu via
  wrap_type<Swish, Relu>.
  - Designed so that other activations can be added later
    without structural changes.
- Improves ACLConvolutionExecutor to accept Activation + FakeQuantize as simultaneous
  post-ops, replacing the hard single-post-op limit with an iteration loop.
- Updates ConvertConvolutionBias and FallbackUnsupportedLPConvToFP16 to retrieve
  and handle the optional activation anchor without breaking existing patterns.
- Extends canFuse() in conv.cpp to allow FakeQuantize fusion after a single Eltwise
  activation is already fused, so that the full post-op chain reaches the ACL executor.
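
For reviewers, the intended matching and post-op rules can be sketched roughly as follows. This is a simplified, standalone illustration (C++17) and not the code in this PR: it does not use the OpenVINO pattern API or ACL, and the `Op` enum, `matchConvMulAddFQ`, and `postOpsSupported` are hypothetical names introduced only for this example. The exact set of post-op combinations the executor accepts is an assumption here.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for graph node types; these are not OpenVINO classes.
enum class Op { Conv, Mul, Add, Swish, Relu, Sigmoid, FakeQuantize };

// Result of a successful match: the activation found between Add and FQ, if any.
struct ChainMatch {
    std::optional<Op> activation;
};

// Activations the sketch treats as fusable (mirrors wrap_type<Swish, Relu>).
inline bool isSupportedActivation(Op op) {
    return op == Op::Swish || op == Op::Relu;
}

// Matches Conv -> Mul -> Add -> [Swish | Relu] -> FakeQuantize on a linear chain.
// The activation step is optional, so the older Conv -> Mul -> Add -> FQ still matches.
inline std::optional<ChainMatch> matchConvMulAddFQ(const std::vector<Op>& chain) {
    const std::vector<Op> prefix = {Op::Conv, Op::Mul, Op::Add};
    if (chain.size() < prefix.size() + 1) {
        return std::nullopt;
    }
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (chain[i] != prefix[i]) {
            return std::nullopt;
        }
    }
    std::size_t pos = prefix.size();
    ChainMatch match;
    if (isSupportedActivation(chain[pos])) {
        match.activation = chain[pos];  // remember the optional anchor
        ++pos;
    }
    // FakeQuantize must be the last node, directly after Add or the activation.
    if (pos + 1 != chain.size() || chain[pos] != Op::FakeQuantize) {
        return std::nullopt;
    }
    return match;
}

// Assumed executor-side rule: iterate over the post-ops instead of requiring
// exactly one. Accepts at most one supported activation, optionally followed
// by one FakeQuantize; nothing may follow FakeQuantize.
inline bool postOpsSupported(const std::vector<Op>& postOps) {
    bool seenActivation = false;
    bool seenFakeQuantize = false;
    for (Op op : postOps) {
        if (seenFakeQuantize) {
            return false;
        }
        if (op == Op::FakeQuantize) {
            seenFakeQuantize = true;
        } else if (isSupportedActivation(op) && !seenActivation) {
            seenActivation = true;
        } else {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, the chain with Swish and the chain without any activation both match, while an unsupported activation such as Sigmoid or a reversed FakeQuantize -> Activation post-op order is rejected. Keeping the activation as an optional field is what lets later passes retrieve the anchor only when it is present.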

This PR is related to GSoC 2026 Project 5, "Optimize Quantized Model Inference Performance on ARM Devices with OpenVINO".

AI assistance used: yes
Claude was used to write an external analysis script and to explain the codebase. All code in this PR was written manually and validated by checking YOLO26 detection accuracy on an Apple M4 Max (ARM64). Analysis artifacts are not included in this PR.

Passavee-Losripat added 4 commits March 26, 2026 01:54

enable int8 convolution transformation

a90baff

support combination of activation + fakequantize

6324f53

support optional activation type

a294b2b

allow fakequantized after fusion

c3d0c62

Passavee-Losripat force-pushed the int8-conv-swish-fq-fusion branch from 25f294b to c3d0c62 on March 25, 2026 16:54
