[GPU] Add squeeze/unsqueeze to the compressed conv1x1 transformation #34957
mdvoretc-intel wants to merge 4 commits into openvinotoolkit:master
Conversation
Pull request overview
Adds an activation squeeze/unsqueeze pair around the MatMul emitted by ConvertWeightCompressedConv1x1ToMatmul when the activation's leading dimension is statically 1, reducing the activation rank to 3D so that compile-time weight reordering can be applied on non-systolic devices.
Changes:
- Add an optional activation squeeze before the MatMul and a matching unsqueeze after the MatMul in the conversion pass.
- Extend the existing transformation unit test to cover batched vs. non-batched inputs and update the reference graph accordingly.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/common/transformations/src/transformations/op_conversions/convert_weight_compressed_conv1x1_to_matmul.cpp | Implements conditional squeeze/unsqueeze reshapes around the produced MatMul. |
| src/common/transformations/tests/op_conversions/convert_weight_compressed_conv1x1_to_matmul_test.cpp | Expands parameterized coverage (adds with_batched_input) and updates the reference model to match the new reshape behavior. |
```cpp
auto shape_out = activation->get_output_partial_shape(0);
auto squeeze_const =
    std::make_shared<ov::op::v0::Constant>(ov::element::i64,
                                           ov::Shape{3},
                                           std::vector<int64_t>{1, -1, shape_out[-1].get_length()});
MatcherPass::register_new_node(squeeze_const);
auto squeeze = std::make_shared<ov::op::v1::Reshape>(activation, squeeze_const, false);
```
[BLOCKER] The new squeeze/unsqueeze reshape constants rely on shape_out[-1].get_length(). If the last dimension is dynamic (e.g., input has dynamic channel/feature dim), get_length() throws and the transformation will crash during pass execution. Consider building the reshape pattern without requiring static last dim (e.g., use special_zero=true with a 0 in the last position to copy the input’s last dimension), and apply the same fix to the unsqueeze block as well.
Suggested change:
```diff
-auto shape_out = activation->get_output_partial_shape(0);
-auto squeeze_const =
-    std::make_shared<ov::op::v0::Constant>(ov::element::i64,
-                                           ov::Shape{3},
-                                           std::vector<int64_t>{1, -1, shape_out[-1].get_length()});
-MatcherPass::register_new_node(squeeze_const);
-auto squeeze = std::make_shared<ov::op::v1::Reshape>(activation, squeeze_const, false);
+// Reshape [1, D1, ..., Dn] -> [1, -1, Dn] while safely copying the last dimension
+// without requiring it to be statically known.
+auto squeeze_const =
+    std::make_shared<ov::op::v0::Constant>(ov::element::i64,
+                                           ov::Shape{3},
+                                           std::vector<int64_t>{1, -1, 0});
+MatcherPass::register_new_node(squeeze_const);
+auto squeeze = std::make_shared<ov::op::v1::Reshape>(activation, squeeze_const, true);
```
```diff
 std::shared_ptr<ov::Node> act_node = input;
 if (p.activation_op_type == "Reshape" && p.with_act_new_reshape) {
-    auto reshape_const = ov::opset1::Constant::create(ov::element::i64, ov::Shape{4}, {1, 1, 1, 10});
+    auto reshape_const = ov::opset1::Constant::create(ov::element::i64, ov::Shape{4}, {1, 1, input_batch, 10});
     act_node = std::make_shared<ov::opset1::Reshape>(input, reshape_const, false);
 }
+if (input_batch == 1 || (p.activation_op_type == "Reshape" && p.with_act_new_reshape)) {
+    auto squeeze_const = ov::opset1::Constant::create(ov::element::i64, ov::Shape{3}, {1, input_batch, 10});
+    act_node = std::make_shared<ov::opset1::Reshape>(act_node, squeeze_const, false);
+}
 auto matmul = std::make_shared<ov::op::v0::MatMul>(act_node, mul, false, true);
 current_node = matmul;

 if (p.with_bias) {
     auto bias_const = ov::opset1::Constant::create(ov::element::f16, ov::Shape{1, 1, 1, 15}, {1});
-    current_node = std::make_shared<ov::opset1::Add>(matmul, bias_const);
+    current_node = std::make_shared<ov::opset1::Add>(current_node, bias_const);
 }
+if (input_batch == 1 || (p.activation_op_type == "Reshape" && p.with_act_new_reshape)) {
+    auto unsqueeze_const = ov::opset1::Constant::create(ov::element::i64, ov::Shape{4}, {1, 1, input_batch, 15});
+    current_node = std::make_shared<ov::opset1::Reshape>(current_node, unsqueeze_const, false);
+}
```
[HIGH] The new squeeze/unsqueeze behavior is only validated with fully static shapes. Since the implementation currently depends on shape introspection (and should support dynamic channel dims once the get_length() issue is addressed), please add a regression test case where the activation’s last dimension is dynamic (or weights/output dim is dynamic) to ensure the transformation does not throw and preserves the expected output rank/shape.
```cpp
                                       ov::Shape{4},
                                       std::vector<int64_t>{1, 1, -1, shape_out[-1].get_length()});
MatcherPass::register_new_node(unsqueeze_const);
auto unsqueeze = std::make_shared<ov::op::v1::Reshape>(matmul_out, unsqueeze_const, false);
```
You're inserting Reshape, but calling it unsqueeze. Is this how it should be?
The squeeze/unsqueeze operations are represented by reshapes here to avoid adding new node types at this stage in the pipeline.