Skip to content

[GPU] Revert arg max min axis change and fix incorrect offset calculation#31956

Merged
p-durandin merged 1 commit intoopenvinotoolkit:masterfrom
clee30:arg_max_min_increase_threads
Sep 8, 2025
Merged

[GPU] Revert arg max min axis change and fix incorrect offset calculation#31956
p-durandin merged 1 commit intoopenvinotoolkit:masterfrom
clee30:arg_max_min_increase_threads

Conversation

@clee30
Copy link
Copy Markdown
Contributor

@clee30 clee30 commented Sep 3, 2025

In arg_max_min_axix, use sizeof to determine size of struct iav_type.
Current method have issue to calculate for fp16 type as OpenCL compiler tend to add padding to improve memory access efficiency.

Revert "[GPU] Use local memory for arg_max_min for dynamic shape (#31682)"
This reverts commit 69a1121.

CVS-172937, CVS-172939

@clee30 clee30 requested review from a team as code owners September 3, 2025 13:31
@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Sep 3, 2025
@p-durandin p-durandin enabled auto-merge September 3, 2025 14:19
@clee30
Copy link
Copy Markdown
Contributor Author

clee30 commented Sep 4, 2025

Able to fix the test failure by increase the size of the local memory when # of local threads increase. However, it is not reliable for dynamic support. During graph compilation, the dynamic shape is not known, choose between local/global memory to use for temp buffers are not reliable. There will be the case where if local memory has been choosen, but during runtime when dynamic shape already known, the size may exceed local memory limit. In this case, it is not able to revert back to use global memory.

Will revert to use local memory, and fix the issue where data mismatch when using global memory temp buffer.

In arg_max_min_axix, use sizeof to determine size of struct iav_type.
Current method have issue to calculate for fp16 type as OpenCL compiler tend to add padding to improve memory access efficiency.

Revert "[GPU] Use local memory for arg_max_min for dynamic shape (openvinotoolkit#31682)"
This reverts commit 69a1121.
@clee30 clee30 force-pushed the arg_max_min_increase_threads branch from 6697761 to 8d7be39 Compare September 5, 2025 04:53
@clee30
Copy link
Copy Markdown
Contributor Author

clee30 commented Sep 5, 2025

Able to fix the test failure by increase the size of the local memory when # of local threads increase. However, it is not reliable for dynamic support. During graph compilation, the dynamic shape is not known, choose between local/global memory to use for temp buffers are not reliable. There will be the case where if local memory has been choosen, but during runtime when dynamic shape already known, the size may exceed local memory limit. In this case, it is not able to revert back to use global memory.

Will revert to use local memory, and fix the issue where data mismatch when using global memory temp buffer.

Add fix to original implementation using global memory to handle fp16 size correctly.

@clee30 clee30 changed the title [GPU] Regression issue for arg max min. Revert lws setting [GPU] Revert arg max min axis change and fix incorrect offset calculation Sep 5, 2025
@p-durandin p-durandin added this pull request to the merge queue Sep 5, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 5, 2025
@p-durandin p-durandin added this pull request to the merge queue Sep 8, 2025
Merged via the queue into openvinotoolkit:master with commit ddc697f Sep 8, 2025
268 of 276 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: GPU OpenVINO GPU plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants