Optimize dot product using restrict keyword (Fixes #4393) #4496
Vallabh-1504 wants to merge 1 commit into tesseract-ocr:main
Conversation
|
Thanks for addressing this issue. Did you compare the generated code? Before merging this pull request, I want to be sure that it really improves the code. |
|
Hi, thanks for the review! I haven't compared the generated assembly for this specific build because I am working in a limited environment and relying on CI. However, I implemented this based on the request in the issue. The standard behavior of restrict (spelled __restrict on MSVC) is to tell the compiler that u and v do not overlap. This typically allows the compiler to skip runtime alias checks (loop versioning) and vectorize the loop more aggressively. Since I cannot generate the assembly locally, would you be able to verify whether the output looks correct on your end? |
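To make the claim about restrict concrete, here is a minimal sketch of a scalar dot product with restrict-qualified inputs. The function name DotProductSketch and the use of float are assumptions made for illustration; this is not the Tesseract source or the PR diff.

```cpp
// Hypothetical stand-in for a scalar dot product (not the Tesseract code).
// __restrict is a compiler extension accepted by MSVC, GCC and Clang. It
// promises that, inside this function, the memory reached through u is not
// also accessed through v (and vice versa).
float DotProductSketch(const float *__restrict u, const float *__restrict v, int n) {
  float total = 0.0f;  // accumulator lives in a local, not in memory reachable via u or v
  for (int k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}
```

Calling it with {1, 2, 3, 4} and {4, 3, 2, 1} returns 20. Note that this loop only reads through u and v and writes to a local variable, so there is no store that could alias either input. A compiler may therefore be able to vectorize it without any runtime overlap check even when restrict is absent, which is one plausible reason the qualifier might not change the generated code. That is an inference, so comparing the assembly remains the reliable check.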
|
Test results:
|
stweil
left a comment
A hardcoded __restrict is accepted by MSC++, g++ and clang++. Therefore I don't think we need TESS_RESTRICT. I suggest waiting to update the pull request until we know that it really has an effect.
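The two options under discussion can be sketched side by side. The macro body below is an assumption reconstructed from the PR description (__restrict for MSVC, __restrict__ for GCC/Clang), not copied from the diff, and the function names DotViaMacro and DotHardcoded are hypothetical.

```cpp
// Option A: a portability macro, as the PR description outlines.
// Assumed shape only; the real definition in the PR may differ.
#if defined(_MSC_VER)
#  define TESS_RESTRICT __restrict
#else
#  define TESS_RESTRICT __restrict__
#endif

double DotViaMacro(const double *TESS_RESTRICT u, const double *TESS_RESTRICT v, int n) {
  double total = 0.0;
  for (int k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}

// Option B: spell __restrict directly, relying on the fact that MSVC, GCC
// and Clang all accept this spelling, so no macro is needed.
double DotHardcoded(const double *__restrict u, const double *__restrict v, int n) {
  double total = 0.0;
  for (int k = 0; k < n; ++k) {
    total += u[k] * v[k];
  }
  return total;
}
```

Both functions compute the same result; the only difference is how the qualifier is spelled. Option B removes one macro and one preprocessor branch, at the cost of depending on a non-standard keyword that these three compilers happen to share.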
|
Thanks for the feedback!
@amitdo – Since you opened the original issue, did you have a specific scenario (specific architecture or older compiler version) where this provided a benefit? I will wait for confirmation before updating the PR. |
|
I suggest also testing this on a machine with an amd64 CPU. The test run time should be long enough to reduce the influence of the initial data loading. |
|
The modified code has no effect on the generated binaries in any of the settings I have tested so far. Therefore a runtime test is currently not needed. |
Description
Addresses issue #4393 by adding the restrict keyword to the dot product functions. This informs the compiler that the input arrays do not overlap, enabling better SIMD vectorization and potential performance improvements.
Changes
- TESS_RESTRICT macro in src/arch/dotproduct.h to handle compiler differences:
  - __restrict for MSVC.
  - __restrict__ for GCC/Clang.
- TESS_RESTRICT on pointer arguments (u and v) in:
  - DotProductNative
  - DotProductAVX / AVX512F
  - DotProductSSE
  - DotProductFMA
  - DotProductNEON
Verification