[Backport 2026.1] Skip FP16 compression for constants with high absolute roundtrip error by mryzhov · Pull Request #34885 · openvinotoolkit/openvino

mryzhov · 2026-03-24T10:20:47Z

Description:

Summary

- Extend the scalar FP16 error check from PR #34110 to non-scalar constants
- Add max absolute roundtrip error threshold (1.0) to CompressFloatConstantsImpl
- Out-of-range values are excluded from the check (handled separately by the 75% threshold with clamping)

Details

compress_float_constants.cpp already skips FP16 compression for scalar constants with high relative roundtrip error (PR #34110). However, non-scalar constants with large values (>1024) can have absolute FP16 error exceeding 1.0 due to limited mantissa resolution, and these were not checked.

This caused accuracy degradation in LTX-Video FP16 export: a RoPE cosine frequency table (341 elements, values up to ~31416, max FP16 abs error 7.93) was compressed to FP16 and then applied multiplicatively to Q/K vectors across 28 transformer blocks in a 50-step denoising loop, compounding the error. With this change, WWB similarity improved from 0.831 to 0.956 (FP32 baseline: 0.984).

The threshold of 1.0 is chosen because the FP16 ULP does not reach 1.0 until the value range [1024, 2048), and it doubles with every power of two above that. Normal neural network weights (typically in [-10, 10]) have max absolute error ~0.005, so no false positives are expected. Size impact is negligible: only a few small frequency/scale constants per model stay in FP32.
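
To make the magnitude argument concrete, here is a minimal standalone sketch of how FP16 roundtrip error grows with the value. It is not the code from this PR: `f16_roundtrip` and `max_abs_roundtrip_error` are hypothetical helpers that simulate round-to-nearest-even FP16 conversion arithmetically rather than calling OpenVINO's `ov::float16`, and skipping values above 65504 is a simplified stand-in for the out-of-range exclusion described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

constexpr double kF16Max = 65504.0;  // largest finite FP16 value

// Hypothetical helper: simulates value -> FP16 -> value for a finite, in-range input.
// FP16 keeps 11 significant bits, so the spacing (ULP) inside [2^k, 2^(k+1)) is
// 2^(k-10); subnormals share a fixed spacing of 2^-24.
double f16_roundtrip(double x) {
    assert(std::isfinite(x) && std::fabs(x) <= kF16Max);
    if (x == 0.0) return 0.0;
    const int ulp_exp = std::max(std::ilogb(x) - 10, -24);
    const double ulp = std::ldexp(1.0, ulp_exp);
    // nearbyint uses the current rounding mode (round-to-nearest-even by default).
    return std::nearbyint(x / ulp) * ulp;
}

// Hypothetical helper: largest |v - roundtrip(v)| over the in-range elements.
// Non-finite and out-of-range elements are skipped (assumption, see lead-in).
double max_abs_roundtrip_error(const std::vector<double>& values) {
    double max_err = 0.0;
    for (double v : values) {
        if (!std::isfinite(v) || std::fabs(v) > kF16Max) continue;
        max_err = std::max(max_err, std::fabs(v - f16_roundtrip(v)));
    }
    return max_err;
}
```

Under this simulation, an illustrative value of 31415.93 (chosen here to sit near the ~31416 maximum quoted above) lands in the binade [16384, 32768) where the ULP is 16, so the roundtrip error is close to 8, while values in [-10, 10] stay below 0.005.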

Tickets:

180611

Extend the existing scalar error check (PR openvinotoolkit#34110) to non-scalar constants. Add max absolute FP16 roundtrip error threshold (1.0) that protects constants like RoPE frequency tables where large values (>1024) lose significant precision in FP16. Out-of-range values are excluded from the check as they are already handled by the 75% threshold with clamping.

This fixes catastrophic accuracy degradation in LTX-Video FP16 export (WWB similarity 0.831 -> 0.956) caused by a RoPE cosine frequency table (341 elements, max FP16 abs error 7.93) being compressed to FP16. The corrupted positional encoding compounded through 28 blocks x 50 denoising steps x 2 CFG passes.

CVS-180611

(cherry picked from commit 18e681d)

Copilot

Pull request overview

Backport that improves CompressFloatConstants accuracy by skipping FP16 compression for non-scalar FP32/FP64 constants whose FP16 roundtrip introduces large absolute error, extending the earlier scalar-only relative-error safeguard (PR #34110).

Changes:

Add an absolute FP16 roundtrip error check for non-scalar constants (threshold: 1.0) to decide whether to skip compression.
Keep existing scalar relative-error check (1e-4) unchanged and apply it only to numel == 1.
Add regression tests covering both skipping (high abs error) and compressing (low abs error) non-scalar constants.
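
For contrast with the new absolute check, the scalar relative check can be illustrated with the `log(16)` example mentioned in the code comments. This is only a sketch: `scalar_exceeds_relative_threshold` is a hypothetical name, the relative error is assumed to be `|x - roundtrip(x)| / |x|` compared with a strict `>`, and the FP16 roundtrip is simulated arithmetically rather than taken from `scalar_has_high_f16_error`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: simulated value -> FP16 -> value roundtrip for a finite
// input within the FP16 range (|x| <= 65504), using round-to-nearest-even.
double f16_roundtrip_scalar(double x) {
    assert(std::isfinite(x) && std::fabs(x) <= 65504.0);
    if (x == 0.0) return 0.0;
    const int ulp_exp = std::max(std::ilogb(x) - 10, -24);
    const double ulp = std::ldexp(1.0, ulp_exp);
    return std::nearbyint(x / ulp) * ulp;
}

// Hypothetical helper: true when the relative roundtrip error exceeds the threshold.
// The formula and the strict comparison are assumptions, not taken from the PR.
bool scalar_exceeds_relative_threshold(double x, double rel_threshold = 1e-4) {
    if (x == 0.0) return false;
    const double rel_err = std::fabs(x - f16_roundtrip_scalar(x)) / std::fabs(x);
    return rel_err > rel_threshold;
}
```

With these assumptions, `std::log(16.0)` (about 2.7726) rounds to 2.7734375 in FP16, a relative error of roughly 3e-4, so it would stay in FP32, whereas a value that FP16 represents exactly, such as 0.5, would still be compressed.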

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`src/common/transformations/src/transformations/common_optimizations/compress_float_constants.cpp`	Adds non-scalar absolute FP16 roundtrip error detection and uses it to skip compression when the error is too large.
`src/common/transformations/tests/common_optimizations/compress_float_constants_test.cpp`	Adds tests verifying the new non-scalar absolute-error behavior (skip vs compress).

Copilot · 2026-03-24T10:27:59Z

...common/transformations/src/transformations/common_optimizations/compress_float_constants.cpp

@@ -176,14 +202,22 @@ CompressFloatConstantsImpl::CompressFloatConstantsImpl(bool postponed) {

        auto c_type = const_node->get_element_type();

-        // Skip FP16 compression for scalar constants with significant rounding error.
-        // Scalar constants often serve as mathematical scale factors (e.g., log(16) in attention
-        // bucketing) where FP16 rounding error cascades through every computation that uses them.
+        // Skip FP16 compression for constants with significant rounding error.
+        // Scalar: tight relative threshold (1e-4) — protects math scale factors (e.g. log(16)).
+        // Non-scalar: absolute threshold (1.0) — protects frequency tables (e.g. RoPE) where
+        // large values (>1024) lose significant precision in FP16 and the error compounds
+        // through iterative computations (e.g. 50-step denoising with CFG).
        if (ov::shape_size(const_node->get_shape()) == 1) {
            if (c_type == ov::element::f32 && scalar_has_high_f16_error<float>(*const_node))
                return false;
            if (c_type == ov::element::f64 && scalar_has_high_f16_error<double>(*const_node))
                return false;
+        } else {
+            constexpr double max_abs_error = 1.0;
+            if (c_type == ov::element::f32 && has_high_f16_abs_error<float>(*const_node, max_abs_error))
+                return false;
+            if (c_type == ov::element::f64 && has_high_f16_abs_error<double>(*const_node, max_abs_error))
+                return false;


[MEDIUM] has_high_f16_abs_error() adds an extra full scan over every non-scalar f32/f64 constant before the existing out-of-range scan + conversion. On x86 f32 this makes the pass do three O(N) passes (has_high_f16_abs_error + count_out_of_f16_range + convert_from_f32_to_f16_with_clamp), which can noticeably slow FP16 compression on large weight constants.

Consider folding the abs-error check into an existing loop (e.g., during conversion / range-counting) or adding a single helper that computes both “out-of-range count” and “max abs roundtrip error” in one traversal, so large constants aren’t re-scanned multiple times.
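
One possible shape for the single-traversal helper suggested above, as a rough standalone sketch rather than a patch against `compress_float_constants.cpp`. The names `F16ScanStats` and `scan_for_f16` are hypothetical, "out of range" is simplified to `|v| > 65504` (the real pass may define it differently), and the FP16 roundtrip is simulated arithmetically instead of using the existing conversion routines.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr double kF16MaxValue = 65504.0;  // largest finite FP16 value

// Hypothetical result type: both statistics gathered in one pass.
struct F16ScanStats {
    std::size_t out_of_range = 0;  // elements with |v| > kF16MaxValue
    double max_abs_error = 0.0;    // largest roundtrip error among in-range elements
};

// Simulated value -> FP16 -> value roundtrip (round-to-nearest-even), in-range only.
double simulate_f16_roundtrip(double x) {
    if (x == 0.0) return 0.0;
    const int ulp_exp = std::max(std::ilogb(x) - 10, -24);
    const double ulp = std::ldexp(1.0, ulp_exp);
    return std::nearbyint(x / ulp) * ulp;
}

// Hypothetical helper: one O(N) traversal instead of separate range and error scans.
F16ScanStats scan_for_f16(const std::vector<float>& data) {
    F16ScanStats stats;
    for (float v : data) {
        if (std::isnan(v)) continue;  // NaN is ignored in this sketch
        if (std::fabs(v) > kF16MaxValue) {
            ++stats.out_of_range;     // excluded from the abs-error check
            continue;
        }
        const double err = std::fabs(static_cast<double>(v) - simulate_f16_roundtrip(v));
        stats.max_abs_error = std::max(stats.max_abs_error, err);
    }
    return stats;
}
```

A caller could then compare `stats.max_abs_error` against the 1.0 threshold and reuse `stats.out_of_range` for the existing out-of-range decision without scanning the constant again. Whether the comparison should be strict, and whether an early exit once the threshold is exceeded is worthwhile, is left open here.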

mryzhov requested a review from a team as a code owner March 24, 2026 10:20

github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label Mar 24, 2026

mryzhov requested a review from Copilot March 24, 2026 10:23

mryzhov self-assigned this Mar 24, 2026

Copilot started reviewing on behalf of mryzhov March 24, 2026 10:24

Copilot AI reviewed Mar 24, 2026

moslex added this to the 2026.1 milestone Mar 25, 2026

moslex added the Code Freeze label Mar 25, 2026

Merge branch 'releases/2026/1' into backport/pr-34744-r2026.1

fdf4ff1

mryzhov wants to merge 2 commits into openvinotoolkit:releases/2026/1 from mryzhov:backport/pr-34744-r2026.1
