[Backport 2026.1] Skip FP16 compression for constants with high absolute roundtrip error#34885

Open
mryzhov wants to merge 2 commits into openvinotoolkit:releases/2026/1 from mryzhov:backport/pr-34744-r2026.1

@mryzhov mryzhov commented Mar 24, 2026

Description:

Summary

  • Extend the scalar FP16 error check from PR #34110 to non-scalar constants
  • Add max absolute roundtrip error threshold (1.0) to CompressFloatConstantsImpl
  • Out-of-range values are excluded from the check (handled separately by the 75% threshold with clamping)

Details

compress_float_constants.cpp already skips FP16 compression for scalar constants with high relative roundtrip error (PR #34110). However, non-scalar constants with large values were not checked. Above 1024 the FP16 ULP is at least 1.0 because of the 10-bit mantissa, and it doubles with every power of two, so the absolute roundtrip error of such values can exceed 1.0.

This caused accuracy degradation in LTX-Video FP16 export: a RoPE cosine frequency table (341 elements, values up to ~31416, max FP16 abs error 7.93) was compressed to FP16 and then applied multiplicatively to Q/K vectors across 28 transformer blocks in a 50-step denoising loop, compounding the error. WWB similarity improved from 0.831 to 0.956 (FP32 baseline: 0.984).

The threshold of 1.0 is chosen because the FP16 ULP first reaches 1.0 in the value range [1024, 2048), so only constants containing values of at least that magnitude can trip the check. Normal neural network weights (typically in [-10, 10]) have max absolute error ~0.005, so no false positives are expected. Size impact is negligible: only a few small frequency/scale constants per model stay in FP32.
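
The ULP argument can be checked numerically. The following is a minimal sketch, not code from this PR: `fp16_ulp` and `fp16_max_rounding_error` are hypothetical helper names, and the only inputs are the IEEE 754 binary16 parameters (10 explicit mantissa bits, smallest normal exponent -14). It is intended for finite values with magnitude up to 65504.

```cpp
#include <cmath>

// Spacing between adjacent FP16 values (ULP) at the magnitude of x.
// Hypothetical helper for illustration only, not part of OpenVINO.
inline double fp16_ulp(double x) {
    int exp = 0;
    std::frexp(std::fabs(x), &exp);  // |x| = m * 2^exp with m in [0.5, 1)
    int e = exp - 1;                 // floor(log2(|x|)) for non-zero x
    if (x == 0.0 || e < -14)
        e = -14;                     // zero and subnormals use the smallest normal binade
    return std::ldexp(1.0, e - 10);  // 2^(e - 10): 10 explicit mantissa bits
}

// With round-to-nearest, the worst-case rounding error is half a ULP.
inline double fp16_max_rounding_error(double x) {
    return 0.5 * fp16_ulp(x);
}
```

Under these assumptions, `fp16_ulp(10.0)` is 2^-7, so the worst-case error for typical weights is about 0.004, in line with the ~0.005 quoted above. `fp16_ulp(1024.0)` is 1.0, and `fp16_ulp(31416.0)` is 16.0, which gives a worst-case error of 8.0 and is consistent with the observed 7.93.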

Tickets:

  • 180611

Extend the existing scalar error check (PR openvinotoolkit#34110) to non-scalar
constants. Add max absolute FP16 roundtrip error threshold (1.0) that
protects constants like RoPE frequency tables where large values
(>1024) lose significant precision in FP16. Out-of-range values are
excluded from the check as they are already handled by the 75%
threshold with clamping.

This fixes catastrophic accuracy degradation in LTX-Video FP16 export
(WWB similarity 0.831 -> 0.956) caused by a RoPE cosine frequency
table (341 elements, max FP16 abs error 7.93) being compressed to
FP16. The corrupted positional encoding compounded through 28 blocks
x 50 denoising steps x 2 CFG passes.

CVS-180611

(cherry picked from commit 18e681d)
@mryzhov mryzhov requested a review from a team as a code owner March 24, 2026 10:20
@github-actions github-actions bot added the category: transformations OpenVINO Runtime library - Transformations label Mar 24, 2026
@mryzhov mryzhov requested a review from Copilot March 24, 2026 10:23
@mryzhov mryzhov self-assigned this Mar 24, 2026

Copilot AI left a comment

Pull request overview

Backport that improves CompressFloatConstants accuracy by skipping FP16 compression for non-scalar FP32/FP64 constants whose FP16 roundtrip introduces large absolute error, extending the earlier scalar-only relative-error safeguard (PR #34110).

Changes:

  • Add an absolute FP16 roundtrip error check for non-scalar constants (threshold: 1.0) to decide whether to skip compression.
  • Keep existing scalar relative-error check (1e-4) unchanged and apply it only to numel == 1.
  • Add regression tests covering both skipping (high abs error) and compressing (low abs error) non-scalar constants.
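
The two thresholds listed above can be sketched as a standalone decision function. This is an illustration under stated assumptions, not the PR's implementation: `f16_roundtrip` and `should_keep_full_precision` are hypothetical names, the FP16 roundtrip is emulated arithmetically instead of using OpenVINO's own FP16 type, the comparisons are assumed to be strict (`>`), and the handling of zero, non-finite, and out-of-range scalars is a guess.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

constexpr double kF16Max = 65504.0;         // largest finite FP16 value
constexpr double kScalarMaxRelError = 1e-4; // scalar threshold quoted in the PR
constexpr double kMaxAbsError = 1.0;        // non-scalar threshold quoted in the PR

// Emulates value -> FP16 -> value for finite |x| <= 65504.
// Relies on the default rounding mode (round to nearest, ties to even).
inline double f16_roundtrip(double x) {
    if (x == 0.0)
        return 0.0;
    int exp = 0;
    std::frexp(std::fabs(x), &exp);
    int e = exp - 1;                        // floor(log2(|x|))
    if (e < -14)
        e = -14;                            // subnormal range
    const double ulp = std::ldexp(1.0, e - 10);
    return std::nearbyint(x / ulp) * ulp;   // snap to the FP16 grid
}

// Returns true when the constant should stay in FP32/FP64 (hypothetical sketch).
inline bool should_keep_full_precision(const std::vector<double>& values) {
    if (values.size() == 1) {
        const double v = values[0];
        // Assumption: zero, non-finite, and out-of-range scalars are left to other checks.
        if (v == 0.0 || !std::isfinite(v) || std::fabs(v) > kF16Max)
            return false;
        return std::fabs(f16_roundtrip(v) - v) / std::fabs(v) > kScalarMaxRelError;
    }
    for (double v : values) {
        // Out-of-range values are excluded, as stated in the PR description.
        if (!std::isfinite(v) || std::fabs(v) > kF16Max)
            continue;
        if (std::fabs(f16_roundtrip(v) - v) > kMaxAbsError)
            return true;
    }
    return false;
}
```

With this sketch, a scalar `log(16)` is kept in full precision (its relative roundtrip error is about 3e-4), a table containing a value near 31416 is kept as well, and ordinary small weights are still compressed.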

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/common/transformations/src/transformations/common_optimizations/compress_float_constants.cpp | Adds non-scalar absolute FP16 roundtrip error detection and uses it to skip compression when the error is too large. |
| src/common/transformations/tests/common_optimizations/compress_float_constants_test.cpp | Adds tests verifying the new non-scalar absolute-error behavior (skip vs compress). |

Comment on lines 160 to +220
@@ -176,14 +202,22 @@ CompressFloatConstantsImpl::CompressFloatConstantsImpl(bool postponed) {

         auto c_type = const_node->get_element_type();

-        // Skip FP16 compression for scalar constants with significant rounding error.
-        // Scalar constants often serve as mathematical scale factors (e.g., log(16) in attention
-        // bucketing) where FP16 rounding error cascades through every computation that uses them.
+        // Skip FP16 compression for constants with significant rounding error.
+        // Scalar: tight relative threshold (1e-4) — protects math scale factors (e.g. log(16)).
+        // Non-scalar: absolute threshold (1.0) — protects frequency tables (e.g. RoPE) where
+        // large values (>1024) lose significant precision in FP16 and the error compounds
+        // through iterative computations (e.g. 50-step denoising with CFG).
         if (ov::shape_size(const_node->get_shape()) == 1) {
             if (c_type == ov::element::f32 && scalar_has_high_f16_error<float>(*const_node))
                 return false;
             if (c_type == ov::element::f64 && scalar_has_high_f16_error<double>(*const_node))
                 return false;
+        } else {
+            constexpr double max_abs_error = 1.0;
+            if (c_type == ov::element::f32 && has_high_f16_abs_error<float>(*const_node, max_abs_error))
+                return false;
+            if (c_type == ov::element::f64 && has_high_f16_abs_error<double>(*const_node, max_abs_error))
+                return false;

Copilot AI Mar 24, 2026

[MEDIUM] has_high_f16_abs_error() adds an extra full scan over every non-scalar f32/f64 constant before the existing out-of-range scan + conversion. On x86 f32 this makes the pass do three O(N) passes (has_high_f16_abs_error + count_out_of_f16_range + convert_from_f32_to_f16_with_clamp), which can noticeably slow FP16 compression on large weight constants.

Consider folding the abs-error check into an existing loop (e.g., during conversion / range-counting) or adding a single helper that computes both “out-of-range count” and “max abs roundtrip error” in one traversal, so large constants aren’t re-scanned multiple times.
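
One possible shape for such a combined helper is sketched below. It is an illustration of the suggestion only: `F16ScanResult` and `scan_for_f16` are hypothetical names that do not exist in OpenVINO, the FP16 roundtrip is emulated arithmetically rather than with the library's conversion routines, and skipping NaN while counting infinities as out of range is an assumption.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical result of a single traversal over a constant's data.
struct F16ScanResult {
    std::size_t out_of_range = 0;  // elements with |x| > 65504
    double max_abs_error = 0.0;    // max |fp16(x) - x| over in-range elements
};

// Emulates value -> FP16 -> value for finite |x| <= 65504
// (default rounding mode: round to nearest, ties to even).
inline double f16_roundtrip(double x) {
    if (x == 0.0)
        return 0.0;
    int exp = 0;
    std::frexp(std::fabs(x), &exp);
    const int e = std::max(exp - 1, -14);        // clamp to the subnormal binade
    const double ulp = std::ldexp(1.0, e - 10);  // 10 explicit mantissa bits
    return std::nearbyint(x / ulp) * ulp;
}

// Computes both statistics in one O(N) pass instead of two separate scans.
template <typename T>
F16ScanResult scan_for_f16(const T* data, std::size_t size) {
    constexpr double f16_max = 65504.0;
    F16ScanResult result;
    for (std::size_t i = 0; i < size; ++i) {
        const double v = static_cast<double>(data[i]);
        if (std::isnan(v))
            continue;                            // assumption: NaN is ignored
        if (std::fabs(v) > f16_max) {
            ++result.out_of_range;               // excluded from the error check
            continue;
        }
        result.max_abs_error = std::max(result.max_abs_error, std::fabs(f16_roundtrip(v) - v));
    }
    return result;
}
```

A caller could then compare `max_abs_error` against the 1.0 threshold and `out_of_range` against the existing 75% rule from the same result, leaving the conversion itself as the only other pass over the data.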

@moslex moslex added this to the 2026.1 milestone Mar 25, 2026

Labels

category: transformations OpenVINO Runtime library - Transformations Code Freeze
