
Commit 3ee4d48

Fix ERNIE 4.5 model builder: rope_attrs and config architecture name (#2007)
## Description

This PR fixes the ERNIE 4.5 model builder to align with upstream renames and the current codebase.

### 1. Use `rope_attrs` instead of `rotemb_attrs`

In b92970c (Add OpenAI's gpt-oss, #1678), `rotemb_attrs` was replaced by `rope_attrs` across the codebase. In PR #1862 (model builder refactoring), the shared `Model` class and its `rope_attrs` were moved to `builders/base.py`. The ERNIE builder in `builders/ernie.py` still referred to the old attribute name.

This PR updates `ErnieModel` to use `rope_attrs` (inherited from the base via `MistralModel`) for:

- **Interleaved RoPE**: `self.rope_attrs["interleaved"] = 1`
- **Compression ratio (RoPE scaling)**: `self.rope_attrs["rescale_factors"] = 1.0 / config.compression_ratio` when `compression_ratio` is set

### 2. Match updated Hugging Face config `architectures`

The official ERNIE 4.5 config on Hugging Face was updated: `architectures` was changed from `Ernie4_5_ForCausalLM` to `Ernie4_5ForCausalLM` (the underscore before `ForCausalLM` was removed):

- [baidu/ERNIE-4.5-0.3B-Base-PT](https://huggingface.co/baidu/ERNIE-4.5-0.3B-Base-PT/commit/bf9499229d96e16442fd63992195b7369c7b2657)
- [baidu/ERNIE-4.5-0.3B-PT](https://huggingface.co/baidu/ERNIE-4.5-0.3B-PT/commit/018ae39b66b9d73e17e9092434de6acd4dd4856a)

This PR updates the architecture check in `builder.py` from `Ernie4_5_ForCausalLM` to `Ernie4_5ForCausalLM` so that the builder correctly recognizes current ERNIE 4.5 models.
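The `rescale_factors` value relies on the equivalence noted in the ERNIE builder's comment: dividing `position_ids` by `compression_ratio` yields the same rotation angles as multiplying the inverse frequencies by `1 / compression_ratio`. A minimal sketch of that equivalence, with illustrative values for `inv_freq` and `compression_ratio` (the `rope_angles` helper is hypothetical, not the builder's actual code):

```python
import math

def rope_angles(position_ids, inv_freq):
    # Standard RoPE: each angle is position * frequency.
    return [[p * f for f in inv_freq] for p in position_ids]

compression_ratio = 4.0           # illustrative value
inv_freq = [1.0, 0.1, 0.01]       # illustrative base frequencies
position_ids = [0, 1, 2, 3]

# Original ERNIE logic: scale the positions.
a = rope_angles([p / compression_ratio for p in position_ids], inv_freq)

# Builder logic: scale the frequencies via rescale_factors = 1 / compression_ratio.
rescale = 1.0 / compression_ratio
b = rope_angles(position_ids, [f * rescale for f in inv_freq])

# Both formulations produce the same angles (up to floating-point rounding).
assert all(math.isclose(x, y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```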
1 parent f02bebe · commit 3ee4d48

File tree

2 files changed: +3 −3 lines changed

src/python/py/models/builder.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -213,7 +213,7 @@ def create_model(
         config.hidden_act = "swiglu"
         onnx_model = ChatGLMModel(config, io_dtype, onnx_dtype, execution_provider, cache_dir, extra_options)
         onnx_model.model_type = "chatglm"
-    elif config.architectures[0] == "Ernie4_5_ForCausalLM":
+    elif config.architectures[0] == "Ernie4_5ForCausalLM":
        onnx_model = ErnieModel(config, io_dtype, onnx_dtype, execution_provider, cache_dir, extra_options)
    elif config.architectures[0] == "GemmaForCausalLM":
        onnx_model = GemmaModel(config, io_dtype, onnx_dtype, execution_provider, cache_dir, extra_options)
```
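Since the builder dispatches by exact string match on `architectures[0]`, the old underscore name no longer matches configs published after the Hugging Face rename. A small self-contained illustration, using a hypothetical in-memory `config.json` fragment rather than a real model download:

```python
import json

# Hypothetical config.json fragment mirroring the renamed architecture field.
config = json.loads('{"architectures": ["Ernie4_5ForCausalLM"]}')
arch = config["architectures"][0]

# The pre-fix check used the old name and would now fall through;
# the updated check matches the current config.
matches_old = arch == "Ernie4_5_ForCausalLM"
matches_new = arch == "Ernie4_5ForCausalLM"
print(matches_old, matches_new)  # False True
```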

src/python/py/models/builders/ernie.py

Lines changed: 2 additions & 2 deletions

```diff
@@ -11,10 +11,10 @@ def __init__(self, config, io_dtype, onnx_dtype, ep, cache_dir, extra_options):
         super().__init__(config, io_dtype, onnx_dtype, ep, cache_dir, extra_options)

         # Ernie uses interleaved rotary position embeddings.
-        self.rotemb_attrs["interleaved"] = 1
+        self.rope_attrs["interleaved"] = 1

         # Ernie uses a `compression_ratio` for its RoPE scaling.
         # The original RoPE logic in ernie is: position_ids / compression_ratio,
         # which is equivalent to scaling the frequencies (inv_freq) by 1 / compression_ratio.
         if hasattr(config, "compression_ratio") and config.compression_ratio != 1.0:
-            self.rotemb_attrs["rescale_factors"] = 1.0 / config.compression_ratio
+            self.rope_attrs["rescale_factors"] = 1.0 / config.compression_ratio
```

0 comments