Releases · PaddlePaddle/PaddleX · GitHub

22 Dec 02:40

Bobholamovic

v3.3.12 Latest

2025.12.17 v3.3.12 released

Fixed an issue where headers and footers were missing in the Markdown output of PP-StructureV3.
For PaddleOCR-VL-0.9B, optimized the flash attention availability check; the model now falls back to eager attention when flash attention is unavailable (e.g., on Windows systems).
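
The fallback described above can be illustrated with a minimal sketch. This is not PaddleX's actual implementation: the function name `pick_attention_backend` and the returned strings are hypothetical, and the check assumes that flash attention is exposed as an importable module named `flash_attn`.

```python
import importlib.util


def pick_attention_backend(module_name: str = "flash_attn") -> str:
    """Return "flash_attention" if the module can be found, else "eager".

    Hypothetical helper for illustration; the names are assumptions,
    not PaddleX APIs.
    """
    try:
        # find_spec returns None when a top-level module is not installed.
        available = importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Treat any lookup failure as "not available".
        available = False
    return "flash_attention" if available else "eager"


print(pick_attention_backend())
```

Probing for the module without importing it keeps the check cheap and avoids crashing on platforms where the package cannot be installed.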

Full Changelog: v3.3.11...v3.3.12

09 Dec 12:40

Bobholamovic

v3.3.11

2025.12.9 v3.3.11 released

Fixed an issue where memory kept increasing during repeated inference.
Added a new parameter markdown_ignore_labels to PP-StructureV3 and PaddleOCR-VL, which controls the element types to be ignored in Markdown output; pipeline outputs now include additional information such as the number of pages and image size.
PaddleOCR-VL now supports controlling whether to merge image blocks through the merge_layout_blocks parameter.
Added support for skipping the model source check via the environment variable PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK, which resolves slow model loading in offline environments.
Fixed an accuracy issue of the RT-DETR-X model when using PIR-TRT inference with batch size > 1.
Pre-installed fonts in high-stability deployment images to prevent content loss when rendering PDF pages.
Added support for safetensors 0.7.0 and removed instructions in the documentation about installing a specific version.
Added support for MetaX GPUs. @metax666
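
As a rough sketch of how the environment variable mentioned above might be set from Python: the variable name comes from the release notes, but the value `"True"` and the helper `disable_model_source_check` are assumptions made for illustration. The sketch also assumes the variable has to be set before PaddleX is imported.

```python
import os

# Variable name taken from the release notes.
ENV_VAR = "PADDLE_PDX_DISABLE_MODEL_SOURCE_CHECK"


def disable_model_source_check(value: str = "True") -> None:
    """Set the environment variable for the current process.

    Hypothetical helper; the accepted value ("True") is an assumption.
    """
    os.environ[ENV_VAR] = value


disable_model_source_check()
# Import paddlex only after this point (assumed requirement),
# so that the setting is visible when the library initializes.
print(os.environ[ENV_VAR])
```

In a shell-based deployment, exporting the same variable before launching the process should have the equivalent effect.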

Full Changelog: v3.3.10...v3.3.11

Contributors

metax666

25 Nov 09:41

Bobholamovic

v3.3.10

2025.11.24 v3.3.10 released

Optimized the network implementation of the PaddleOCR-VL-0.9B model, significantly reducing GPU memory usage on devices with Compute Capability ≥ 8.

Full Changelog: v3.3.9...v3.3.10

25 Nov 09:41

Bobholamovic

v3.3.9

2025.11.10 v3.3.9 released

Fixed an issue where PP-DocLayoutV2 exhibited abnormal accuracy on CPU.
PaddleOCR-VL-0.9B now supports deployment on DCU devices using the vLLM server mode.

Full Changelog: v3.3.8...v3.3.9

25 Nov 09:40

Bobholamovic

v3.3.8

2025.11.5 v3.3.8 released

Fixed installation bugs in the vLLM and SGLang plugins.

Full Changelog: v3.3.7...v3.3.8

25 Nov 09:39

Bobholamovic

v3.3.7

2025.11.5 v3.3.7 released

PaddleOCR-VL now supports inference on DCU and XPU.
Optimized the installation process for vLLM / SGLang plugins: hardware information is automatically detected and the matching version of flash-attn is installed without manual installation.
The high-stability serving solution for General OCR and PP-StructureV3 pipelines now supports handling concurrent requests in a single instance.
For high-stability serving, the server now prints the log ID of each request in a batch when it receives requests, making debugging easier.
The PP-StructureV3 and PP-DocTranslation pipelines now support saving results in DOCX and LaTeX formats.
Simplified PDF page rendering logic to improve reading performance. @mara004

Full Changelog: v3.3.6...v3.3.7

Contributors

mara004

29 Oct 07:48

Bobholamovic

v3.3.6

2025.10.28 v3.3.6 released

PaddleOCR-VL now supports inference on x86-64 CPUs.
Unified the chat template used across the different inference methods in PaddleOCR-VL.
Fixed inconsistent accuracy on images containing formulas between PaddleOCR-VL inference with the PaddlePaddle framework and inference with vLLM.
Released the Dockerfile for the vLLM inference image: https://github.com/PaddlePaddle/PaddleX/tree/release/3.3/deploy/genai_vllm_server_docker .
Fixed the issue where, during offline inference, the program still attempted to download models online even when local cached models existed.

Full Changelog: v3.3.5...v3.3.6

23 Oct 15:03

Bobholamovic

v3.3.5

2025.10.23 v3.3.5 released

Fixed a weight data type mapping issue, adding support for GPUs with compute capability between 7 and 8.
Fixed a model configuration parsing failure that occurred when the configuration included quantization_config.
Fixed inference errors on Windows when the path contains directories with Chinese characters.
Fixed an issue where PaddleOCR-VL models hosted on the AI Studio platform could not be used.
Added support for passing the max_new_tokens parameter during PaddleOCR-VL model inference.
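
The release notes do not say how the weight data type mapping was changed. As a purely illustrative sketch of the general idea, the snippet below picks a weight dtype from the GPU compute capability, relying on the fact that native bfloat16 support starts at compute capability 8.0. The function name `pick_weight_dtype`, the thresholds for the lower tiers, and the returned strings are assumptions, not PaddleX behavior.

```python
def pick_weight_dtype(compute_capability: float) -> str:
    """Choose a weight dtype name for a given GPU compute capability.

    Hypothetical mapping for illustration only:
    - >= 8.0: bfloat16 (natively supported from this generation on)
    - 7.0 to < 8.0: float16 (assumed fallback, since bfloat16 is not native)
    - below 7.0: float32 (assumed conservative default)
    """
    if compute_capability >= 8.0:
        return "bfloat16"
    if compute_capability >= 7.0:
        return "float16"
    return "float32"


for cc in (8.6, 7.5, 6.1):
    print(cc, pick_weight_dtype(cc))
```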

Full Changelog: v3.3.4...v3.3.5

16 Oct 08:01

Bobholamovic

v3.3.1

2025.10.16 v3.3.1 released

Fixed issues such as the missing concatenate_markdown_pages method in the PaddleOCR-VL pipeline.

Full Changelog: v3.3.0...v3.3.1

16 Oct 05:50

Bobholamovic

v3.3.0

2025.10.16 v3.3.0 released

Added support for inference and deployment of PaddleOCR-VL and PP-OCRv5 multilingual models.

Full Changelog: v3.2.1...v3.3.0
