[Feature] Verify and Support for DeepSeek-OCR-2 #7665
liutianyang-2026 wants to merge 10 commits into vllm-project:main from
Conversation
Signed-off-by: Tianyang Liu <liutianyang@isrc.iscas.ac.cn>
… deepseek models Signed-off-by: Tianyang Liu <liutianyang@isrc.iscas.ac.cn>
Summary of Changes (Gemini Code Assist): This pull request integrates DeepSeek-OCR-2 into the vllm-ascend framework, enabling its use for advanced optical character recognition and document understanding on Ascend hardware. It addresses critical stability and compatibility issues encountered during the model's bring-up, such as graph compilation errors caused by specific model configurations and conflicts with newer `transformers` releases.
Code Review
This pull request adds support for the DeepSeek-OCR-2 model, including a new tutorial document, an end-to-end test configuration, and a critical patch to resolve `LlamaFlashAttention2` import errors in newer `transformers` versions. A robustness check for `rope_dim` was also added to the QKNorm and Rope fusion pass. Feedback includes correcting the `trust_remote_code` parameter to `True` in the tutorial's Python example and fixing a mislabeled code block from `python` to `bash` for proper syntax highlighting.
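The `rope_dim` robustness check discussed in the review can be sketched as follows. This is a minimal illustration with hypothetical names (`maybe_apply_qknorm_rope_fusion` is not the actual pass entry point in vllm-ascend); the point is only the guard's shape: bail out of fusion instead of compiling a graph with a zero-sized rope dimension.

```python
def maybe_apply_qknorm_rope_fusion(config) -> bool:
    """Hypothetical guard for a QKNorm + Rope fusion pass.

    DeepSeek-OCR-2 ships a config with max_position_embeddings=0, which
    can propagate into rope_dim=0 and break graph compilation, so the
    pass should skip itself rather than fuse.
    """
    rope_dim = getattr(config, "rope_dim", 0)
    if rope_dim <= 0:
        # Skip fusion entirely; the unfused attention path still works.
        return False
    # ... the actual fusion rewrite would happen here ...
    return True


class _AnomalousConfig:
    rope_dim = 0  # mimics max_position_embeddings=0 -> rope_dim=0

fused = maybe_apply_qknorm_rope_fusion(_AnomalousConfig())  # False: pass skipped
```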
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Local Test Results
SKILL: Adapt DeepSeek-OCR-2 (VLM) for vllm-ascend

Skill Overview

Workflows & Best Practices
1. Process Isolation for Remote Code Loading
2. Upstream Dependency Patching via Platform Hooks
3. Handling Anomalous Model Configurations in Graph Compilation
4. Efficient Client-Server Debugging & VLM Evaluation

Effective Prompts for AI Collaboration
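Workflow 1 above (process isolation for remote code loading) can be sketched with a throwaway child process: if `trust_remote_code` model code crashes or segfaults during a probe, only the child dies, not the serving process. The helper below is an illustration, not the actual vllm-ascend mechanism.

```python
import subprocess
import sys


def probe_remote_code(snippet: str, timeout: float = 60.0) -> bool:
    """Run untrusted/remote model code in an isolated child process.

    Returns True if the snippet imports and executes cleanly. A crash,
    exception, or segfault in the child cannot take down the caller.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        timeout=timeout,
    )
    return proc.returncode == 0


# A broken remote module only fails the probe, never the parent:
ok = probe_remote_code("import math; assert math.pi > 3")
bad = probe_remote_code("raise RuntimeError('bad remote code')")
```

In a bring-up loop this keeps iteration cheap: each candidate fix to the remote model code is probed in a fresh interpreter, so stale module state never leaks between attempts.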
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: liutianyang-2026 <liutianyang@isrc.iscas.ac.cn>
Signed-off-by: liutianyang-2026 <liutianyang@isrc.iscas.ac.cn>
…heck Signed-off-by: Tianyang Liu <liutianyang@isrc.iscas.ac.cn>





What this PR does / why we need it?
This PR adds and validates DeepSeek-OCR-2 support in vllm-ascend, and includes compatibility/stability fixes found during bring-up.
Main changes:
- Skip `QKNormRopeFusionPass` when `rope_dim <= 0`. The DeepSeek-OCR-2 config sets `max_position_embeddings=0`, which can lead to `rope_dim=0`.
- Patch the `LlamaFlashAttention2` import for newer `transformers`, and fix lm-eval failures on DeepSeek models. The model code imports `LlamaFlashAttention2` (though only as a fallback), which is deprecated in later `transformers` versions.
- Add e2e test config: `tests/e2e/models/configs/DeepSeek-OCR-2.yaml`

Why needed:
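The `LlamaFlashAttention2` compatibility fix can be sketched as a symbol-level shim: before the remote model code is loaded, install a stub for the removed class so its fallback import path keeps working. The helper name and the `fake_modeling_llama` module below are illustrative stand-ins (the real patch would target `transformers.models.llama.modeling_llama` via vllm-ascend's platform hooks).

```python
import sys
import types


def patch_missing_symbol(module_name: str, symbol: str) -> None:
    """If `symbol` is absent from `module_name`, install a stub class so
    remote code doing `from <module> import <symbol>` keeps importing."""
    mod = sys.modules.get(module_name)
    if mod is None:
        mod = types.ModuleType(module_name)
        sys.modules[module_name] = mod
    if not hasattr(mod, symbol):
        # Bare stub: the remote model only references it as a fallback,
        # so it never needs a working implementation.
        setattr(mod, symbol, type(symbol, (), {}))


# Demo against a fake module standing in for
# transformers.models.llama.modeling_llama:
patch_missing_symbol("fake_modeling_llama", "LlamaFlashAttention2")
from fake_modeling_llama import LlamaFlashAttention2  # now importable
```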
Does this PR introduce any user-facing change?
Yes. DeepSeek-OCR-2 becomes usable, along with compatibility fixes (newer `transformers`, lm-eval path, graph compilation edge case).

How was this patch tested?
- Added e2e model config: `tests/e2e/models/configs/DeepSeek-OCR-2.yaml`
- Ran CI/lint alignment in a follow-up commit (`apply ci & lint`).

vLLM version: v0.18.0
vLLM main: vllm-project/vllm@35141a7