Add Qwen 3 Omni#2590

Open
samudraneel05 wants to merge 54 commits into keras-team:master from samudraneel05:qwen-3-omni

Conversation

@samudraneel05 samudraneel05 commented Feb 8, 2026

Description of the change

Add the Qwen3-Omni model to Keras-Hub. It has a thinker-talker architecture. So far, I've tried to implement the thinker component, which is the core text transformer with audio and vision encoder integration.

Reference

Fixes #2413, Fixes #2523, and fixes #2530

Colab Notebook

Checklist

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

I have some follow-up questions, which I'll elaborate on in the comments under this.

@samudraneel05 samudraneel05 changed the title from "Qwen 3 omni" to "Add Qwen 3 Omni" Feb 8, 2026
@sachinprasadhs sachinprasadhs added the "new model" label (For PRs that contribute a new model to the Keras Hub registry.) Feb 9, 2026
Collaborator

@sachinprasadhs sachinprasadhs left a comment

Thanks! I took a look at a few files and left comments. Please address those comments and mark them as resolved once complete. I will perform another review after the comments have been handled.

@samudraneel05
Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is an impressive and well-structured contribution, adding the multimodal Qwen3-Omni model to KerasHub. The code adheres well to the repository's extensive style guide, including modular components, comprehensive tests, and the necessary converters. I have one main piece of feedback regarding code duplication and a related bug in the Qwen3OmniBackbone implementation, which I've detailed in a specific comment. Overall, this is a high-quality pull request.

@samudraneel05
Author

hi, i've addressed the current set of comments. ready for review @sachinprasadhs!

Collaborator

@sachinprasadhs sachinprasadhs left a comment

Thanks for addressing all the comments. Overall it looks good.
I have a few small comments. Would it be possible to attach screenshots showing that the numerics match at the 1e-3 level?
Also, please add a usage example notebook covering the different input types and the expected output format.
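
For reference, a numerics check at the 1e-3 level can be sketched as below. This is a minimal illustration rather than the verification script used in this PR: `check_numerics` is a hypothetical helper, and the random arrays stand in for the reference (for example, Hugging Face) and KerasHub outputs, which are not shown in this thread.

```python
import numpy as np


def check_numerics(reference, candidate, atol=1e-3):
    """Return (max absolute difference, whether it is within atol).

    Hypothetical helper for illustration; not part of KerasHub.
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(
            f"Shape mismatch: {reference.shape} vs {candidate.shape}"
        )
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    return max_abs_diff, max_abs_diff <= atol


# Stand-in data: in a real comparison these would be the outputs of the
# reference model and the ported model on identical inputs.
rng = np.random.default_rng(0)
reference_logits = rng.normal(size=(1, 8, 16)).astype(np.float32)
candidate_logits = reference_logits + 1e-4  # simulated small drift

max_diff, within_tolerance = check_numerics(reference_logits, candidate_logits)
print(f"max abs diff: {max_diff:.2e}, within 1e-3: {within_tolerance}")
```

Reporting the maximum absolute difference alongside the pass/fail flag makes a screenshot of the output more informative than a bare assertion.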

@sachinprasadhs
Collaborator

/gemini review

@sachinprasadhs sachinprasadhs added the "kokoro:force-run" label (Runs Tests on GPU) Mar 8, 2026
@kokoro-team kokoro-team removed the "kokoro:force-run" label (Runs Tests on GPU) Mar 8, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces the Qwen3Omni multimodal model, a significant addition to the library. The implementation is comprehensive, covering the backbone, audio and vision encoders, preprocessors, and the causal language modeling task. The code is well-structured and adheres to the repository's conventions for new model contributions. I have identified a few areas for improvement related to performance and style, which are detailed in the review comments. These include an opportunity to optimize the application of RoPE during cached generation and to make the creation of positional embeddings in the audio encoder more efficient. I also noted a minor deviation from the style guide regarding attribute naming. Overall, this is a well-executed contribution.
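
As general background on the RoPE point, the sketch below shows one common way to avoid redundant work during cached generation: keys already in the cache keep their rotation, and only the new token is rotated at its own position. This is an assumption about the kind of optimization meant, not the code in this PR. It uses plain 1D RoPE with the split-half layout in NumPy, whereas the actual model may use a multimodal RoPE variant, and `apply_rope` is a hypothetical helper.

```python
import numpy as np


def apply_rope(x, positions, base=10000.0):
    """Rotate x of shape (seq_len, head_dim) by position-dependent angles.

    Hypothetical helper using the split-half RoPE layout.
    """
    half = x.shape[-1] // 2
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    angles = np.outer(positions, inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate(
        [x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1
    )


rng = np.random.default_rng(0)
raw_keys = rng.normal(size=(5, 8))  # 5 tokens, head_dim of 8

# Baseline: rotate every key again at each decoding step.
full_keys = apply_rope(raw_keys, np.arange(5))

# Cached path: the first 4 rotated keys are already stored, so only the
# newest token (position 4) needs a rotation.
key_cache = apply_rope(raw_keys[:4], np.arange(4))
new_key = apply_rope(raw_keys[4:5], np.array([4]))
cached_keys = np.concatenate([key_cache, new_key], axis=0)

print(np.allclose(cached_keys, full_keys))
```

Because each row's rotation depends only on that row's position, the cached path produces the same keys while doing a constant amount of rotation work per generated token.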

@sachinprasadhs sachinprasadhs added the "kokoro:force-run" label (Runs Tests on GPU) Mar 9, 2026
@kokoro-team kokoro-team removed the "kokoro:force-run" label (Runs Tests on GPU) Mar 9, 2026
Labels

new model For PRs that contribute a new model to the Keras Hub registry.

Development

Successfully merging this pull request may close these issues.

  • Add Qwen3-Omni to Hub
  • Add vLLM-qwen3-omni-model
  • Add Qwen3-Omni Model

3 participants