Releases: keras-team/keras-hub
v0.26.0
New Models
- Translate Gemma: A multimodal variant of the Gemma 3 model fine-tuned for high-quality machine translation across 55 languages, capable of translating many more languages in both directions, and supporting both text and image inputs.
- SAM3 (Segment Anything Model 3): A next-generation computer vision model that introduces Promptable Concept Segmentation (PCS), allowing for precise object and concept segmentation through text or visual prompts.
- Qwen 2.5 Coder: A code-specialized version of the Qwen 2.5 series, optimized for programming tasks, debugging, and code generation across a wide variety of programming languages.
- Qwen 2.5 Math: A specialized variant of the Qwen 2.5 family designed for advanced mathematical reasoning, capable of solving complex problems with high precision.
- Qwen 3 Coder: An advanced coding Mixture-of-Experts model built on the Qwen 3 MoE architecture, delivering exceptional performance across both programming benchmarks and agentic tasks.
- RWKV7: A high-performance, fully recurrent (100% RNN) architecture featuring linear-time complexity and constant-space inference. By eliminating the KV-cache and standard attention mechanisms, it keeps inference memory flat regardless of sequence length.
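RWKV7's constant-space property can be pictured with a minimal linear-recurrence sketch (illustrative only, not RWKV7's actual update rule, which uses learned decays and richer state dynamics): the recurrent state stays one fixed-size matrix however many tokens are processed, while an attention KV-cache grows with sequence length.

```python
import numpy as np

def linear_recurrent_step(state, k, v, decay=0.9):
    # Fixed-size state update: state is (d, d) no matter how many
    # tokens have been processed -- the "constant-space" property.
    return decay * state + np.outer(k, v)

d, seq_len = 8, 100
rng = np.random.default_rng(0)
state = np.zeros((d, d))
kv_cache = []  # what standard attention would keep around

for _ in range(seq_len):
    k, v = rng.normal(size=d), rng.normal(size=d)
    state = linear_recurrent_step(state, k, v)  # O(1) memory per step
    kv_cache.append((k, v))                     # O(t) memory with attention

print(state.shape)    # (8, 8) -- unchanged after 100 tokens
print(len(kv_cache))  # 100    -- grows with sequence length
```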
Export to Safetensors
- Added Safetensors export support for Gemma3 text models.
- Added Safetensors export support for Qwen text models.
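For reference, the Safetensors file layout these exporters target is simple enough to sketch as a round-trip in pure NumPy (an illustration of the format, not Keras Hub's exporter code; files written by the official `safetensors` library may also carry a `__metadata__` entry, and the tensor name below is made up):

```python
import json
import struct
import tempfile

import numpy as np

def save_safetensors(tensors, path):
    # Safetensors layout: an 8-byte little-endian header length, a JSON
    # header mapping tensor names to dtype/shape/byte offsets, then the
    # concatenated raw tensor data.
    header, chunks, offset = {}, [], 0
    for name, arr in tensors.items():
        data = np.ascontiguousarray(arr, dtype="<f4").tobytes()
        header[name] = {"dtype": "F32", "shape": list(arr.shape),
                        "data_offsets": [offset, offset + len(data)]}
        chunks.append(data)
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(b"".join(chunks))

def load_safetensors(path):
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
        buf = f.read()
    return {name: np.frombuffer(buf[m["data_offsets"][0]:m["data_offsets"][1]],
                                dtype="<f4").reshape(m["shape"])
            for name, m in header.items()}

weights = {"token_embedding/embeddings": np.arange(12, dtype="float32").reshape(3, 4)}
with tempfile.TemporaryDirectory() as d:
    path = f"{d}/model.safetensors"
    save_safetensors(weights, path)
    restored = load_safetensors(path)
print(np.array_equal(restored["token_embedding/embeddings"],
                     weights["token_embedding/embeddings"]))  # True
```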
New Features
- Hugging Face Porting Script: Added an automated script to port any text-only decoder LLM from Hugging Face to the Keras Hub repository.
- AWQ Support: Added support for Activation-aware Weight Quantization (AWQ).
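The core idea behind AWQ can be sketched in a few lines (a simplified illustration of the mechanics, not the implementation added here: real AWQ searches for the best scaling exponent per layer and uses grouped low-bit kernels, and all shapes below are made up):

```python
import numpy as np

def quantize_int4(w):
    # Symmetric per-tensor int4 quantization: values map to [-8, 7],
    # then are dequantized so the error can be inspected directly.
    scale = np.abs(w).max() / 7.0
    return np.clip(np.round(w / scale), -8, 7) * scale

def awq_sketch(w, x, alpha=0.5):
    # Scale each input channel of the weight by a factor derived from its
    # average activation magnitude, quantize, then fold the inverse scale
    # back in, so channels that see large activations keep finer precision.
    # `alpha` controls how strongly activations influence the scales.
    act_mag = np.abs(x).mean(axis=0)
    s = np.maximum(act_mag, 1e-5) ** alpha
    return quantize_int4(w * s[:, None]) / s[:, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 16)) * np.linspace(0.1, 5.0, 16)  # uneven channel scales
w = rng.normal(size=(16, 8))
wq = awq_sketch(w, x)
print(wq.shape)  # (16, 8) -- same layout as the original weight
```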
Bug Fixes and Improvements
- Python 3.13 Compatibility: Made `tensorflow-text` an optional dependency to ensure compatibility with Python 3.13.
- Masking: Fixed masking issues in `TokenAndPositionEmbedding` and improved compatibility with JAX.
- Security: Fixed a safe mode bypass vulnerability in tokenizers.
- Numerical Stability: Fixed a float16 overflow issue in Gemma 3.
Contributors
We would like to thank our contributors for this release:
@Amitavoo, @amitsrivastava78, @divyashreepathihalli, @gaurides, @hertschuh, @james77777778, @jaytiwarihub, @JyotinderSingh, @kharshith-k, @LakshmiKalaKadali, @laxmareddyp, @mattdangerw, @nikolasavic3, @pass-lin, @pctablet505, @sachinprasadhs, @shashaka.
Full Changelog: v0.25.1...v0.26.0
v0.26.0.dev0
What's Changed
- Fix reversible embedding quantization by @pctablet505 in #2476
- Add FunctionGemma checkpoints to kerasHub by @laxmareddyp in #2480
- Models should reference ReversibleEmbedding from Keras core by @JyotinderSingh in #2482
- Revert "Models should reference ReversibleEmbedding from Keras core (… by @divyashreepathihalli in #2491
- Revert "Fix reversible embedding quantization" by @sachinprasadhs in #2487
- Fix caching in workflows. by @hertschuh in #2488
- update future dev version by @sachinprasadhs in #2485
- update python version by @divyashreepathihalli in #2489
- An automated script to port any Text-only decoder LLM model from Hugging Face to Keras Hub repo by @laxmareddyp in #2497
- Add test file for convert_gpt_oss.py by @laxmareddyp in #2499
- Model Export to liteRT by @pctablet505 in #2405
- Fix broken MiTBackbone example in SegFormerBackbone docs by @shashaka in #2503
- Gemma3 text keras hf checkpoint conversion by @kharshith-k in #2433
- Use `subprocess.run` in `pip_build.py` to escape wheel path. by @hertschuh in #2509
- Fix masking in `TokenAndPositionEmbedding` and with JAX. by @hertschuh in #2510
- Add rqvae model by @divyashreepathihalli in #2490
- [Bugfix][Gemma3] Check if the input has image(s) before any image processing by @gaurides in #2508
- ADD RWKV7 by @pass-lin in #2421
- update non_max_supression.py by @pctablet505 in #2506
- Qwen keras model to HF safetensor format by @LakshmiKalaKadali in #2516
- Fix safe mode bypass vulnerability in tokenizers by @amitsrivastava78 in #2517
- Adds support for AWQ to use `get_quantization_layer_structure` hooks by @JyotinderSingh in #2511
- Fix overflow issue in Gemma3 float16 by @divyashreepathihalli in #2519
- Update testcase.py by @pctablet505 in #2512
- Add EDRec by @divyashreepathihalli in #2514
- Enable newly released Med Gemma 1.5 4B variant version to hub by @laxmareddyp in #2524
- Remove keras-hub[nlp] note by @mattdangerw in #2531
- Doc: Fix parameter name typo in BertBackbone docstring by @jaytiwarihub in #2529
- skip test temporarily by @sachinprasadhs in #2539
- Downgrading transformers version to maintain compatibility with codebase by @kharshith-k in #2542
- Map new HF MedGemma model presets to conversion script by @laxmareddyp in #2541
- Fixes Flux model LiteRT export failures by reducing tensor dimensions and optimizing test model size. by @pctablet505 in #2515
- Add SAM3 Promptable Concept Segmentation (PCS) model by @james77777778 in #2534
- Register Qwen2.5-Coder presets by @Amitavoo in #2547
- fix preset convert by @pass-lin in #2556
- Fix exception handling for file errors by @pctablet505 in #2565
- Remove line break from print function call by @pctablet505 in #2564
- Update Kaggle Models link to Keras organization page by @laxmareddyp in #2560
- Perf: Pass None for missing images in Gemma3 to resolve TODO by @jaytiwarihub in #2535
- Add LiteRT support for SAM3 by @james77777778 in #2563
- Add Qwen 2.5 Math model presets and checkpoint conversion support by @laxmareddyp in #2566
- Temporary skip for failing litert test by @pctablet505 in #2568
- Add Qwen3 Coder preset and improve tokenizer robustness. by @laxmareddyp in #2567
- LiteRT export tests temporarily disabled by @pctablet505 in #2573
- Changes to the Gemma3 backbone for Embedding Gemma model by @laxmareddyp in #2536
- Add Gemma3 Embedding Model Presets to Hub by @laxmareddyp in #2558
- Revert "Perf: Pass None for missing images in Gemma3 to resolve TODO … by @divyashreepathihalli in #2576
- Revert "Add Gemma3 Embedding Model Presets to Hub" by @laxmareddyp in #2577
- Make tensorflow-text an optional dependency for Python 3.13 compatibility by @nikolasavic3 in #2528
- Add Translate Gemma checkpoint conversion by @sachinprasadhs in #2575
- Register translate gemma presets by @sachinprasadhs in #2578
- register SAM3 & RWKV presets by @sachinprasadhs in #2554
New Contributors
- @shashaka made their first contribution in #2503
- @kharshith-k made their first contribution in #2433
- @gaurides made their first contribution in #2508
- @LakshmiKalaKadali made their first contribution in #2516
- @jaytiwarihub made their first contribution in #2529
- @Amitavoo made their first contribution in #2547
Full Changelog: v0.25.1...v0.26.0.dev0
v0.25.1
What's Changed
- Fix float16 overflow in Gemma3 by addressing precision-related instabilities.
Full Changelog: v0.25.0...v0.25.1
v0.25.0
Summary:
New Models:
We've integrated new open-weight models to expand the capabilities of KerasHub, featuring specialized tools for function calling and safety, as well as high-performance open-source reasoning models:
- FunctionGemma: We have added support for `FunctionGemma`, a lightweight model from Google built on the `Gemma 3 270M` architecture. Designed specifically for text-only function calling, this model is optimized for single-turn scenarios and deployment in resource-constrained environments.
- GPT OSS: We have integrated OpenAI's `gpt-oss` family, including the 20B and 120B parameter variants. These models utilize a Mixture-of-Experts (MoE) architecture with a 128k-token context window, optimized for STEM, coding, and general reasoning tasks.
- GPT OSS Safeguard: A new open-weight safety reasoning model from OpenAI. Built upon the `GPT OSS` architecture, it enables adaptable content classification and input-output filtering based on custom safety policies.
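The top-k expert routing at the heart of a Mixture-of-Experts layer can be sketched minimally (illustrative only, not the `gpt-oss` gating code; every shape and name below is made up): a gating network scores all experts per token, only the top-k experts run, and their outputs are blended by renormalized gate probabilities.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    # Score every expert per token, keep the top_k, softmax their
    # scores, and mix only those experts' outputs.
    logits = x @ gate_w                            # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]  # indices of chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = logits[t, top[t]]
        probs = np.exp(chosen - chosen.max())
        probs /= probs.sum()                       # renormalize over top_k
        for p, e in zip(probs, top[t]):
            out[t] += p * (x[t] @ experts[e])      # only top_k experts run
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 5
x = rng.normal(size=(tokens, d))
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts)
print(y.shape)  # (5, 8)
```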
What's Changed
- Add Gemma3 Conversion script to port weights from HF directly by @laxmareddyp in #2445
- Fix ESM attention for TFLite compatibility by @pctablet505 in #2466
- Fix PARSeq decoder for TFLite compatibility by @pctablet505 in #2467
- Fix SAM tests to make it work with Keras master by @abheesht17 in #2469
- Generated GPT_OSS model files through porter script. by @laxmareddyp in #2384
- Register OpenAI GPT-OSS and GPT-OSS-SAFEGUARD Presets to kerashub. by @laxmareddyp in #2473
- Update latest synced models by @sachinprasadhs in #2475
- Version bump to 0.25.0.dev0 by @sachinprasadhs in #2478
- Version bump 0.25.0 by @sachinprasadhs in #2481
Full Changelog: v0.24.0...v0.25.0
v0.25.0.dev0
What's Changed
- Add Gemma3 Conversion script to port weights from HF directly by @laxmareddyp in #2445
- Fix ESM attention for TFLite compatibility by @pctablet505 in #2466
- Fix PARSeq decoder for TFLite compatibility by @pctablet505 in #2467
- Fix SAM tests to make it work with Keras master by @abheesht17 in #2469
- Generated GPT_OSS model files through porter script. by @laxmareddyp in #2384
- Register OpenAI GPT-OSS and GPT-OSS-SAFEGUARD Presets to kerashub. by @laxmareddyp in #2473
- Update latest synced models by @sachinprasadhs in #2475
- Version bump to 0.25.0.dev0 by @sachinprasadhs in #2478
Full Changelog: v0.24.0...v0.25.0.dev0
v0.24.0
Summary:
New Models:
We've integrated new models and presets to expand the capabilities of KerasHub:
- DINOv3: We have added the DINOv3 model architecture and registered its corresponding presets.
- MedGemma & MedSigLIP: New presets have been registered for MedGemma and MedSigLIP, bringing specialized capabilities for medical domain tasks.
- Qwen3 Embeddings: We have registered embedding presets for the Qwen3 model family.
Improvements & Enhancements
This update includes infrastructure improvements and fixes:
- GPTQ Quantization Hooks: Added `get_quantization_layer_structure` hooks to facilitate GPTQ quantization support.
- TensorFlow Compatibility: Fixed `tensorflow-text` imports to ensure they do not break core TensorFlow functionality.
- Gemini CLI Workflow: Introduced a new workflow to support co-working with the Gemini CLI.
What's Changed
- Fix tensorflow-text import to not break core tensorflow functionality by @nikolasavic3 in #2448
- Hf mirror sync for r 0.23 by @sachinprasadhs in #2451
- Set dev version to 0.24.0.dev0 by @sachinprasadhs in #2447
- Register MedGemma, MedSigLIP Presets to kerashub by @laxmareddyp in #2450
- Add DINOV3 with assistance from the Gemini CLI. by @james77777778 in #2444
- Add the workflow for co-working with the Gemini CLI. by @james77777778 in #2453
- Set JAX and Tensorflow GPU timeouts to 2.5 hours by @buildwithsuhana in #2439
- mark preset test to extra large to skip GPU testing by @sachinprasadhs in #2458
- add the default reviewer by @sachinprasadhs in #2460
- Register Qwen3 Embedding Presets to Kerashub by @laxmareddyp in #2455
- Add Presets,Checkpoint conversion for SmolLM3 model by @laxmareddyp in #2461
- Adds get_quantization_layer_structure hooks for GPTQ by @JyotinderSingh in #2462
- register dino v3 presets by @sachinprasadhs in #2463
- update release version by @sachinprasadhs in #2465
New Contributors
- @nikolasavic3 made their first contribution in #2448
Full Changelog: v0.23.0...v0.24.0
v0.24.0.dev0
What's Changed
- Fix tensorflow-text import to not break core tensorflow functionality by @nikolasavic3 in #2448
- Hf mirror sync for r 0.23 by @sachinprasadhs in #2451
- Set dev version to 0.24.0.dev0 by @sachinprasadhs in #2447
- Register MedGemma, MedSigLIP Presets to kerashub by @laxmareddyp in #2450
- Add DINOV3 with assistance from the Gemini CLI. by @james77777778 in #2444
- Add the workflow for co-working with the Gemini CLI. by @james77777778 in #2453
- Set JAX and Tensorflow GPU timeouts to 2.5 hours by @buildwithsuhana in #2439
- mark preset test to extra large to skip GPU testing by @sachinprasadhs in #2458
- add the default reviewer by @sachinprasadhs in #2460
- Register Qwen3 Embedding Presets to Kerashub by @laxmareddyp in #2455
- Add Presets,Checkpoint conversion for SmolLM3 model by @laxmareddyp in #2461
- Adds get_quantization_layer_structure hooks for GPTQ by @JyotinderSingh in #2462
- register dino v3 presets by @sachinprasadhs in #2463
New Contributors
- @nikolasavic3 made their first contribution in #2448
Full Changelog: v0.23.0...v0.24.0.dev0
v0.23.0
Summary:
New Models:
We've integrated a range of cutting-edge models, each designed to tackle specific challenges in their respective domains:
- Cell2Sentence: A single-cell, biology-aware model built on the Gemma-2 architecture, designed to interpret complex biological data.
- T5Gemma: A new encoder-decoder model, ideal for sequence-to-sequence tasks like translation and summarization.
- PARSeq: An end-to-end, ViT-based model for scene text recognition (STR), excelling at reading text in natural images.
- D-FINE: A high-performance, real-time object detection model.
- DepthAnythingV2: A monocular depth estimation (MDE) model trained on a combination of synthetic labeled data and real-world unlabeled images.
- Qwen3 Moe: The largest language model in the Qwen series, utilizing a Mixture-of-Experts (MoE) architecture for enhanced performance and efficiency.
- MobileNetV5: A state-of-the-art vision encoder specifically designed for high-efficiency AI on edge devices.
- SmolLM3: A compact yet powerful language model excelling in reasoning, long-context understanding, and multilingual capabilities.
Improvements & Enhancements
This update also includes several key improvements to enhance the platform's stability, compatibility, and flexibility:
- `export_to_transformers`: You can now export trainable models, tokenizers, and configurations directly into the Hugging Face Transformers format using `export_to_transformers`. This feature is currently available for Gemma models, with support for more architectures coming soon.
- OpenVINO Backend Support: We've integrated OpenVINO inference support, enabling optimized inference for Mistral, Gemma, and GPT-2 models.
- Bidirectional Attention Mask: Gemma models now support a bidirectional attention mask, enabling more effective fine-tuning on tasks that require understanding the full context of a sequence.
- CLIP & SD3 Model Refactor: The CLIP and Stable Diffusion 3 models have been refactored to improve numerical stability. Updated checkpoints are now available to ensure seamless and reliable performance.
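The difference between the default causal mask and a bidirectional mask can be sketched as follows (an illustration of the masking idea only, not the Gemma implementation):

```python
import numpy as np

def causal_mask(seq_len):
    # Token i may attend only to positions <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def bidirectional_mask(seq_len, padding_mask=None):
    # Every non-padding token attends to every non-padding token,
    # which suits embedding-style fine-tuning where full context helps.
    mask = np.ones((seq_len, seq_len), dtype=bool)
    if padding_mask is not None:
        mask &= padding_mask[None, :] & padding_mask[:, None]
    return mask

c = causal_mask(4)
b = bidirectional_mask(4, padding_mask=np.array([True, True, True, False]))
print(c.sum())  # 10: only the lower-triangular entries are visible
print(b.sum())  # 9: the 3x3 block of real (non-padding) tokens
```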
What's Changed
- Register tiny Gemma presets by @sachinprasadhs in #2360
- Update fixed preset version for gemma3 by @sachinprasadhs in #2362
- Add generic export_to_transformers to the base classes by @Bond099 in #2346
- update version file in master by @sachinprasadhs in #2361
- add styleguide for GCA code reviews by @divyashreepathihalli in #2366
- Update styleguide.md by @divyashreepathihalli in #2370
- Add T5Gemma to KerasHub by @harshaljanjani in #2339
- Allow passing flexible positions to positional embedding layers by @abheesht17 in #2369
- Supports Loading Quantized Models with `from_preset()` by @JyotinderSingh in #2367
- PARSeq Model by @sineeli in #2089
- Add D-FINE to KerasHub by @harshaljanjani in #2318
- Fixing dtype issue by @buildwithsuhana in #2372
- quantize(...) should accept a config object by @JyotinderSingh in #2388
- [OpenVINO backend] Adding support for OpenVINO backend & support inference for Mistral & Gemma & GPT2 by @Mohamed-Ashraf273 in #2350
- minor modify by @pass-lin in #2386
- Add bidirectional attention mask for EmbeddingGemma by @abheesht17 in #2382
- Fixes by @buildwithsuhana in #2395
- Disable DINO quantisation checks by @abheesht17 in #2397
- Introduce D-FINE model presets in KerasHub by @harshaljanjani in #2376
- Introduce T5Gemma model presets in KerasHub by @harshaljanjani in #2373
- Update CLIP presets by @abheesht17 in #2400
- Fix Gemma OpenVINO tests by @abheesht17 in #2402
- Adds support for gemma_270m to checkpoint converter by @JyotinderSingh in #2380
- [internal] Reorder @pytest.mark.large decorator to fix CI by @JyotinderSingh in #2410
- Update preset map for VGG model by @sonali-kumari1 in #2411
- Update preset map for T5 model by @sonali-kumari1 in #2414
- Update preset map values for cspnet by @dhantule in #2416
- Add DepthAnythingV2. by @james77777778 in #2377
- Add Qwen3 Moe by @kanpuriyanawab in #2260
- update hf checkpoints list by @sachinprasadhs in #2381
- Patch conversion script qwen3 moe by @kanpuriyanawab in #2425
- update SD3 & 3.5 presets by @sachinprasadhs in #2417
- Add and Register the Qwen3_MoE Presets to Hub by @laxmareddyp in #2429
- Add MobileNetV5 to KerasHub by @harshaljanjani in #2399
- For sharded weights let's not delete explicitly by @amitsrivastava78 in #2431
- Update Keras min Test version to 3.9 by @sachinprasadhs in #2434
- Overrides `_post_quantize` to reset `generate_function` graph after quantization by @JyotinderSingh in #2436
- Handles incompatible quantization mode for ReversibleEmbedding by @JyotinderSingh in #2435
- extend PR stale and closure time by @sachinprasadhs in #2437
- register depth anything presets by @sachinprasadhs in #2420
- [SmolLM3] Add Backbone, CausalLM + Converter for HuggingFace Weights by @DavidLandup0 in #2327
- Register Cell2Sentence Presets by @laxmareddyp in #2442
- register parseq preset by @sachinprasadhs in #2438
- register mobilenet presets by @sachinprasadhs in #2443
- update release version by @sachinprasadhs in #2446
New Contributors
- @buildwithsuhana made their first contribution in #2372
- @Mohamed-Ashraf273 made their first contribution in #2350
- @dhantule made their first contribution in #2416
- @amitsrivastava78 made their first contribution in #2431
Full Changelog: v0.22.2...v0.23.0
v0.23.0.dev0
What's Changed
- Register tiny Gemma presets by @sachinprasadhs in #2360
- Update fixed preset version for gemma3 by @sachinprasadhs in #2362
- Add generic export_to_transformers to the base classes by @Bond099 in #2346
- update version file in master by @sachinprasadhs in #2361
- add styleguide for GCA code reviews by @divyashreepathihalli in #2366
- Update styleguide.md by @divyashreepathihalli in #2370
- Add T5Gemma to KerasHub by @harshaljanjani in #2339
- Allow passing flexible positions to positional embedding layers by @abheesht17 in #2369
- Supports Loading Quantized Models with `from_preset()` by @JyotinderSingh in #2367
- PARSeq Model by @sineeli in #2089
- Add D-FINE to KerasHub by @harshaljanjani in #2318
- Fixing dtype issue by @buildwithsuhana in #2372
- quantize(...) should accept a config object by @JyotinderSingh in #2388
- [OpenVINO backend] Adding support for OpenVINO backend & support inference for Mistral & Gemma & GPT2 by @Mohamed-Ashraf273 in #2350
- minor modify by @pass-lin in #2386
- Add bidirectional attention mask for EmbeddingGemma by @abheesht17 in #2382
- Fixes by @buildwithsuhana in #2395
- Disable DINO quantisation checks by @abheesht17 in #2397
- Introduce D-FINE model presets in KerasHub by @harshaljanjani in #2376
- Introduce T5Gemma model presets in KerasHub by @harshaljanjani in #2373
- Update CLIP presets by @abheesht17 in #2400
- Fix Gemma OpenVINO tests by @abheesht17 in #2402
- Adds support for gemma_270m to checkpoint converter by @JyotinderSingh in #2380
- [internal] Reorder @pytest.mark.large decorator to fix CI by @JyotinderSingh in #2410
- Update preset map for VGG model by @sonali-kumari1 in #2411
- Update preset map for T5 model by @sonali-kumari1 in #2414
- Update preset map values for cspnet by @dhantule in #2416
- Add DepthAnythingV2. by @james77777778 in #2377
- Add Qwen3 Moe by @kanpuriyanawab in #2260
- update hf checkpoints list by @sachinprasadhs in #2381
- Patch conversion script qwen3 moe by @kanpuriyanawab in #2425
- update SD3 & 3.5 presets by @sachinprasadhs in #2417
- Add and Register the Qwen3_MoE Presets to Hub by @laxmareddyp in #2429
- Add MobileNetV5 to KerasHub by @harshaljanjani in #2399
- For sharded weights let's not delete explicitly by @amitsrivastava78 in #2431
- Update Keras min Test version to 3.9 by @sachinprasadhs in #2434
- Overrides `_post_quantize` to reset `generate_function` graph after quantization by @JyotinderSingh in #2436
- Handles incompatible quantization mode for ReversibleEmbedding by @JyotinderSingh in #2435
- extend PR stale and closure time by @sachinprasadhs in #2437
- register depth anything presets by @sachinprasadhs in #2420
- [SmolLM3] Add Backbone, CausalLM + Converter for HuggingFace Weights by @DavidLandup0 in #2327
- Register Cell2Sentence Presets by @laxmareddyp in #2442
- register parseq preset by @sachinprasadhs in #2438
- register mobilenet presets by @sachinprasadhs in #2443
New Contributors
- @buildwithsuhana made their first contribution in #2372
- @Mohamed-Ashraf273 made their first contribution in #2350
- @dhantule made their first contribution in #2416
- @amitsrivastava78 made their first contribution in #2431
Full Changelog: v0.22.2...v0.23.0.dev0
v0.22.2
New Model: VaultGemma
VaultGemma is a 1-billion-parameter, 26-layer, text-only decoder model trained with sequence-level differential privacy (DP).
Derived from Gemma 2, its architecture notably drops the norms after the Attention and MLP blocks and uses full attention for all layers, rather than alternating with local sliding attention.
The pretrained model is available with a 1024-token sequence length.
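The full-attention versus local sliding-attention distinction can be sketched as a mask comparison (illustrative only, not the Gemma 2 or VaultGemma code; the sequence length and window size below are made up):

```python
import numpy as np

def full_causal_mask(seq_len):
    # Every token sees all earlier positions.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def sliding_window_mask(seq_len, window):
    # Token i sees only the last `window` positions, inclusive of i.
    i = np.arange(seq_len)
    return (i[None, :] <= i[:, None]) & (i[:, None] - i[None, :] < window)

full = full_causal_mask(6)
local = sliding_window_mask(6, window=3)
print(full.sum())   # 21: all lower-triangular entries
print(local.sum())  # 15: each row capped at 3 visible positions
```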
What's Changed
- Add DP research model by @sachinprasadhs in #2396
Full Changelog: v0.22.1...v0.22.2