
Fix OOM in CI by reducing image size of tiny Gemma3 model#5680

Open
albertvillanova wants to merge 1 commit into main from pfix-5207-tiny-gemma3

Conversation

@albertvillanova (Member) commented Apr 29, 2026

Fix OOM in CI by reducing image size of tiny Gemma3 model.

This PR introduces a targeted adjustment for the google/gemma-3-4b-it model in the generate_tiny_models script to address memory usage issues related to image processing.

Partial fix for:

Motivation

The tiny-Gemma3ForConditionalGeneration model was generated with the default SigLIP image size of 896×896, which produces 4,096 patches per image. During training, the vision encoder attention maps have shape [batch, heads, 4096, 4096], consuming ~1 GB per layer. With 2 vision layers and backpropagation, a single Gemma3 test consumes 5–7 GiB of GPU memory. Two such tests running concurrently on a 14.74 GiB GPU caused CUDA out-of-memory errors in all other parallel workers.
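The arithmetic above can be checked back-of-envelope. This is a minimal sketch, assuming the standard SigLIP patch size of 14, 16 attention heads, and fp32 activations (the head and dtype values are illustrative assumptions, not read from the Gemma3 config):

```python
def num_patches(image_size: int, patch_size: int = 14) -> int:
    """Patches per image for a ViT-style encoder (no class token)."""
    side = image_size // patch_size
    return side * side

def attn_map_bytes(patches: int, heads: int = 16, bytes_per_el: int = 4) -> int:
    """Memory for one layer's attention maps of shape [1, heads, patches, patches]."""
    return heads * patches * patches * bytes_per_el

old = num_patches(896)  # default SigLIP resolution -> 4096 patches
new = num_patches(224)  # overridden resolution     -> 256 patches
print(old, new)                        # 4096 256
print(attn_map_bytes(old) / 2**30)     # 1.0 GiB per layer
print(attn_map_bytes(new) / 2**30)     # 0.00390625 GiB (~4 MiB) per layer
```

Under these assumptions the per-layer attention maps shrink from ~1 GiB to ~4 MiB, consistent with the numbers quoted above.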

Solution

Override image_size=224 (256 patches) when generating the tiny Gemma3 model. This is consistent with mm_tokens_per_image=256 in the Gemma3 config: the projector's AvgPool2d gets kernel_size=1 (identity), which is architecturally valid. The processor's image processor size is updated to match so that test inputs are also resized to 224×224.

Changes

Model-specific configuration:

  • For the google/gemma-3-4b-it model, sets vision_config["image_size"] and processor.image_processor.size to 224×224 (instead of the default 896×896). This limits the number of image patches, makes the projector's average-pooling layer act as an identity, and reduces memory consumption during training.
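The override can be sketched as follows. This is a hypothetical simplification: the real change lives in scripts/generate_tiny_models.py, whose structure and helper names may differ.

```python
def apply_tiny_gemma3_override(vision_config: dict,
                               image_processor_size: dict,
                               model_id: str) -> None:
    """Shrink the SigLIP resolution for the tiny Gemma3 test model."""
    if model_id == "google/gemma-3-4b-it":
        vision_config["image_size"] = 224                 # default is 896
        image_processor_size.update(height=224, width=224)

# With patch_size=14 and mm_tokens_per_image=256 (values from the Gemma3
# config), 224 / 14 = 16 patches per side and sqrt(256) = 16 tokens per
# side, so the projector's AvgPool2d kernel is 16 // 16 = 1 (identity).
cfg = {"image_size": 896, "patch_size": 14}
size = {"height": 896, "width": 896}
apply_tiny_gemma3_override(cfg, size, "google/gemma-3-4b-it")
kernel = (cfg["image_size"] // cfg["patch_size"]) // int(256 ** 0.5)
print(cfg["image_size"], size, kernel)  # 224 {'height': 224, 'width': 224} 1
```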

Note

Low Risk
Model-generation script change scoped to a single model ID; it only adjusts test image resolution/config and shouldn’t affect runtime code paths beyond tiny model artifacts.

Overview
Reduces memory usage for the generated tiny Gemma3 vision-language test model by overriding SigLIP image resolution when model_id == "google/gemma-3-4b-it".

scripts/generate_tiny_models.py now sets vision_config["image_size"] = 224 and aligns processor.image_processor.size to 224×224, cutting patch count and preventing CI GPU OOMs during Gemma3 training/tests.

Reviewed by Cursor Bugbot for commit 15c5aff.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec (Member)

Thanks! Can you just confirm with a forward pass + GPU peak-memory measurement for old vs. new?
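One way to run the requested comparison is with PyTorch's CUDA allocator statistics. This is a minimal sketch using a stand-in module rather than the real tiny Gemma3 model (loading it needs network access); the measurement pattern is the same: reset the peak stats, run a forward pass, read the peak.

```python
import torch
import torch.nn as nn

def peak_forward_bytes(model: nn.Module, batch: torch.Tensor) -> int:
    """Peak CUDA memory (bytes) allocated during one forward pass."""
    if torch.cuda.is_available():
        model, batch = model.cuda(), batch.cuda()
        torch.cuda.reset_peak_memory_stats()
        with torch.no_grad():
            model(batch)
        return torch.cuda.max_memory_allocated()
    # CPU fallback: no allocator stats, so report parameter memory instead.
    with torch.no_grad():
        model(batch)
    return sum(p.numel() * p.element_size() for p in model.parameters())

toy = nn.Linear(64, 64)  # stand-in for the old/new tiny Gemma3 models
print(peak_forward_bytes(toy, torch.randn(2, 64)))
```

Running this once with the old 896×896 tiny model and once with the new 224×224 one would give the old-vs-new numbers asked for above.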

@albertvillanova (Member, Author)

Thanks for your sensible suggestion: the difference is small. I'll continue investigating...

@albertvillanova (Member, Author)

I think the intermediate_size should also be reduced: from 4304 to 32.
