Vectorize scatter operation in NumPy backend#22218

Open
0xRozier wants to merge 3 commits into keras-team:master from 0xRozier:fix/issue-22208-vectorize-scatter-numpy-backend

Conversation

@0xRozier commented Feb 19, 2026

Summary

  • Replace the Python for loop in scatter() with NumPy's vectorized np.add.at, yielding ~87x speedup for large-scale scatter operations (e.g. 10^6 updates)
  • The change is minimal (3 lines removed, 2 added) and follows the same pattern already used by scatter_update() in the same file

Details

The current implementation iterates through each index with a Python loop:

for i in range(indices.shape[0]):
    index = indices[i]
    zeros[tuple(index)] += values[i]

This bypasses NumPy's internal C-optimized loops. The fix replaces it with:

idx = tuple(indices.T)
np.add.at(zeros, idx, values)

np.add.at correctly handles duplicate indices via cumulative addition, maintaining full compatibility with existing behavior.
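The duplicate-index behavior described above can be checked with a small standalone sketch (the array shapes here are arbitrary, chosen only for illustration):

```python
import numpy as np

# Three updates into a 4x3 array, with a duplicate index (0, 1)
# whose contributions must accumulate.
indices = np.array([[0, 1], [2, 0], [0, 1]])
values = np.array([1.0, 2.0, 3.0])

# Loop version (the old implementation):
expected = np.zeros((4, 3))
for i in range(indices.shape[0]):
    expected[tuple(indices[i])] += values[i]

# Vectorized version (the new implementation):
result = np.zeros((4, 3))
np.add.at(result, tuple(indices.T), values)

# Both accumulate the duplicate: position (0, 1) receives 1.0 + 3.0 = 4.0.
assert np.array_equal(result, expected)
```

Note that plain fancy-index assignment (`zeros[idx] += values`) would *not* work here: NumPy buffers the reads, so duplicate indices would keep only the last update instead of accumulating. `np.add.at` performs unbuffered in-place addition, which is exactly the semantics the loop had.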

Benchmark (1M updates on a 1000x1000 array):

  • Before: ~3.86s
  • After: ~0.04s
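A rough way to reproduce a benchmark of this kind (sizes scaled down from the PR's 10^6 updates so the loop version finishes quickly; absolute timings will vary by machine):

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # scaled down from the PR's 1_000_000 updates
indices = rng.integers(0, 1000, size=(n, 2))
values = rng.random(n)

def loop_scatter():
    # Old implementation: one Python-level iteration per update.
    out = np.zeros((1000, 1000))
    for i in range(indices.shape[0]):
        out[tuple(indices[i])] += values[i]
    return out

def vectorized_scatter():
    # New implementation: single unbuffered accumulate in C.
    out = np.zeros((1000, 1000))
    np.add.at(out, tuple(indices.T), values)
    return out

# Sanity check: both implementations agree before timing them.
assert np.allclose(loop_scatter(), vectorized_scatter())

print("loop:      ", timeit.timeit(loop_scatter, number=3))
print("vectorized:", timeit.timeit(vectorized_scatter, number=3))
```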

Fixes #22208

Test plan

  • All 6 existing scatter-related tests pass (CoreOpsDynamicShapeTest, CoreOpsStaticShapeTest, CoreOpsCorrectnessTest)
  • Verified duplicate index accumulation works correctly
  • Verified 1D, 2D, 3D, and slice scatter cases

@google-cla bot commented Feb 19, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @0xRozier, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance of the scatter operation within the NumPy backend by replacing a Python for loop with NumPy's vectorized np.add.at function. This change dramatically improves execution speed for large-scale data manipulations, ensuring more efficient tensor operations without altering existing behavior or compatibility.

Highlights

  • Performance Improvement: Replaced the Python for loop in the scatter() function with NumPy's vectorized np.add.at, yielding an ~87x speedup for large-scale scatter operations (e.g., 10^6 updates).
  • Code Simplification: Achieved significant performance gains with minimal code changes (3 lines removed, 2 added), aligning with the pattern already used by scatter_update() in the same file.


Changelog
  • keras/src/backend/numpy/core.py
    • Replaced the iterative for loop implementation of scatter with a vectorized np.add.at call.
Activity
  • Addressed issue [Performance] Vectorize scatter operation in NumPy backend #22208.
  • Verified all 6 existing scatter-related tests passed, including CoreOpsDynamicShapeTest, CoreOpsStaticShapeTest, and CoreOpsCorrectnessTest.
  • Confirmed correct handling of duplicate index accumulation.
  • Validated 1D, 2D, 3D, and slice scatter cases.
  • The pull request was generated with Claude Code.

@gemini-code-assist bot left a comment
Code Review

This is an excellent change that significantly improves the performance of the scatter operation in the NumPy backend. By replacing the Python for loop with the vectorized np.add.at function, the implementation becomes much more efficient, idiomatic, and concise. The change correctly maintains the behavior of accumulating values for duplicate indices and aligns well with similar patterns found in scatter_update within the same file. The performance gains described are substantial, making this a valuable optimization.

@codecov-commenter commented Feb 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.90%. Comparing base (1a0655b) to head (12a6f80).
⚠️ Report is 75 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #22218      +/-   ##
==========================================
+ Coverage   82.89%   82.90%   +0.01%     
==========================================
  Files         593      594       +1     
  Lines       64169    65843    +1674     
  Branches    10073    10292     +219     
==========================================
+ Hits        53192    54589    +1397     
- Misses       8385     8638     +253     
- Partials     2592     2616      +24     
Flag Coverage Δ
keras 82.73% <100.00%> (+0.01%) ⬆️
keras-jax 60.94% <0.00%> (-1.10%) ⬇️
keras-numpy 55.11% <100.00%> (-1.07%) ⬇️
keras-openvino 49.09% <0.00%> (+11.22%) ⬆️
keras-tensorflow 62.16% <0.00%> (-1.14%) ⬇️
keras-torch 61.01% <0.00%> (-1.11%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.

Replace the Python for-loop in `scatter()` with NumPy's `np.add.at`
for vectorized index accumulation. This yields ~87x speedup for
large-scale scatter operations (e.g. 10^6 updates).

Fixes keras-team#22208

Sort the axis list in `RMSNormalization.build()` and in
`_rms_normalization()` so that unsorted axes like `[-1, -2]` produce
the same `normalized_shape` and scale shape as `[-2, -1]`.

Adds a test covering unsorted contiguous axes.
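For context, the axis-order issue in that commit can be sketched independently of Keras (the `normalized_shape` helper below is hypothetical, not the actual `RMSNormalization.build()` code; it only illustrates why sorting the axis list makes the derived shape order-independent):

```python
# Deriving a scale/normalized shape from an axis list. Without sorting,
# [-1, -2] and [-2, -1] would index the input shape in different orders
# and yield differently ordered shapes.
input_shape = (8, 16, 32)

def normalized_shape(shape, axis):
    # Resolve negative axes, then sort so the axis order is canonical.
    resolved = sorted(a % len(shape) for a in axis)
    return tuple(shape[a] for a in resolved)

# Both orderings now produce the same shape.
assert (
    normalized_shape(input_shape, [-1, -2])
    == normalized_shape(input_shape, [-2, -1])
    == (16, 32)
)
```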
@0xRozier force-pushed the fix/issue-22208-vectorize-scatter-numpy-backend branch from ab20f3a to 89021f4 on February 19, 2026, 17:45
@0xRozier (Author) commented

@hertschuh, if you could take a quick look, that would be great (no rush, take your time)

@hertschuh (Collaborator) commented

There seems to be an unrelated fix with RMS normalization, should that be a separate PR?

@0xRozier (Author) commented Mar 2, 2026

You're right, the RMS normalization fix is unrelated. I can remove it from this PR and submit it as a separate one — let me know if you'd prefer that.

For context: it addresses a minor bug where passing unsorted axes (e.g. axis=[-1, -2]) to RMSNormalization produces an incorrect scale shape. Happy to open a dedicated issue and PR for it.

@hertschuh (Collaborator) commented

You're right, the RMS normalization fix is unrelated. I can remove it from this PR and submit it as a separate one — let me know if you'd prefer that.

For context: it addresses a minor bug where passing unsorted axes (e.g. axis=[-1, -2]) to RMSNormalization produces an incorrect scale shape. Happy to open a dedicated issue and PR for it.

Yes, please separate the RMSNormalization fix and remove it from this PR.

For one thing, there are already 2 other PRs addressing the same RMSNormalization issue.

@0xRozier (Author) commented Mar 4, 2026

Done — I've removed the RMSNormalization fix from this PR. It now only contains the scatter vectorization change.



Development

Successfully merging this pull request may close these issues.

[Performance] Vectorize scatter operation in NumPy backend

4 participants