optim: split factory registry and strat logics for factory api by art-test-stack · Pull Request #6 · art-test-stack/gpt-lab

art-test-stack · 2026-05-13T09:41:15Z

No description provided.

Copilot

Pull request overview

This PR refactors the optimizer implementation into a registry/strategy architecture, separating optimizer validation, local/distributed execution backends, concrete optimizer strategies, and fused kernels.

Changes:

Replaces the monolithic optimizer factory with registered optimizer specs and strategy-based AdamW/Muon execution.
Adds new optimizer kernel modules, including relocated AdamW/Muon kernels and additional Shampoo/Adahessian implementations.
Updates README optimization documentation to describe the new architecture and extension flow.
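
To make the registry side of this change concrete, here is a minimal sketch of what "registered optimizer specs" could look like. It is not taken from the PR: `OptimizerSpec`, `register_optimizer`, `get_optimizer_spec`, the placeholder strategy classes, and the hyperparameter names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class OptimizerSpec:
    """Hypothetical spec: a name, required hyperparameters, and a strategy factory."""
    name: str
    required_keys: Tuple[str, ...]
    make_strategy: Callable[[], object]


# Hypothetical module-level registry keyed by optimizer name.
_REGISTRY: Dict[str, OptimizerSpec] = {}


def register_optimizer(spec: OptimizerSpec) -> None:
    """Add a spec to the registry, rejecting duplicate names."""
    if spec.name in _REGISTRY:
        raise ValueError(f"optimizer {spec.name!r} is already registered")
    _REGISTRY[spec.name] = spec


def get_optimizer_spec(name: str) -> OptimizerSpec:
    """Look up a spec, listing the known names on failure."""
    try:
        return _REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY)) or "<none>"
        raise KeyError(f"unknown optimizer {name!r}; registered: {known}") from None


class AdamWStrategy:
    """Placeholder; a real strategy would hold the AdamW update logic."""


class MuonStrategy:
    """Placeholder; a real strategy would hold the Muon update logic."""


# Built-in optimizers are registered once, at import time in this sketch.
register_optimizer(OptimizerSpec("adamw", ("lr", "weight_decay"), AdamWStrategy))
register_optimizer(OptimizerSpec("muon", ("lr", "momentum"), MuonStrategy))
```

With this shape, a factory only needs `get_optimizer_spec(name).make_strategy()` and never branches on optimizer names itself, which is the usual motivation for replacing a monolithic factory with a registry.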

Reviewed changes

Copilot reviewed 7 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `src/gpt_lab/optim/factory.py` | Registers built-in optimizers and delegates stepping to local or distributed backends. |
| `src/gpt_lab/optim/registry.py` | Adds optimizer spec registry and parameter-group validation. |
| `src/gpt_lab/optim/strategy.py` | Adds scalar cache, strategy interface, and local/distributed backend orchestration. |
| `src/gpt_lab/optim/strategies/__init__.py` | Exports concrete optimizer strategies. |
| `src/gpt_lab/optim/strategies/adamw.py` | Adds AdamW local and distributed strategy implementation. |
| `src/gpt_lab/optim/strategies/muon.py` | Adds Muon local and distributed strategy implementation. |
| `src/gpt_lab/optim/kernels/adamw.py` | Adds compiled fused AdamW step kernel. |
| `src/gpt_lab/optim/kernels/muon.py` | Adds compiled fused Muon step kernel. |
| `src/gpt_lab/optim/kernels/shampoo.py` | Adds Shampoo optimizer implementation. |
| `src/gpt_lab/optim/kernels/adahessian.py` | Adds Adahessian optimizer implementation. |
| `src/gpt_lab/optim/kernels/__init__.py` | Adds optimizer kernels package marker. |
| `README.md` | Documents the registry/strategy optimizer architecture and usage examples. |
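
The split between a strategy interface and local/distributed backends described for `strategy.py` can be illustrated with a small, framework-free sketch. Everything here is hypothetical (`OptimizerStrategy`, `SGDStrategy`, `run_step`), parameters are plain floats instead of tensors, and the "distributed" path only imitates a gradient all-reduce by averaging. It shows the dispatch idea, not the PR's implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class OptimizerStrategy(ABC):
    """Hypothetical interface: one update rule, two execution modes."""

    @abstractmethod
    def step_local(self, params: List[float], grads: List[float], group: Dict) -> List[float]:
        """Apply the update rule on a single device."""

    def step_distributed(
        self, params: List[float], grads: List[float], group: Dict, world_size: int
    ) -> List[float]:
        # Stand-in for an all-reduce: gradients are treated as per-rank sums
        # and averaged before the same local update rule runs.
        averaged = [g / world_size for g in grads]
        return self.step_local(params, averaged, group)


class SGDStrategy(OptimizerStrategy):
    """Plain SGD, chosen only to keep the sketch small."""

    def step_local(self, params, grads, group):
        return [p - group["lr"] * g for p, g in zip(params, grads)]


def run_step(strategy, params, grads, group, world_size=1):
    """Pick the backend from the world size; the strategy itself is unchanged."""
    if world_size > 1:
        return strategy.step_distributed(params, grads, group, world_size)
    return strategy.step_local(params, grads, group)
```

The point of the design is that the update rule lives in one place (`step_local` here), while the choice between single-device and distributed execution is made by the caller.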

+
+#### Adding a New Optimizer
+
+To add a new optimizer (e.g., Aurora), you only need to edit **one place**:


+The optimizer system uses a **registry-based, strategy-pattern architecture** that decouples optimizer logic from execution mode (single-GPU vs. distributed). Optimizers are built and configured by calling `DenseTransformer.build_optimizer` in the `Trainer` class (available in [`gpt_lab.train.trainer`](./src/gpt_lab/train/trainer.py)) using the optimizer configuration from [`configs/optim.yaml`](./configs/optim.yaml), which can specify mixed optimizer groups.
+
+> [!WARNING]
+> This is perhaps the most critical part of the library for model training, and it is also the part I implemented least myself. I used several external repositories as a code baseline and went back and forth with LLMs to improve it. My goal was to make it work while keeping it modular. However, my understanding of optimization algorithms, combined with `torch.compile` and distributed training, is quite limited. So I encourage you to check the code in [`gpt_lab.optim.factory`](./src/gpt_lab/optim/factory.py) and the corresponding subfolders for the different optimizers.


+   optimizer:
+     - opt: aurora
+       lr: 1e-4
+       momentum: 0.9
+       weight_decay: 0.0
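
As a rough illustration of how a group like the one above might be checked before an optimizer is built, the sketch below validates a list of parsed config groups against required keys. The `REQUIRED_KEYS` table and `validate_groups` are assumptions, not the PR's `registry.py` code. The `float(...)` coercion is included because some YAML loaders (PyYAML among them) read `1e-4` without a decimal point as a string. Whether that applies depends on the loader the project uses.

```python
from typing import Dict, Iterable, List, Mapping, Tuple

# Hypothetical required scalar hyperparameters per optimizer name.
REQUIRED_KEYS: Dict[str, Tuple[str, ...]] = {
    "adamw": ("lr", "weight_decay"),
    "aurora": ("lr", "momentum", "weight_decay"),
}


def validate_groups(groups: Iterable[Mapping]) -> List[dict]:
    """Check each group names a known optimizer and supplies its required keys."""
    validated = []
    for index, group in enumerate(groups):
        name = group.get("opt")
        if name not in REQUIRED_KEYS:
            raise ValueError(f"group {index}: unknown optimizer {name!r}")
        missing = [key for key in REQUIRED_KEYS[name] if key not in group]
        if missing:
            raise ValueError(f"group {index} ({name}): missing {missing}")
        clean = dict(group)
        for key in REQUIRED_KEYS[name]:
            # Coerce values such as the string "1e-4" to a float.
            clean[key] = float(clean[key])
        validated.append(clean)
    return validated


# The YAML fragment above, written out as the Python structure it would parse to.
config = [{"opt": "aurora", "lr": "1e-4", "momentum": 0.9, "weight_decay": 0.0}]
groups = validate_groups(config)
```

Failing early with the group index and the missing keys keeps configuration mistakes out of the training loop, where they are harder to trace.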


art-test-stack · 2026-05-28T08:00:59Z

@tanguyguyot
review plz

art-test-stack added 7 commits May 13, 2026 11:40

optim: split factory registry and strat logics for factory api

86ec12e

optim: readme adapted

de930b0

Merge branch 'master' of github.com:art-test-stack/gpt-lab into optim2

f504375

Merge branch 'master' of github.com:art-test-stack/gpt-lab into optim2

b623721

Merge branch 'master' of github.com:art-test-stack/gpt-lab into optim2

b0ca46a

Merge branch 'master' of github.com:art-test-stack/gpt-lab into optim2

14d9cf8

Merge branch 'master' of github.com:art-test-stack/gpt-lab into optim2

db23533

art-test-stack requested a review from Copilot May 27, 2026 15:17

Copilot started reviewing on behalf of art-test-stack May 27, 2026 15:18

Copilot AI reviewed May 27, 2026

art-test-stack wants to merge 7 commits into master from optim2


2 participants

