Pull requests: radixark/miles
Pull requests list
chore: make megatron e2e CaseConfig topology explicit instead of inferred
run-ci-megatron
#1513
opened Jun 29, 2026 by
guapisolo
Collaborator
Megatron e2e: weight-check skip-list, Qwen3.5 MTP cases
run-ci-qwen35
Run qwen3.5 e2e CI
#1512
opened Jun 29, 2026 by
guapisolo
Collaborator
feat(ci): metric-history gate foundation — storage contract + collection backend (M0+M1)
#1511
opened Jun 29, 2026 by
guapisolo
Collaborator
refactor: extract session core with direct HTTP responses
#1510
opened Jun 29, 2026 by
guapisolo
Collaborator
fsdp: keep fp32 master for nemotron_h (mixed-dtype checkpoint)
#1502
opened Jun 29, 2026 by
Zhichenzzz
Contributor
fsdp: clear stale GDN packing boundaries on non-packed forwards
#1501
opened Jun 29, 2026 by
Zhichenzzz
Contributor
fsdp: force flash attention for attention-sink models (gpt-oss)
#1500
opened Jun 28, 2026 by
Zhichenzzz
Contributor
fix(update_weight): skip flush_cache for retract pause mode
#1497
opened Jun 27, 2026 by
Shi-Dong
Contributor
1 task
docs: fix cli-reference defaults and advantage-estimator choices
#1490
opened Jun 26, 2026 by
Shi-Dong
Contributor
1 task
rocm: disable gradient_accumulation_fusion on gfx950 across e2e tests
#1489
opened Jun 26, 2026 by
sreerohi
[OPD] Add Qwen3.5-35B-A3B single-node self-distillation example
#1488
opened Jun 26, 2026 by
maocheng23
Contributor
fsdp(scripts): RL scripts for representative dense + MoE models
#1486
opened Jun 26, 2026 by
Zhichenzzz
Contributor
Draft
fix(megatron): propagate recompute config to bridge model so checkpointing engages
#1482
opened Jun 25, 2026 by
guapisolo
Collaborator