Pull requests: huggingface/trl

fix(grpo-trainer): init self.args before use
#4801 opened Jan 9, 2026 by carlyou
1 of 5 tasks
Remove DbrxForCausalLM support
#4799 opened Jan 9, 2026 by qgallouedec
Update examples to new OpenEnv version
#4796 opened Jan 9, 2026 by sergiopaniego Draft
5 tasks
forward_masked_logits in SFTTrainer
#4794 opened Jan 8, 2026 by qgallouedec Draft
5 tasks
fix xpu vllm client server
#4780 opened Jan 7, 2026 by jiqing-feng
Set dtype default to float32
#4778 opened Jan 6, 2026 by albertvillanova
Add reward shaping to PPOTrainer
#4774 opened Jan 5, 2026 by derivative2002
5 tasks
make dpo compatible with qwen3vl
#4773 opened Jan 4, 2026 by flutist
Add a config to limit the number of tool calling iterations.
#4761 opened Dec 29, 2025 by pramodith
4 of 5 tasks
Extend CLI to orpo trainer
#4757 opened Dec 27, 2025 by murilo-cunha
3 of 5 tasks
fix: handle None eval_dataset in example code
#4756 opened Dec 27, 2025 by ciaoyizhen
1 of 4 tasks
perf: avoid output_hidden_states when only last_hidden_state is used
#4755 opened Dec 27, 2025 by ciaoyizhen
2 of 5 tasks
vllm parameter passthrough for stop sequences
#4754 opened Dec 26, 2025 by kdubovikov
Clarify Accelerate usage in SFTTrainer documentation
#4744 opened Dec 23, 2025 by Likhita-17
1 task done
fix minillm trainer
#4743 opened Dec 23, 2025 by t1101675
5 tasks
[GRPOTrainer]: Agent Training Supports Async Tool Calls
#4742 opened Dec 23, 2025 by pramodith
5 tasks done