Summary
Create a new training-methods skill that documents all 8 stable training methods with their data requirements, use cases, and recommended training order.
Context
ReAlign supports 8 stable training methods, but there is no centralized skill to help users choose the right method for their goal.
What Does NOT Work
- No centralized skill documenting all training methods
- Users don't know which method to use for their goal
- Data format requirements scattered across code
- Training order not documented
Implementation Approach
Create .claude/skills/training-methods.md documenting:
8 Stable Training Methods
| Method | Purpose | Data Format | Use Case |
|---|---|---|---|
| SFT/LoRA | Base capabilities | instruction + output | Knowledge injection |
| DPO | Preference alignment | chosen + rejected | Behavior change |
| ORPO | Preference (simple) | chosen + rejected | No reference model needed |
| GRPO | Verifiable rewards | prompt + scored responses | Math/code verification |
| CPO | Conservative preference | chosen + rejected | Reduce distribution shift |
| RLVR | Verified rewards | problem + solution + verify | Complex reasoning |
| Abliteration | Remove constraints | harmful + harmless | Uncensoring |
| Activation Steering | Runtime control | inference-time | Reversible, composable |
Data Formats
# SFT
{"instruction": str, "output": str}
# DPO/ORPO/CPO
{"prompt": str, "chosen": str, "rejected": str}
# GRPO
{"prompt": str, "responses": [{"text": str, "score": float}]}
# RLVR
{"problem": str, "solution": str, "responses": [{"text": str, "correct": bool}]}
Recommended Training Order
1. SFT (base) → Instruction following
2. DPO (alignment) → Behavior change
3. GRPO/RLVR (verification) → Reasoning improvement
4. Calibration → Uncertainty handling
5. Anti-hallucination → Reduce confabulation
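The order above could be encoded so that a planned set of stages is sorted into the recommended sequence. This is only a sketch: `STAGE_RANK` and `order_stages` are hypothetical names, the lowercase stage identifiers are an assumption, and GRPO and RLVR share a rank because the list places them in the same step.

```python
# Rank of each stage in the recommended order; GRPO and RLVR share step 3.
STAGE_RANK = {
    "sft": 1,
    "dpo": 2,
    "grpo": 3,
    "rlvr": 3,
    "calibration": 4,
    "anti-hallucination": 5,
}


def order_stages(stages: list[str]) -> list[str]:
    """Sort the requested stages into the recommended training order."""
    unknown = [s for s in stages if s.lower() not in STAGE_RANK]
    if unknown:
        raise ValueError(f"no recommended position for: {unknown}")
    # sorted() is stable, so stages with the same rank keep their input order.
    return sorted(stages, key=lambda s: STAGE_RANK[s.lower()])


if __name__ == "__main__":
    print(order_stages(["GRPO", "SFT", "DPO"]))
```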
Backend Support Matrix
| Method | MLX | PyTorch | Cloud |
|---|---|---|---|
| LoRA | ✅ | ✅ | ✅ |
| DPO | ✅ | ✅ | ✅ |
| ORPO | ✅ | ✅ | ✅ |
| GRPO | ✅ | ✅ | ✅ |
| CPO | ✅ | ✅ | ✅ |
| RLVR | ✅ | ❌ | ❌ |
| Abliteration | ✅ | ✅ | ❌ |
| Steering | ✅ | ❌ | ❌ |
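One way the skill might let users query this matrix is a small lookup table. The sketch below transcribes the matrix as given; `BACKEND_SUPPORT`, `supported_methods`, and `is_supported` are hypothetical names, not ReAlign APIs.

```python
# Backend support transcribed from the matrix above (method -> backends).
BACKEND_SUPPORT: dict[str, set[str]] = {
    "LoRA": {"MLX", "PyTorch", "Cloud"},
    "DPO": {"MLX", "PyTorch", "Cloud"},
    "ORPO": {"MLX", "PyTorch", "Cloud"},
    "GRPO": {"MLX", "PyTorch", "Cloud"},
    "CPO": {"MLX", "PyTorch", "Cloud"},
    "RLVR": {"MLX"},
    "Abliteration": {"MLX", "PyTorch"},
    "Steering": {"MLX"},
}


def is_supported(method: str, backend: str) -> bool:
    """Return True if the matrix marks this method as available on the backend."""
    return backend in BACKEND_SUPPORT.get(method, set())


def supported_methods(backend: str) -> list[str]:
    """List the methods available on a backend, sorted alphabetically."""
    return sorted(m for m, backends in BACKEND_SUPPORT.items() if backend in backends)


if __name__ == "__main__":
    print(supported_methods("Cloud"))
    print(is_supported("RLVR", "PyTorch"))
```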
Acceptance Criteria
Related