Tendril
Knowledge check · 15 questions
Test your understanding of the differences between RLHF and DPO alignment techniques, including trade-offs in stability, cost, and data requirements.
RLHF vs DPO: aligning models without breaking them — Quick Check