Created on February 17, 2024
2024
Check out our new preprint, Aligning Large Language Models by On-Policy Self-Judgment.