Making alignment via RLHF more scalable by automating human feedback...
Small typo
"harmless without comprising performance" to "harmless without compromising performance"
Great article, thanks!