sharkwyf / stable-alignment
This project is forked from agi-templar/stable-alignment
An efficient, effective, and stable alternative to RLHF. Code for the paper "Training Socially Aligned Language Models in Simulated Human Society".
License: Other