zkli-hub / mixture-of-depths
This project is forked from sramshetty/mixture-of-depths
An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
License: Other