intel / auto-round
SOTA Weight-only Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Home Page: https://arxiv.org/abs/2309.05516
License: Apache License 2.0