edisonwd / flash-llm

This project is forked from alibabaresearch/flash-llm.
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
License: Apache License 2.0