Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs • Paper • 2503.06342 • Published Mar 8
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs • Paper • 2410.13276 • Published Oct 17, 2024
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge • Paper • 2407.00088 • Published Jun 25, 2024