UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs Paper • 2512.03383 • Published Dec 2025 • 3
Sasha: Creative Goal-Oriented Reasoning in Smart Homes with Large Language Models Paper • 2305.09802 • Published May 16, 2023
Flavors of Moonshine: Tiny Specialized ASR Models for Edge Devices Paper • 2509.02523 • Published Sep 2, 2025 • 7
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models Paper • 2505.13444 • Published May 19, 2025 • 17
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models Paper • 2503.22879 • Published Mar 28, 2025 • 9
Quamba: A Post-Training Quantization Recipe for Selective State Space Models Paper • 2410.13229 • Published Oct 17, 2024 • 1
Efficient Low-rank Backpropagation for Vision Transformer Adaptation Paper • 2309.15275 • Published Sep 26, 2023 • 1
MobileTL: On-device Transfer Learning with Inverted Residual Blocks Paper • 2212.03246 • Published Dec 5, 2022 • 1
Moonshine: Speech Recognition for Live Transcription and Voice Commands Paper • 2410.15608 • Published Oct 21, 2024
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics Paper • 2102.01672 • Published Feb 2, 2021
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation Paper • 2112.02721 • Published Dec 6, 2021
X-PARADE: Cross-Lingual Textual Entailment and Information Divergence across Paragraphs Paper • 2309.08873 • Published Sep 16, 2023
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18, 2024 • 39
What If We Recaption Billions of Web Images with LLaMA-3? Paper • 2406.08478 • Published Jun 12, 2024 • 41
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents Paper • 2404.10774 • Published Apr 16, 2024 • 6