Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published about 1 month ago • 208
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization Paper • 2510.13554 • Published Oct 15 • 57
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22 • 27
A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models Paper • 2508.01548 • Published Aug 3 • 13
GlimpsePrune Collection • A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models. https://github.com/HVision-NKU/GlimpsePrune • 6 items • Updated Aug 5 • 1
Gaussian Splatting with Discretized SDF for Relightable Assets Paper • 2507.15629 • Published Jul 21 • 23
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs Paper • 2506.21862 • Published Jun 27 • 36