QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models Paper • 2512.19526 • Published 14 days ago • 11
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry Paper • 2512.18314 • Published 16 days ago • 8
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published 17 days ago • 24
MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments Paper • 2512.19432 • Published 14 days ago • 11
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 18 days ago • 109
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 12 days ago • 52