Collections
Discover the best community collections!
Collections including paper arxiv:2511.15705

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization
  Paper • 2511.15705 • Published • 91
- O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
  Paper • 2511.13593 • Published • 24
- OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists
  Paper • 2511.16931 • Published • 6
- General Agentic Memory Via Deep Research
  Paper • 2511.18423 • Published • 156

- UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
  Paper • 2506.23219 • Published • 7
- CityGPT: Empowering Urban Spatial Cognition of Large Language Models
  Paper • 2406.13948 • Published • 1
- CityBench: Evaluating the Capabilities of Large Language Model as World Model
  Paper • 2406.13945 • Published • 1
- A Survey of Large Language Model-Powered Spatial Intelligence Across Scales: Advances in Embodied Agents, Smart Cities, and Earth Science
  Paper • 2504.09848 • Published

- MASS: Motion-Aware Spatial-Temporal Grounding for Physics Reasoning and Comprehension in Vision-Language Models
  Paper • 2511.18373 • Published • 5
- Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
  Paper • 2511.13288 • Published • 17
- Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens
  Paper • 2511.19418 • Published • 26
- SAM 3: Segment Anything with Concepts
  Paper • 2511.16719 • Published • 108

- Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations
  Paper • 2508.09789 • Published • 5
- MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents
  Paper • 2508.13186 • Published • 18
- ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents
  Paper • 2508.04038 • Published • 1
- Prompt Orchestration Markup Language
  Paper • 2508.13948 • Published • 48

- InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
  Paper • 2502.11573 • Published • 9
- Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking
  Paper • 2502.02339 • Published • 22
- video-SALMONN-o1: Reasoning-enhanced Audio-visual Large Language Model
  Paper • 2502.11775 • Published • 9
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 39