high-git-star
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation
Paper
• 2504.12626
• Published
• 51
Paper
• 2505.09388
• Published
• 337
Qwen-Image Technical Report
Paper
• 2508.02324
• Published
• 272
Paper
• 2508.10104
• Published
• 298
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper
• 2508.18265
• Published
• 214
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
• 2508.05748
• Published
• 141
VibeVoice Technical Report
Paper
• 2508.19205
• Published
• 143
Mobile-Agent-v3: Foundamental Agents for GUI Automation
Paper
• 2508.15144
• Published
• 65
Prompt Orchestration Markup Language
Paper
• 2508.13948
• Published
• 48
WebSailor: Navigating Super-human Reasoning for Web Agent
Paper
• 2507.02592
• Published
• 124
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper
• 2507.04009
• Published
• 54
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper
• 2506.07900
• Published
• 95
OmniGen2: Exploration to Advanced Multimodal Generation
Paper
• 2506.18871
• Published
• 78
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Paper
• 2506.07491
• Published
• 50
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Paper
• 2504.10479
• Published
• 306
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Paper
• 2504.17192
• Published
• 123
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought
Paper
• 2504.05599
• Published
• 85
Qwen2.5-Omni Technical Report
Paper
• 2503.20215
• Published
• 170
YuE: Scaling Open Foundation Models for Long-Form Music Generation
Paper
• 2503.08638
• Published
• 72
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Paper
• 2503.10291
• Published
• 36
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper
• 2503.09516
• Published
• 38
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Paper
• 2503.18892
• Published
• 31
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper
• 2501.03262
• Published
• 104
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Paper
• 2501.12326
• Published
• 64