CAMV - a Carlosvirella100 Collection

updated Jun 28, 2025

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10, 2025 • 43
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14, 2025 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14, 2025 • 85
Efficient Generative Model Training via Embedded Representation Warmup

Paper • 2504.10188 • Published Apr 14, 2025 • 12
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published Apr 15, 2025 • 19
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors

Paper • 2504.11427 • Published Apr 15, 2025 • 19
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15, 2025 • 63
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Paper • 2504.09454 • Published Apr 13, 2025 • 11
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Paper • 2504.08672 • Published Apr 11, 2025 • 55
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Paper • 2504.13169 • Published Apr 17, 2025 • 39
DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging

Paper • 2504.12364 • Published Apr 16, 2025 • 22
Iterative Self-Training for Code Generation via Reinforced Re-Ranking

Paper • 2504.09643 • Published Apr 13, 2025 • 34
DataDecide: How to Predict Best Pretraining Data with Small Experiments

Paper • 2504.11393 • Published Apr 15, 2025 • 18
Syzygy of Thoughts: Improving LLM CoT with the Minimal Free Resolution

Paper • 2504.09566 • Published Apr 13, 2025 • 11
AlayaDB: The Data Foundation for Efficient and Effective Long-context LLM Inference

Paper • 2504.10326 • Published Apr 14, 2025 • 25
InstantCharacter: Personalize Any Characters with a Scalable Diffusion Transformer Framework

Paper • 2504.12395 • Published Apr 16, 2025 • 16
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation

Paper • 2504.13055 • Published Apr 17, 2025 • 19
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14, 2025 • 15
InteractVLM: 3D Interaction Reasoning from 2D Foundational Models

Paper • 2504.05303 • Published Apr 7, 2025 • 5
IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments

Paper • 2504.06827 • Published Apr 9, 2025
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation

Paper • 2504.14899 • Published Apr 21, 2025 • 20
Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

Paper • 2504.13816 • Published Apr 18, 2025 • 18
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15, 2025 • 35
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space

Paper • 2504.13835 • Published Apr 18, 2025 • 38
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs

Paper • 2504.14655 • Published Apr 20, 2025 • 21
OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21, 2025 • 35
FlowReasoner: Reinforcing Query-Level Meta-Agents

Paper • 2504.15257 • Published Apr 21, 2025 • 47
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

Paper • 2504.15521 • Published Apr 22, 2025 • 64
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

Paper • 2504.16078 • Published Apr 22, 2025 • 21
Vidi: Large Multimodal Models for Video Understanding and Editing

Paper • 2504.15681 • Published Apr 22, 2025 • 14
From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

Paper • 2504.16080 • Published Apr 22, 2025 • 15
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model

Paper • 2504.15843 • Published Apr 22, 2025 • 16
I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23, 2025 • 30
Tina: Tiny Reasoning Models via LoRA

Paper • 2504.15777 • Published Apr 22, 2025 • 56
VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models

Paper • 2504.15279 • Published Apr 21, 2025 • 78
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation

Paper • 2504.17207 • Published Apr 24, 2025 • 30
RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

Paper • 2504.17502 • Published Apr 24, 2025 • 55
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Paper • 2504.17789 • Published Apr 24, 2025 • 23
The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs

Paper • 2504.17768 • Published Apr 24, 2025 • 14
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs

Paper • 2504.18415 • Published Apr 25, 2025 • 49
DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models

Paper • 2504.15716 • Published Apr 22, 2025 • 12
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark

Paper • 2504.16427 • Published Apr 23, 2025 • 18
RepText: Rendering Visual Text via Replicating

Paper • 2504.19724 • Published Apr 28, 2025 • 31
YoChameleon: Personalized Vision and Language Generation

Paper • 2504.20998 • Published Apr 29, 2025 • 12
UniversalRAG: Retrieval-Augmented Generation over Multiple Corpora with Diverse Modalities and Granularities

Paper • 2504.20734 • Published Apr 29, 2025 • 62
WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Paper • 2504.21776 • Published Apr 30, 2025 • 59
Sadeed: Advancing Arabic Diacritization Through Small Language Model

Paper • 2504.21635 • Published Apr 30, 2025 • 59
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published Apr 30, 2025 • 49
RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

Paper • 2504.18904 • Published Apr 26, 2025 • 9
DeepCritic: Deliberate Critique with Large Language Models

Paper • 2505.00662 • Published May 1, 2025 • 54
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Paper • 2505.00703 • Published May 1, 2025 • 44
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Paper • 2505.00234 • Published May 1, 2025 • 26
ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction

Paper • 2504.21855 • Published Apr 30, 2025 • 13
Improving Editability in Image Generation with Layer-wise Memory

Paper • 2505.01079 • Published May 2, 2025 • 29
Beyond One-Size-Fits-All: Inversion Learning for Highly Effective NLG Evaluation Prompts

Paper • 2504.21117 • Published Apr 29, 2025 • 26
Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction

Paper • 2505.02471 • Published May 5, 2025 • 15
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

Paper • 2505.01441 • Published Apr 28, 2025 • 39
LLaMA-Omni2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis

Paper • 2505.02625 • Published May 5, 2025 • 23
A Survey on Inference Engines for Large Language Models: Perspectives on Optimization and Efficiency

Paper • 2505.01658 • Published May 3, 2025 • 39
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 189
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference

Paper • 2505.02922 • Published May 5, 2025 • 28
Benchmarking LLMs' Swarm intelligence

Paper • 2505.04364 • Published May 7, 2025 • 20
StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

Paper • 2505.05467 • Published May 8, 2025 • 13
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training

Paper • 2505.00358 • Published May 1, 2025 • 26
OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents

Paper • 2505.03570 • Published May 6, 2025 • 8
LLM-Independent Adaptive RAG: Let the Question Speak for Itself

Paper • 2505.04253 • Published May 7, 2025 • 14
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models

Paper • 2505.02847 • Published May 1, 2025 • 29
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains

Paper • 2505.03981 • Published May 6, 2025 • 15
Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers

Paper • 2505.04842 • Published May 7, 2025 • 12
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Paper • 2505.04601 • Published May 7, 2025 • 29
HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation

Paper • 2504.21650 • Published Apr 30, 2025 • 16
PrimeIntellect/INTELLECT-2

33B • Updated May 13, 2025 • 29 • 205
Unified Continuous Generative Models

Paper • 2505.07447 • Published May 12, 2025 • 42
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder

Paper • 2505.07916 • Published May 12, 2025 • 134
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis

Paper • 2505.09358 • Published May 14, 2025 • 27
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning

Paper • 2505.10557 • Published May 15, 2025 • 47
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15, 2025 • 24
OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

Paper • 2505.08617 • Published May 13, 2025 • 42
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15, 2025 • 120
Parallel Scaling Law for Language Models

Paper • 2505.10475 • Published May 15, 2025 • 83
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Paper • 2505.10046 • Published May 15, 2025 • 9
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Paper • 2505.10610 • Published May 15, 2025 • 55
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization

Paper • 2505.07675 • Published May 12, 2025 • 21
AdaptThink: Reasoning Models Can Learn When to Think

Paper • 2505.13417 • Published May 19, 2025 • 83
Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published May 17, 2025 • 121
Thinkless: LLM Learns When to Think

Paper • 2505.13379 • Published May 19, 2025 • 50
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision

Paper • 2505.13427 • Published May 19, 2025 • 26
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

Paper • 2505.12504 • Published May 18, 2025 • 24
Improving Assembly Code Performance with Large Language Models via Reinforcement Learning

Paper • 2505.11480 • Published May 16, 2025 • 8
Visual Agentic Reinforcement Fine-Tuning

Paper • 2505.14246 • Published May 20, 2025 • 32
Reward Reasoning Model

Paper • 2505.14674 • Published May 20, 2025 • 37
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 98
Emerging Properties in Unified Multimodal Pretraining

Paper • 2505.14683 • Published May 20, 2025 • 133
Vid2World: Crafting Video Diffusion Models to Interactive World Models

Paper • 2505.14357 • Published May 20, 2025 • 27
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19, 2025 • 45
Think Only When You Need with Large Hybrid-Reasoning Models

Paper • 2505.14631 • Published May 20, 2025 • 20
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 55
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!

Paper • 2505.15656 • Published May 21, 2025 • 15
RLVR-World: Training World Models with Reinforcement Learning

Paper • 2505.13934 • Published May 20, 2025 • 16
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20, 2025 • 62
Dimple: Discrete Diffusion Multimodal Large Language Model with Parallel Decoding

Paper • 2505.16990 • Published May 22, 2025 • 22
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification

Paper • 2505.16938 • Published May 22, 2025 • 121
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design

Paper • 2505.16175 • Published May 22, 2025 • 42
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models

Paper • 2505.17015 • Published May 22, 2025 • 9
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

Paper • 2505.15966 • Published May 21, 2025 • 53
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88
Reasoning Model is Stubborn: Diagnosing Instruction Overriding in Reasoning Models

Paper • 2505.17225 • Published May 22, 2025 • 64
Shifting AI Efficiency From Model-Centric to Data-Centric Compression

Paper • 2505.19147 • Published May 25, 2025 • 145
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Paper • 2505.19914 • Published May 26, 2025 • 46
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Paper • 2505.20256 • Published May 26, 2025 • 19
Synthetic Data RL: Task Definition Is All You Need

Paper • 2505.17063 • Published May 18, 2025 • 11
Interleaved Reasoning for Large Language Models via Reinforcement Learning

Paper • 2505.19640 • Published May 26, 2025 • 15
Alchemist: Turning Public Text-to-Image Data into Generative Gold

Paper • 2505.19297 • Published May 25, 2025 • 84
s3: You Don't Need That Much Data to Train a Search Agent via RL

Paper • 2505.14146 • Published May 20, 2025 • 19
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow

Paper • 2505.17399 • Published May 23, 2025 • 14
MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems

Paper • 2505.18943 • Published May 25, 2025 • 25
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms

Paper • 2505.20322 • Published May 23, 2025 • 14
DetailFlow: 1D Coarse-to-Fine Autoregressive Image Generation via Next-Detail Prediction

Paper • 2505.21473 • Published May 27, 2025 • 16
MME-VideoOCR: Evaluating OCR-Based Capabilities of Multimodal LLMs in Video Scenarios

Paper • 2505.21333 • Published May 27, 2025 • 38
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning

Paper • 2505.17813 • Published May 23, 2025 • 58
GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

Paper • 2505.20355 • Published May 26, 2025 • 36
VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization

Paper • 2505.19000 • Published May 25, 2025 • 42
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28, 2025 • 46
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection

Paper • 2505.20289 • Published May 26, 2025 • 10
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26, 2025 • 104
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset

Paper • 2505.21297 • Published May 27, 2025 • 29
ZeroGUI: Automating Online GUI Learning at Zero Human Cost

Paper • 2505.23762 • Published May 29, 2025 • 45
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Paper • 2505.23380 • Published May 29, 2025 • 22
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Paper • 2505.23747 • Published May 29, 2025 • 69
Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Paper • 2505.23606 • Published May 29, 2025 • 14
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28, 2025 • 45
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 143
Taming LLMs by Scaling Learning Rates with Gradient Grouping

Paper • 2506.01049 • Published Jun 1, 2025 • 38
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models

Paper • 2505.21523 • Published May 23, 2025 • 13
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published Jun 2, 2025 • 52
Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning

Paper • 2506.03136 • Published Jun 3, 2025 • 25
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Paper • 2506.00123 • Published May 30, 2025 • 35
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks

Paper • 2506.00411 • Published May 31, 2025 • 31
DINGO: Constrained Inference for Diffusion LLMs

Paper • 2505.23061 • Published May 29, 2025 • 31
Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Paper • 2506.01413 • Published Jun 2, 2025 • 16
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Paper • 2506.02397 • Published Jun 3, 2025 • 36
ComposeAnything: Composite Object Priors for Text-to-Image Generation

Paper • 2505.24086 • Published May 30, 2025 • 5
Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers

Paper • 2506.03065 • Published Jun 3, 2025 • 27
From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Paper • 2505.23059 • Published May 29, 2025 • 13
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers

Paper • 2505.21541 • Published May 24, 2025 • 7
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs

Paper • 2505.24120 • Published May 30, 2025 • 49
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

Paper • 2505.23656 • Published May 29, 2025 • 25
Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design

Paper • 2506.04734 • Published Jun 5, 2025 • 21
Image Editing As Programs with Diffusion Models

Paper • 2506.04158 • Published Jun 4, 2025 • 24
Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published Jun 5, 2025 • 18
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5, 2025 • 27
Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework

Paper • 2506.02454 • Published Jun 3, 2025 • 7
FlexPainter: Flexible and Multi-View Consistent Texture Generation

Paper • 2506.02620 • Published Jun 3, 2025 • 14
FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion

Paper • 2506.01111 • Published Jun 1, 2025 • 31
Audio-Aware Large Language Models as Judges for Speaking Styles

Paper • 2506.05984 • Published Jun 6, 2025 • 15
Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data

Paper • 2506.04120 • Published Jun 4, 2025 • 7
ConfQA: Answer Only If You Are Confident

Paper • 2506.07309 • Published Jun 8, 2025 • 10
Through the Valley: Path to Effective Long CoT Training for Small Language Models

Paper • 2506.07712 • Published Jun 9, 2025 • 18
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

Paper • 2506.05573 • Published Jun 5, 2025 • 82
Vision Transformers Don't Need Trained Registers

Paper • 2506.08010 • Published Jun 9, 2025 • 22
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models

Paper • 2506.07177 • Published Jun 8, 2025 • 23
Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor

Paper • 2506.07932 • Published Jun 9, 2025 • 12
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation

Paper • 2506.09790 • Published Jun 11, 2025 • 53
SAFE: Multitask Failure Detection for Vision-Language-Action Models

Paper • 2506.09937 • Published Jun 11, 2025 • 9
Ming-Omni: A Unified Multimodal Model for Perception and Generation

Paper • 2506.09344 • Published Jun 11, 2025 • 31
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning

Paper • 2506.09513 • Published Jun 11, 2025 • 102
AutoMind: Adaptive Knowledgeable Agent for Automated Data Science

Paper • 2506.10974 • Published Jun 12, 2025 • 19
Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 66
Comment on The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity

Paper • 2506.09250 • Published Jun 10, 2025 • 27
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Paper • 2506.10540 • Published Jun 12, 2025 • 37
Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation

Paper • 2506.11924 • Published Jun 13, 2025 • 34
Marrying Autoregressive Transformer and Diffusion with Multi-Reference Autoregression

Paper • 2506.09482 • Published Jun 11, 2025 • 45
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs

Paper • 2506.14245 • Published Jun 17, 2025 • 45
AR-RAG: Autoregressive Retrieval Augmentation for Image Generation

Paper • 2506.06962 • Published Jun 8, 2025 • 28
Scaling Test-time Compute for LLM Agents

Paper • 2506.12928 • Published Jun 15, 2025 • 63
Reasoning with Exploration: An Entropy Perspective

Paper • 2506.14758 • Published Jun 17, 2025 • 31
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation

Paper • 2506.14028 • Published Jun 16, 2025 • 93
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

Paper • 2506.19290 • Published Jun 24, 2025 • 53
Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models

Paper • 2506.18945 • Published Jun 23, 2025 • 40
Learning to Skip the Middle Layers of Transformers

Paper • 2506.21103 • Published Jun 26, 2025 • 18
