-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 201 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
Collections
Collections including paper arxiv:2512.13687
-
MiniMaxAI/VTP-Small-f16d64
Image Feature Extraction • 0.2B • Updated • 16.7k • 11 -
MiniMaxAI/VTP-Base-f16d64
Image Feature Extraction • 0.3B • Updated • 15.6k • 18 -
MiniMaxAI/VTP-Large-f16d64
Image Feature Extraction • 0.7B • Updated • 18.7k • 13 -
Towards Scalable Pre-training of Visual Tokenizers for Generation
Paper • 2512.13687 • Published • 99
-
Test-Time Scaling with Reflective Generative Model
Paper • 2507.01951 • Published • 107 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 151 -
Autoregressive Diffusion Models
Paper • 2110.02037 • Published -
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper • 2502.09509 • Published • 8
-
CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Paper • 2404.15653 • Published • 29 -
MoDE: CLIP Data Experts via Clustering
Paper • 2404.16030 • Published • 15 -
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Paper • 2405.12130 • Published • 50 -
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention
Paper • 2405.12981 • Published • 33
-
General Agentic Memory Via Deep Research
Paper • 2511.18423 • Published • 161 -
Diffusion Language Models are Super Data Learners
Paper • 2511.03276 • Published • 128 -
SAM 3: Segment Anything with Concepts
Paper • 2511.16719 • Published • 125 -
Back to Basics: Let Denoising Generative Models Denoise
Paper • 2511.13720 • Published • 67
-
Towards Scalable Pre-training of Visual Tokenizers for Generation
Paper • 2512.13687 • Published • 99 -
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 114 -
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Paper • 2512.23447 • Published • 93 -
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
Paper • 2512.23576 • Published • 64
-
Continuous Autoregressive Language Models
Paper • 2510.27688 • Published • 70 -
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
Paper • 2505.13181 • Published • 9 -
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Paper • 2503.19325 • Published • 73 -
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Paper • 2503.16430 • Published • 34
-
yandex/stable-diffusion-3.5-medium-alchemist
Text-to-Image • Updated • 3 • 6 -
Ovis-U1 Technical Report
Paper • 2506.23044 • Published • 61 -
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Paper • 2507.01953 • Published • 18 -
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
Paper • 2507.01945 • Published • 76