Collections

Collections including paper arxiv:2512.13687

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 201 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Training and tuning

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Paper • 2512.12967 • Published 23 days ago • 103
Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published 23 days ago • 99

Towards Scalable Pre-training of Visual Tokenizers for Generation

MiniMaxAI/VTP-Small-f16d64

Image Feature Extraction • 0.2B • Updated 22 days ago • 16.7k • 11
MiniMaxAI/VTP-Base-f16d64

Image Feature Extraction • 0.3B • Updated 22 days ago • 15.6k • 18
MiniMaxAI/VTP-Large-f16d64

Image Feature Extraction • 0.7B • Updated 22 days ago • 18.7k • 13
Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published 23 days ago • 99

Test-Time Scaling with Reflective Generative Model

Paper • 2507.01951 • Published Jul 2, 2025 • 107
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7, 2025 • 151
Autoregressive Diffusion Models

Paper • 2110.02037 • Published Oct 5, 2021
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13, 2025 • 8

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

Paper • 2404.15653 • Published Apr 24, 2024 • 29
MoDE: CLIP Data Experts via Clustering

Paper • 2404.16030 • Published Apr 24, 2024 • 15
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

Paper • 2405.12130 • Published May 20, 2024 • 50
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention

Paper • 2405.12981 • Published May 21, 2024 • 33

Interesting articles

General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 161
Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 128
SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 125
Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published Nov 17, 2025 • 67

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published 23 days ago • 99
MMGR: Multi-Modal Generative Reasoning

Paper • 2512.14691 • Published 22 days ago • 114
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss

Paper • 2512.23447 • Published 9 days ago • 93
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published 9 days ago • 64

Tokenization methods and language modelling

Continuous Autoregressive Language Models

Paper • 2510.27688 • Published Oct 31, 2025 • 70
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space

Paper • 2505.13181 • Published May 19, 2025 • 9
Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published Mar 25, 2025 • 73
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

Paper • 2503.16430 • Published Mar 20, 2025 • 34

yandex/stable-diffusion-3.5-medium-alchemist

Text-to-Image • Updated May 16, 2025 • 3 • 6
Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29, 2025 • 61
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

Paper • 2507.01953 • Published Jul 2, 2025 • 18
LongAnimation: Long Animation Generation with Dynamic Global-Local Memory

Paper • 2507.01945 • Published Jul 2, 2025 • 76
