Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2502.06703

LLM structure optimization

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 148
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published Feb 9 • 40

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 119
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 113
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 142

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 115
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17 • 53
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16 • 33
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17 • 22

LLM Agents (Prompting)

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 102
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152

CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference

Paper • 2502.04416 • Published Feb 6 • 12
Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published Feb 3 • 68
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Paper • 2507.21033 • Published Jul 28 • 20

Scaling Literature

Collection of Scaling Law Papers

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 63
Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 11
Scaling Laws for Precision

Paper • 2411.04330 • Published Nov 7, 2024 • 8
Transcending Scaling Laws with 0.1% Extra Compute

Paper • 2210.11399 • Published Oct 20, 2022

Papers Storm 🌪️

A curated collection of research papers referenced in Panoram'IA program, offering a comprehensive resource for further exploration.

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 97
Generative World Explorer

Paper • 2411.11844 • Published Nov 18, 2024 • 77
Video Depth without Video Models

Paper • 2411.19189 • Published Nov 28, 2024 • 39
Mobile Video Diffusion

Paper • 2412.07583 • Published Dec 10, 2024 • 20

interesting papers

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 123
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published Feb 13 • 191
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152

Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU

Paper • 2403.06504 • Published Mar 11, 2024 • 56
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published Feb 19 • 69

LLM structure optimization

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13 • 148
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
The Curse of Depth in Large Language Models

Paper • 2502.05795 • Published Feb 9 • 40

Scaling Literature

Collection of Scaling Law Papers

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 63
Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 11
Scaling Laws for Precision

Paper • 2411.04330 • Published Nov 7, 2024 • 8
Transcending Scaling Laws with 0.1% Extra Compute

Paper • 2210.11399 • Published Oct 20, 2022

To Read collection

interesting papers to read

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published Mar 31 • 62
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 119
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 113
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 142

Papers Storm 🌪️

A curated collection of research papers referenced in Panoram'IA program, offering a comprehensive resource for further exploration.

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 97
Generative World Explorer

Paper • 2411.11844 • Published Nov 18, 2024 • 77
Video Depth without Video Models

Paper • 2411.19189 • Published Nov 28, 2024 • 39
Mobile Video Diffusion

Paper • 2412.07583 • Published Dec 10, 2024 • 20

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17 • 115
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17 • 53
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16 • 33
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17 • 22

interesting papers

Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14 • 123
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published Feb 13 • 191
BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152

LLM Agents (Prompting)

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published Jan 9 • 102
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152

CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference

Paper • 2502.04416 • Published Feb 6 • 12
Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published Feb 3 • 68
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset

Paper • 2507.21033 • Published Jul 28 • 20

Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU

Paper • 2403.06504 • Published Mar 11, 2024 • 56
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 152
Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published Feb 19 • 69

Previous
1
2
3
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs