-
Can Large Language Models Understand Context?
Paper • 2402.00858 • Published • 23 -
OLMo: Accelerating the Science of Language Models
Paper • 2402.00838 • Published • 85 -
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 151 -
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
Paper • 2401.17072 • Published • 25
Collections
Collections including paper arxiv:2508.06471
-
Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
Paper • 2505.10597 • Published -
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
Paper • 2504.05535 • Published • 44 -
nvidia/HelpSteer3
Viewer • Updated • 133k • 2.33k • 88 -
nvidia/Nemotron-RL-instruction_following
Preview • Updated • 339 • 4
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 192 -
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
Paper • 2501.11425 • Published • 109 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 95 -
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving
Paper • 2507.06229 • Published • 75
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 192 -
zai-org/GLM-4.5
Text Generation • 358B • Updated • 22.6k • 1.39k -
zai-org/GLM-4.5-FP8
Text Generation • 358B • Updated • 2.98k • 75 -
GLM 4.5 Demo (API)
🏃 106 • Chat with GLM-4.5 to get answers and reasoning
-
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 494 -
zai-org/GLM-4.6
Text Generation • 357B • Updated • 331k • 1.13k -
deepseek-ai/DeepSeek-R1
Text Generation • 685B • Updated • 1.21M • 12.9k -
deepseek-ai/DeepSeek-V3.2-Exp
Text Generation • 685B • Updated • 65.6k • 900
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 277 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 240 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 259
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 192 -
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Paper • 2508.14444 • Published • 38 -
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper • 2507.06261 • Published • 64 -
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention
Paper • 2506.13585 • Published • 272