ROOT: Robust Orthogonalized Optimizer for Neural Network Training Paper • 2511.20626 • Published Nov 25, 2025 • 43
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 89
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published Dec 2, 2025 • 253
PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation Paper • 2411.00163 • Published Oct 31, 2024
Breaking the Top-$K$ Barrier: Advancing Top-$K$ Ranking Metrics Optimization in Recommender Systems Paper • 2508.05673 • Published Aug 4, 2025
view article Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 283
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 704
Llama 3.3 Collection This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 193
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 649