Running • 119 • The ultimate guide to RL environments: building and scaling them in the LLM era • Building and scaling RL environments for LLM training
meta-llama/Meta-Llama-3-8B-Instruct Text Generation • 8B • Updated Jun 18, 2025 • 1.7M • 4.52k
microsoft/tapex-large-finetuned-wtq Table Question Answering • 0.4B • Updated Jan 12, 2024 • 935 • 78