Article Transformers v5: Simple model definitions powering the AI ecosystem +2 5 days ago • 223
Article 🚀 Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker! Jan 29 • 21
Article Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac Oct 29 • 27
Article How I Trained Action Chunking Transformer (ACT) on SO-101: My Journey, Gotchas, and Lessons Sep 30 • 42
Open X-Embodiment Collection Datasets from Open X-Embodiment (OXE) in LeRobot dataset format • 57 items • Updated Oct 2 • 7
RDT 2 Collection RDT 2, the sequel to RDT-1B, is the first foundation model that achieves zero-shot deployment on unseen embodiments for simple open-vocabulary tasks. • 4 items • Updated Sep 26 • 16
Article `LeRobotDataset:v3.0`: Bringing large-scale datasets to `lerobot` +9 Sep 16 • 47
Article Welcome PaliGemma 2 – New vision language models by Google +2 Dec 5, 2024 • 162
π_0: A Vision-Language-Action Flow Model for General Robot Control Paper • 2410.24164 • Published Oct 31, 2024 • 30
π_{0.5}: a Vision-Language-Action Model with Open-World Generalization Paper • 2504.16054 • Published Apr 22 • 3
EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control Paper • 2508.21112 • Published Aug 28 • 77
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 398