Video-R4: Reinforcing Text-Rich Video Reasoning with Visual Rumination Paper • 2511.17490 • Published 20 days ago • 21
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models Paper • 2510.05034 • Published Oct 6 • 48
Illustrating Reinforcement Learning from Human Feedback (RLHF) Article • Dec 9, 2022 • 377
MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness Paper • 2505.20426 • Published May 26 • 7
IDEA-Research/grounding-dino-tiny Zero-Shot Object Detection • 0.2B • Updated May 12, 2024 • 995k • 88
RedHatAI/Llama-3.2-11B-Vision-Instruct-FP8-dynamic Text Generation • 11B • Updated Oct 2, 2024 • 2.49k • 24
Unofficial SDXL Turbo Img2Img Txt2Img 💬 Running on A10G • Featured • 586 • Generate images from text or modify existing images