Dataset
allenai/olmOCR-mix-1025 • Updated Oct 21, 2025 • 270k • 985 • 29
ahmed-masry/ChartQA • Updated Jun 22, 2024 • 32.7k • 1.34k • 30
ahmed-masry/ChartQAPro • Updated Apr 19, 2025 • 1.95k • 713 • 16
lmms-lab/DocVQA • Updated Apr 18, 2024 • 16.6k • 28.9k • 75
Text-to-images
Training-Free Consistent Text-to-Image Generation Paper • 2402.03286 • Published Feb 5, 2024 • 67
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation Paper • 2402.04324 • Published Feb 6, 2024 • 26
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space Paper • 2402.05195 • Published Feb 7, 2024 • 19
FiT: Flexible Vision Transformer for Diffusion Model Paper • 2402.12376 • Published Feb 19, 2024 • 48