knightnemo

https://knightnemo.github.io

AI & ML interests

World Models, World Action Models, VLA Models, Test-time Adaptation & Self-Improvement, Dexterous Manipulation.

Recent Activity

upvoted a paper 1 day ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

upvoted a paper 1 day ago

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

updated a collection 2 days ago

Nano-World-Model

upvoted 2 papers 1 day ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published 5 days ago • 65

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

Paper • 2604.26694 • Published 3 days ago • 6

upvoted a paper 18 days ago

ELT: Elastic Looped Transformers for Visual Generation

Paper • 2604.09168 • Published 22 days ago • 20

upvoted 2 papers about 2 months ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 153

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91

upvoted 3 papers 3 months ago

LoL: Longer than Longer, Scaling Video Generation to Hour

Paper • 2601.16914 • Published Jan 23 • 22

Advancing Open-source World Models

Paper • 2601.20540 • Published Jan 28 • 135

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Paper • 2601.19834 • Published Jan 27 • 25

upvoted 11 papers 5 months ago

Generative Neural Video Compression via Video Diffusion Prior

Paper • 2512.05016 • Published Dec 4, 2025 • 10

What about gravity in video generation? Post-Training Newton's Laws with Verifiable Rewards

Paper • 2512.00425 • Published Nov 29, 2025 • 53

First Frame Is the Place to Go for Video Content Customization

Paper • 2511.15700 • Published Nov 19, 2025 • 54

WorldGen: From Text to Traversable and Interactive 3D Worlds

Paper • 2511.16825 • Published Nov 20, 2025 • 24

RynnVLA-002: A Unified Vision-Language-Action and World Model

Paper • 2511.17502 • Published Nov 21, 2025 • 28

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published Nov 20, 2025 • 135

HunyuanVideo 1.5 Technical Report

Paper • 2511.18870 • Published Nov 24, 2025 • 29

GigaWorld-0: World Models as Data Engine to Empower Embodied AI

Paper • 2511.19861 • Published Nov 25, 2025 • 30

Terminal Velocity Matching

Paper • 2511.19797 • Published Nov 24, 2025 • 12

Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation

Paper • 2511.20714 • Published Nov 25, 2025 • 50

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published Nov 17, 2025 • 134

upvoted a paper 6 months ago

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published Nov 13, 2025 • 101