Joao Pedro Silva Dias Moura Mesquita

inkasaras · joaopedrosdmm

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

zai-org/GLM-4.6

updated a collection 4 days ago

Difusion & Video

liked a model 4 days ago

alibaba-pai/Z-Image-Turbo-Fun-Controlnet-Union

upvoted 3 papers 5 days ago

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published 25 days ago • 110

REASONEDIT: Towards Reasoning-Enhanced Image Editing Models

Paper • 2511.22625 • Published 9 days ago • 45

VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation

Paper • 2511.17199 • Published 15 days ago • 7

upvoted a collection 22 days ago

Finance Commons

A large collection of multimodal financial documents in open data. • 7 items • Updated Jul 17, 2024 • 12

upvoted an article 22 days ago

Why Did MiniMax M2 End Up as a Full Attention Model?

Article • Oct 30 • 66

upvoted a paper 24 days ago

Robot Learning from a Physical World Model

Paper • 2511.07416 • Published 26 days ago • 29

upvoted a collection 25 days ago

BERT-Chat

BERTs that chat • 2 items • Updated 9 days ago • 11

upvoted an article about 1 month ago

The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix

Article • Nov 3 • 47

upvoted a collection about 2 months ago

Cadmonkey

OpenSCAD code generator • 34 items • Updated 30 days ago • 11

upvoted a paper about 2 months ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 114

upvoted a paper 2 months ago

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26 • 135

upvoted an article 2 months ago

Smol2Operator: Post-Training GUI Agents for Computer Use

Article • Sep 23 • 130

upvoted a collection 2 months ago

smol2operator Release

4 items • Updated Sep 23 • 23

upvoted 3 papers 3 months ago

F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8 • 31

RewardDance: Reward Scaling in Visual Generation

Paper • 2509.08826 • Published Sep 10 • 73

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15 • 104

upvoted 4 articles 3 months ago

Asynchronous Robot Inference: Decoupling Action Prediction and Execution

Article • Jul 10 • 45

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Article • Sep 11 • 166

Introducing AI Sheets: a tool to work with datasets using open AI models!

Article • Aug 8 • 106

TimeScope: How Long Can Your Video Large Multimodal Model Go?

Article • Jul 23 • 46