MIDI

Create Complete 3D Scenes from a Single Image

MIDI is an open-source model for generating 3D scenes from a single image. It generates multiple 3D objects simultaneously, with correct spatial relationships between them, in about 40 seconds.
Free

Zac Zuo

Hi everyone!

Found something cool – MIDI, a new open-source project that generates a complete 3D scene from a single image!

What's special:

🖼️ A single image (with objects segmented) creates a full 3D scene.
🧩 Multi-instance diffusion generates all objects simultaneously, ensuring correct scene layout.
🤝 A new multi-instance attention mechanism handles interactions between objects.
💨 It's fast – generating a scene in as little as 40 seconds.
🔓 Code, model, and a demo are all open-source.
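
For anyone curious what "multi-instance attention" could look like in practice, here is a toy sketch in plain Python. It is not MIDI's actual code: the function names, the 2D token vectors, and the missing learned projections are all assumptions made for illustration. The idea it shows is that each object's tokens attend to the tokens of every object in the scene, rather than only to their own.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    # Scaled dot-product attention on plain lists of vectors
    # (no learned projections; illustration only).
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

def per_instance_attention(instances):
    # Baseline: each object only sees its own tokens.
    return [attend(inst, inst, inst) for inst in instances]

def multi_instance_attention(instances):
    # Hypothetical multi-instance variant: each object's tokens
    # query the tokens of ALL objects in the scene.
    all_tokens = [tok for inst in instances for tok in inst]
    return [attend(inst, all_tokens, all_tokens) for inst in instances]

# Two made-up "objects": a chair with two tokens, a table with one.
chair = [[1.0, 0.0], [0.8, 0.2]]
table = [[0.0, 1.0]]
scene = [chair, table]

separate = per_instance_attention(scene)
joint = multi_instance_attention(scene)
```

With per-object attention the table's single token comes back unchanged, while in the joint version it picks up information from the chair's tokens, which is the kind of cross-object interaction the bullet above describes.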

This significantly advances single-image 3D generation. A collaboration between VAST and university researchers made it happen.

Try MIDI here to see the magic happen!