
OpenAI’s latest model takes video generation to the next level

OpenAI has dipped its toes, or should I say its whole body, into the world of video generation.

Aaron O'Leary
February 16th, 2024
Following in the footsteps of startups like RunwayML, OpenAI announced Sora, a text-to-video AI model capable of producing stunning, almost concerning, results.
It was announced yesterday, out of the blue, and it quickly took social media by storm. OpenAI CEO Sam Altman generated a number of videos based on people’s suggested prompts, including dogs recording a podcast, a drone race on Mars, and a variety of sea creatures riding bikes.
Sora works like the rest of OpenAI’s offerings: enter a prompt as simple or as detailed as you like, and it will generate a 1080p video up to a minute long in whatever style you want, populated with things, people, animals, and different environments. You can also craft your blockbuster movie by dropping in a still image, which the AI will then animate, or a video, which Sora can extend.
According to OpenAI, Sora was trained on around 10,000 hours of “high quality video” and is built on a transformer architecture, which apparently gives the model superior scaling performance. It also uses the “recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data.”
Safety was a big concern for the team as well, so it’s not open to the public yet. Rather, the company is working with “red-teamers” — experts in things like misinformation, hate content, and bias — who will be testing the model thoroughly before any release to the wider public.
Sora, for all of its mind-blowing capabilities, isn’t perfect, and the team recognizes its weaknesses, particularly when it comes to physics, saying, “It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect.”
As mentioned, Sora isn’t currently available to the wider public, and there’s no release date yet. However, you can keep replying to Sam Altman in the hope that he will generate a video from your prompt, or you can take a look at this curated gallery of examples made by a maker.
This article originally appeared in the Product Hunt Daily Digest.