Microsoft partner OpenAI is launching a new text-to-video artificial intelligence model called Sora.
OpenAI co-founder and CEO Sam Altman took to Twitter to announce that the company is opening access to “a limited number of creators.”
While Sora is currently limited to ‘red team’ users who are evaluating its potential for harm and risk to the general public, the diffusion model will eventually be available for more people to play around with.
According to the Sora website, the model uses a transformer architecture and builds on past research from DALL-E 3, borrowing its approach of generating highly ‘descriptive captions’ for the training data eventually used for video creation.
With its emphasis on the descriptive captions approach from DALL-E 3, Sora “is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background. The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
Although the Sora website hosts a ton of elegant video examples, the company wants users to know the model has limitations, including struggling to simulate the physics of complex scenes and to understand “specific instances of cause and effect.”
Sora may also stumble when recreating text prompts visually in areas such as spatial detail, for example confusing left and right, and precise descriptions of events that unfold over time.
Sora marks another advancement for AI, even though Google began playing around in the space late last year. Google’s Lumiere project serves a remarkably similar audience to Sora but with more restrictions, such as a five-second limit on clips, while Sora can produce videos up to a minute long.
Lumiere also uses a fundamentally different diffusion model that attempts to extend a still image through space and time using a new model called Space-Time U-Net, or STUNet. Another player in the text-to-video AI race is Runway, which showed up in mid-2022 and basically offered what amounted to an impressive parallax effect on images.
OpenAI’s Sora builds on the foundation Runway and Lumiere have established and will add another feature to the AI tool set for users.