Last Thursday OpenAI introduced Sora, its premier text-to-video generator, with beautiful, shockingly realistic videos showcasing the AI model’s capabilities. Sora is now available to a small number of researchers and creative types who will test the model before a broader public release. Some say this technology could spell disaster for the film industry and make our collective deepfake problem worse.
“Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background,” said OpenAI in a blog post. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
Sora is OpenAI’s first venture into AI video generation, adding to the company’s AI-powered text and image generators, ChatGPT and DALL-E. It’s unique because it’s less of a creative tool and more of a “data-driven physics engine,” as pointed out by senior Nvidia researcher Dr. Jim Fan. Sora does not just generate an image; it works out the physics of an object in its environment and renders a video based on those calculations.
To generate videos with Sora, users simply type in a few sentences as a prompt, much like they would with AI image generators. They can choose between a photorealistic and an animated style, and Sora produces shocking results in just a few minutes.
Sora is a diffusion model, meaning it generates video by starting with a blurry, static-filled video and gradually refining it into the polished version. The videos Sora produces are longer, more dynamic, and flow together better than those from competitors. Sora feels like it creates real videos, whereas competitor models feel like a stop motion of AI images.
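For readers who like to see the idea in code, here is a toy sketch of that "start with static, clean it up step by step" loop. It is not how Sora is built: OpenAI has not published its code, and everything named here (`STEPS`, `alpha_bar`, `oracle_noise_estimate`, `denoise`) is made up for illustration. The biggest assumption is the "oracle," which cheats by peeking at the finished picture. A real diffusion model replaces it with a neural network that has learned to estimate the noise from training data. The "frame" is just a short list of numbers standing in for pixels.

```python
import math
import random

STEPS = 50  # hypothetical number of clean-up passes


def alpha_bar(t):
    """Share of the clean signal present at step t (1.0 = fully clean).
    A simple made-up schedule; real models use tuned ones."""
    return 1.0 - t / (STEPS + 1)


def oracle_noise_estimate(noisy, clean, t):
    """Stand-in for the trained network: guesses the noise in each value.
    It cheats by looking at the clean target, which a real model cannot do."""
    a = alpha_bar(t)
    return [(n - math.sqrt(a) * c) / math.sqrt(1.0 - a)
            for n, c in zip(noisy, clean)]


def denoise(clean_target, seed=0):
    """Start from random static and remove a little noise on every pass."""
    rng = random.Random(seed)
    frame = [rng.gauss(0.0, 1.0) for _ in clean_target]  # pure static
    for t in range(STEPS, 0, -1):
        noise = oracle_noise_estimate(frame, clean_target, t)
        a_now, a_next = alpha_bar(t), alpha_bar(t - 1)
        updated = []
        for value, eps in zip(frame, noise):
            # What the clean value would be if the noise guess is right.
            clean_guess = (value - math.sqrt(1.0 - a_now) * eps) / math.sqrt(a_now)
            # Step to a slightly less noisy version of the frame.
            updated.append(math.sqrt(a_next) * clean_guess
                           + math.sqrt(1.0 - a_next) * eps)
        frame = updated
    return frame


if __name__ == "__main__":
    target = [0.0, 0.5, 1.0, 0.5]  # four pretend pixel values
    print(denoise(target))
```

Because the oracle knows the answer, this toy lands almost exactly on the target. With a real model the noise guesses are learned rather than looked up, and the text prompt steers which picture the static turns into.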
AI-generated video has come a long way with Sora. These sample videos would have taken a real film crew or team of animators hours to produce. Sora will likely be disruptive to the film industry in the same way that ChatGPT and AI image generators have shocked the editorial and design world. It’s a technology that is both remarkable and frightening in terms of job security for video creators, computer game designers, and programmers.
OpenAI says there are still a few kinks to be worked out, including the model’s trouble understanding cause and effect. Sora may generate a video of a person taking a bite out of a cookie, but afterward the cookie might not have a bite mark. OpenAI also says the model lacks spatial awareness. It may confuse left and right, and it may not understand how a person or object interacts with a scene.
Safety is also a primary concern, especially given how AI technology has been abused to create deepfakes in recent months. OpenAI says it will build tools to help detect misleading content, as well as apply existing technologies that reject harmful text prompts. However, given the way people have gotten around the protections of current AI models, it’s questionable how successful these efforts will be.
OpenAI didn’t say when Sora will be released to the public, and you can bet OpenAI’s competitors are burning the midnight oil to stay in this race.
OpenAI talks about Sora (blog post)
https://openai.com/sora
Videos created by Sora
https://www.youtube.com/watch?v=HK6y8DAPN_0
If you have 20 minutes to invest, you can learn about the 10 stages of AI
https://www.youtube.com/watch?v=tFx_UNW9I1U
We are currently at Stage 5 and moving very fast