OpenAI’s new video model, Sora, has been causing quite a buzz in the AI community. Many are comparing it to existing AI video tools, such as Runway, to see just how good it really is. One interesting aspect of Sora is the way it handles users’ prompts. For example, when asked to create a reflection in a window, Sora relies on a new training approach based on spacetime patches to predict how objects move through time.
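OpenAI’s technical report describes spacetime patches only at a high level, but the general idea can be sketched: cut a clip into small blocks that span both space and time, then flatten each block into a token, much as a language model works on word pieces rather than individual letters. The snippet below is a minimal NumPy illustration of that idea; the clip size, patch dimensions, and variable names are all assumptions made for the sake of the example, not OpenAI’s actual implementation.

```python
# Minimal sketch of the "spacetime patch" idea (illustrative assumptions only;
# OpenAI has not published the real implementation details).
import numpy as np

# A toy clip: 32 frames of 128x128 RGB video, laid out as (time, height, width, channels).
T, H, W, C = 32, 128, 128, 3
video = np.random.rand(T, H, W, C).astype(np.float32)

# Predicting every pixel directly means modelling every one of these values:
# 32 * 128 * 128 * 3 = 1,572,864 numbers, even for this tiny clip.
print("raw pixel values:", video.size)

# Instead, cut the clip into small blocks that span space AND time
# (here 4 frames x 16 x 16 pixels) and flatten each block into one "token".
pt, ph, pw = 4, 16, 16
patches = video.reshape(T // pt, pt, H // ph, ph, W // pw, pw, C)
patches = patches.transpose(0, 2, 4, 1, 3, 5, 6)   # move the patch-index axes to the front
tokens = patches.reshape(-1, pt * ph * pw * C)     # one row per spacetime patch

# 8 * 8 * 8 = 512 tokens of 3,072 values each, instead of ~1.6 million
# independent pixels the model would otherwise have to keep consistent.
print("spacetime patch tokens:", tokens.shape)     # (512, 3072)
```

Working with a few hundred patch tokens rather than millions of individual pixel values is the kind of compression the video below alludes to when it compares Sora’s patches to the tokens a chatbot uses for text.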
Unlike other AI models, Sora is said to exhibit emergent behavior in videos without being explicitly programmed to do so. This means it can produce realistic camera movements and lighting effects, and even generate Minecraft clips that look impressively realistic. However, Sora is not without its flaws: it still makes mistakes like creating floating chairs or giving Minecraft players extra hearts.
Overall, Sora is showing great potential in the world of AI video modeling. Its ability to learn from vast amounts of data and produce groundbreaking results is truly impressive. Only time will tell just how far AI intelligence like Sora’s can go, but for now, it’s safe to say that OpenAI’s new video model is definitely a game-changer.
Watch the video by Space Kangaroo
Video Transcript
Did you know this was made by AI? OpenAI’s new video model has been everywhere, but I haven’t seen anyone actually compare it to what’s currently out there, and they gave us the prompts, so I decided to put them in Runway, which is the best AI video tool that’s been available. And it’s pretty similar, but that’s kind of because dogs are the worst possible benchmark to use for AI art: you can ask DALL-E to draw the ugliest dog possible and it still draws something cute. Humans just aren’t wired to care how dogs look, so even though this one went the most viral, it was the least interesting.
But the most interesting one actually tells us a lot about how Sora works. The prompt tells the AI to make a reflection in a window. Now, the easiest way to do this would be to have the neural network try to predict the color of every pixel in every frame of video. Unfortunately, this is also the dumbest way of doing it, because it’s just too many calculations that have to be precise with not enough information. If the color of an eyeball is supposed to be green and the AI makes it a slightly different shade of green through the course of the video, the whole thing looks strange and bad. And while most modern video AIs have setups more complicated than just predicting pixel color alone, Sora is on another level. In their paper, OpenAI talks about a new training method called spacetime patches: just like how tokens in ChatGPT are words or parts of words instead of individual letters, Sora is doing the same thing with objects as they move through time.
Beyond that, OpenAI doesn’t really say how it works, but they do claim that all the camera movement, the lighting, even this Minecraft clip it generated, is considered emergent behavior: they didn’t program it to do any of this. Here they show the difference the sheer amount of training made, and you can see the extra training when comparing to Runway, from obvious things like this mammoth not looking like it was near a nuclear reactor, to the less obvious things like the model’s reading comprehension. Now, Sora still makes mistakes: it can’t show glass breaking, it likes to make floating chairs, and it gave the Minecraft player an extra heart, but overall it’s pretty good.
In general, it’s kind of unknown the path AI will take: maybe it’ll plateau and only improve when researchers find a new way of training models, maybe it will keep getting linearly better. But Sora is really just blowing everything out of the water, mainly because it was trained with more data than anything else. AI intelligence might follow something more like Moore’s law.
The video “How Good is OpenAI’s New Video Model?” was uploaded on 03/10/2024 to the YouTube channel Space Kangaroo.