‘an expensive slot machine that outputs slop 98% of the time’
podcast and blog post tomorrow, i am ill
yes that’s a clothes hanger at the back i forgot to put in the hall, we only reveal our clean laundry here
It’s quite noteworthy how often these shots start out somewhat okay at the first prompt, but then deteriorate markedly over the following seconds.
As a layperson, I would try to explain this as follows: At the beginning, the AI is - to some extent - free to “pick” what the characters and their surroundings look like (while staying within the constraints of the prompt, of course, even if this doesn’t always work out either).
Therefore, the AI can basically “fill in the blanks” from its training data and create something that may look somewhat impressive at first glance.
However, for continuing the shot, the AI is now stuck with these characters and surroundings while having to follow a plot that may not be represented in its training data, especially not for the characters and surroundings it had picked. This is why we frequently see inconsistencies, deviations from the prompt or just plain nonsense.
If I am right about this assumption, it might be very difficult to improve these video generators, I guess (because an unrealistic amount of additional training data would be required).
Edit: According to other people, it may also be related to memory/hardware etc. In that case, my guesses above may not apply. Or maybe it is a mixture of both.
I think it is impressive, but who uses this? This is clearly not ready for anything useful. Right now it is just an expensive toy.
That was entertaining!
Get well soon! Drink lots of fluids and watch some good movies (the non-AI kind).
A random thought occurred to me that Veo is probably “better” at being prompted with something like “the scene from X movie but with Y instead of Z” as it is a plagiarism engine first and foremost. I will never investigate this hypothesis, nor will I hold onto it past typing it here.
Missed opportunity to call it “Live Free or Fail Hard”
Kaneda just scooting to the side at the 14:05 mark like a Looney Tunes character caught with their pants down is comedy gold. I want to loop it with a MIDI rendition of Joplin’s “The Entertainer”.