Thought Bubble 1: AI Video Is Here

Dot-Com Hype, Cory Doctorow, and Prompt Engineering

David Priest
February 16, 2024

Into the VectorVerse is meant to be a blog for regular people, technical and non-technical, who are watching the world be rewritten on all levels by generative AI. To that end, Jenna and I want to dive deep into the technology itself and the ideas behind it.

But generative AI is also fun — to experiment with and to think about — and since we spend a lot of our time interacting with it in our daily work, we wanted to carve out a space for less structured reflections about news, articles, and our own work with it at the end of each week. Hence: Thought Bubbles. Here’s our first one!

AI Video Is Here

In case you missed it, OpenAI just announced Sora, a text-to-video AI tool. You can check out some of the videos the team generated on their website. Right now, the tool is only available to testers, but when it launches, it promises to let you create photorealistic videos up to a minute long.

Wild stuff.

It’s worth pointing out that the hands still suck in these videos — in fact, motion adds a whole new level of uncanniness to the equation, as fingers flap like flags and daytime marketplaces seem to transform in the moments they’re off-frame into evening urban vistas. (Check out my tweet to see for yourself.)

Sora can generate these highly complex scenes... but the hands still look funky, the light source changes, things that are just out of frame completely transform. Fascinating stuff, but I'm still waiting for the use case -- beyond cheap advertising.
— david priest (@David_P_Priest)
8:55 PM • Feb 15, 2024

Imperfections aside, I’m sure many cinematographers, editors, and even animators are feeling some anxiety right now. That’s understandable, and tools like this will likely shake up some industries sooner than you think.

But Sora also raises two big questions for me:

1. Will the proliferation of AI-generated videos make the internet less useful than it already is?

2. Will it make the internet less enjoyable than it already is?

If online videos become even less reliable than they already are, and they aren’t even aesthetically coherent (let alone enjoyable), what exactly is their function? Is it that you could scroll past them, half paying attention, and believe they’re real? Okay… but why? An easy (and likely) answer is spreading misinformation.

But if we zoom out, that’s really only one dimension of a larger pattern: tons of people are about to devise money-making schemes using this new technology. My question is: will that make the internet a more useful, interesting place?

A Question

Sora also reminds me of a bigger question — one I’ve been asking myself for a few months now. I happened to be getting dinner with a bunch of AI experts recently, and I raised it to them, too: “Is AI a bubble?”

I was surprised at the range of perspectives on it: one seasoned engineer said the rise of generative AI reminded him of the rise of cloud technology — particularly the strong feelings of early adopters and detractors, even as companies begin to shift strategy to account for the new technology. His opinion: not a bubble.

Others at the table were less optimistic, arguing that AI isn’t far off from cryptocurrency. I feel ambivalent: I’ve been reading a lot over the past year about the dot-com bubble of nearly twenty-five years ago, and I think it’s not a bad point of comparison for generative AI. Sure, in the mad excitement for the World Wide Web, enabled by quickening connections and widening bandwidth, the industry over-indexed on the value of websites (most famously, Pets.com). But in the aftermath of the bubble, which popped in 2000, we were left with plenty of successful online companies, and lots of fertile ground for new, genuinely valuable internet tech to emerge, perhaps most notably cloud technology.

Lots of people are over-indexing on generative AI, jamming the ChatGPT peg into every hole they can find, regardless of its shape. But generative AI is laying a foundation that will legitimately change the world we inhabit. The post-dot-com-bubble internet, after all, still saw the rise of Amazon Web Services, PayPal, Facebook, and Google.

Shortly after our dinner, Jenna sent me an article in which Cory Doctorow writes, seemingly with a shrug, “Of course AI is a bubble.” The real question, he says, is what kind of bubble it is. It’s a thoughtful article that raises good questions — particularly, what of importance will remain when the AI bubble inevitably deflates. That’s the question of a realist trying to find the truth in a world of idealists and doomers — and identifying the promise rightly will prove the difference between 2030’s AI equivalents of Pets.com and Google.com.

A Discovery

I want to think practically about generative AI. What can it actually do? The more I ask this, the more I’m struck by how much of a skill prompt engineering is. Think about it: prompting is how nearly everyone who interacts with this new technology does so, and how we prompt leads to wildly different outcomes.

I spent the past week tinkering with Amazon PartyRock, building a tool to help authors improve search engine optimization for their articles — mainly because I see lots of smart people who want to reach more readers via an increasingly noisy internet. (And yes, we’ll be talking about enshittification very soon.)

So I put together this tool where authors could input their articles and get a variety of suggestions for titles, subheds, meta text, keywords, and so on. Mostly, this was an experiment — and when I tested it, the results seemed promising. As soon as I shared it with some colleagues, though, problems emerged. In fact, the quality of suggestions tanked.

After some frustration (and some brief venting to Jenna), I decided to try something: suggesting that authors input only their introductions rather than the whole piece. This was inspired by the really fantastic writing on prompt engineering by people smarter than me, notably Viktoria Semaan’s excellent blog post on mastering prompt engineering.

My idea was simple: generative AI tends to give better responses when it has shorter prompts to make sense of and respond to. The result? Responses immediately improved.
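For the curious, here is a minimal sketch of that idea in Python. It is not the PartyRock tool itself: the function names, the two-paragraph cutoff, the word cap, and the prompt wording are all assumptions made up for illustration. It only shows the trimming step, where the model gets the article's opening instead of the whole thing.

```python
def extract_introduction(article: str, max_paragraphs: int = 2, max_words: int = 150) -> str:
    """Return the opening paragraphs of an article, capped at max_words.

    Hypothetical helper: the cutoffs are arbitrary defaults, not tuned values.
    """
    # Treat blank lines as paragraph breaks and drop empty chunks.
    paragraphs = [p.strip() for p in article.split("\n\n") if p.strip()]
    intro = " ".join(paragraphs[:max_paragraphs])
    # Cap the length so a very long opening still yields a short prompt.
    words = intro.split()
    return " ".join(words[:max_words])


def build_seo_prompt(article: str) -> str:
    """Build a short prompt from the introduction only (illustrative wording)."""
    intro = extract_introduction(article)
    return (
        "Suggest three titles, two subheds, and five keywords "
        "for an article that opens like this:\n\n" + intro
    )


if __name__ == "__main__":
    sample = "Opening paragraph.\n\nSecond paragraph.\n\nA much later section."
    print(build_seo_prompt(sample))
```

The resulting string could be pasted into whatever generative AI tool you use. The point is simply that the model sees a few focused paragraphs rather than several thousand words.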

This isn’t revolutionary, but I’m increasingly convinced understanding the basics of prompt engineering will be a prerequisite for more and more technical and non-technical jobs. Read up on it sometime.
