John Kueh

Article · Updated June 2026

A cinematic hero for 80 cents

Here are three versions of the same landing-page hero on glp3.wiki. Each came from a different AI model, and each is a step closer to feeling real.

Hero with a soft pastel watercolour background and a dark headline.
V1 — a pastel watercolour from Gemini's Nanobanana. Pretty, but illustrated.
Hero with a photoreal cinematic still of a figure by a window and a white headline.
V2 — a photoreal still from GPT Image 2. The first leap: a photograph, not a drawing.
Hero with a frame from a cinematic video, the figure mid-motion, and a white headline.
V3 — that same still, animated by Veo 3.1. The second leap: it moves.

Watercolour, to photograph, to alive. The headline never changes; only the image does. The last version cost eighty cents, and it's the one that made the page stop looking like a brochure.

Watercolour to photograph

glp3.wiki started in soft pastel washes — Nanobanana, Gemini's image model, doing a Helen-Frankenthaler soak-stain thing on warm cream paper. I still like it. But a watercolour is decoration: it sets a mood and says nothing. For a page whose whole pitch is "the research, explained," I wanted a hero that read as real — a real room, real light, a real person.

That's what GPT Image 2 gave me. One prompt — a lone figure in a quiet, light-filled room, shot-on-film, deep negative space for the headline — and the page went from illustrated to photographed. That's V2, and on its own it was already a hero I'd ship.

A photograph is still a photograph

And that's the ceiling. A still, however good, is a frozen instant. You feel it the moment the page loads and nothing happens. The obvious next move is video — but I didn't want a new, unrelated clip. I wanted this frame, the one I'd already art-directed, to keep going.

That's the corner of this technology that actually works: image-to-video. You hand the model your still as the first frame, and it animates outward from it. Identity, framing, colour grade — all locked, because frame one is a picture you already approved. The model only has to invent motion.

The still, alive

Veo 3.1 takes the GPT Image still and moves it. Same frame on the left; on the right, it breathes.

A still frame of a figure standing in a sunlit room.
The still (GPT Image 2).
The same frame, moving (Veo 3.1).

Eighty cents of difference, and the page tips from "nice photo" to "alive." But getting there cleanly meant making three calls that mattered more than the model did.

When the loop is the problem

My first instinct was to loop the clip, and that ate the afternoon. A loop has to hide its seam — the jump where the last frame snaps back to the first. Ping-pong (play forward, then backward) is seamless by construction and looks wrong the instant there's any direction in the shot: the motion visibly runs in reverse. Generating the clip so its last frame matches its first, plus a half-second crossfade, genuinely works — I built it.
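As a rough sketch of the two seam-hiding options, here is the frame order a ping-pong loop plays and the opacity ramp a half-second crossfade steps through. The function names and the 24 fps figure are assumptions for illustration; this is not the code behind the glp3 hero.

```python
def ping_pong_order(frame_count: int) -> list[int]:
    """Frame indices for one forward-then-backward lap of a clip."""
    forward = list(range(frame_count))
    # Walk back from the second-to-last frame down to the second frame,
    # so neither endpoint is shown twice in a row when the lap repeats.
    backward = forward[-2:0:-1]
    return forward + backward


def crossfade_weights(fps: int, fade_seconds: float) -> list[float]:
    """Opacity of the incoming first-frame copy at each step of the fade.

    Rises linearly from 0.0 (only the tail of the clip is visible)
    to 1.0 (only the head of the clip is visible).
    """
    steps = max(1, round(fps * fade_seconds))
    return [i / steps for i in range(steps + 1)]


if __name__ == "__main__":
    print(ping_pong_order(4))
    print(crossfade_weights(fps=24, fade_seconds=0.5))
```

For a four-frame clip the ping-pong order is 0, 1, 2, 3, 2, 1. The second half is the first half in reverse, which is why any directional motion visibly runs backwards.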

Then I looked at superpower.com, whose aesthetic I was chasing, and noticed their hero doesn't loop at all. It plays once and freezes on the last frame. No second lap, no seam to hide. The problem that had eaten my afternoon was solved by deleting one HTML attribute. That's the version that shipped.

A meter you can read

The reason I'll iterate on video at all is the pricing. The image API charges per picture: the same per-shot meter I wrote about paying for twice, cheap enough to use, expensive enough to make me flinch. Veo bills a flat rate per second of output, so the cost of a clip is knowable before I press go. My skill quotes it every time.

| Veo 3.1 tier | Price at 720p | An 8s hero |
| --- | --- | --- |
| Fast | $0.10 / sec | $0.80 |
| Standard | $0.40 / sec | $3.20 |
| Lite | $0.05 / sec | $0.40 |

The glp3 hero is eight seconds of Fast at 720p — eighty cents. The skill prints that line before every run, so there's no meter ticking in the back of my head. The number is on the screen.
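The quote is simple arithmetic: seconds times the tier's per-second rate. Here is a minimal sketch of how such a line could be produced, using the 720p prices from the table. The function name and output format are illustrative, not the skill's actual code.

```python
from decimal import Decimal

# Per-second prices at 720p, copied from the pricing table in this article.
PRICE_PER_SECOND = {
    "fast": Decimal("0.10"),
    "standard": Decimal("0.40"),
    "lite": Decimal("0.05"),
}


def quote(tier: str, seconds: int) -> str:
    """Return a one-line cost quote for a clip, before anything is rendered."""
    # Decimal keeps the cents exact instead of relying on binary floats.
    cost = PRICE_PER_SECOND[tier] * seconds
    return f"Veo 3.1 {tier.capitalize()}, {seconds}s at 720p: ${cost:.2f}"


if __name__ == "__main__":
    print(quote("fast", 8))
```

Run for the glp3 hero, `quote("fast", 8)` gives `Veo 3.1 Fast, 8s at 720p: $0.80`.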

The real cost sink was the safety filter

The expensive failures weren't slow renders. They were safety-filter rejections, free in dollars but costly in round-trips. The glp3 audience skews fitness, so the obvious hero is a lean figure, and Veo kept rendering the clip, then refusing to hand it over, reading a shirtless subject as physique content. The fix: strip the language describing the body out of the prompt ("a person," plain verbs, describe the room and the light) and keep the "allow adult" flag on.
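A pre-flight check along these lines could flag risky wording before it costs a round-trip. This is only a sketch: the term list is a guess based on the rejections described above, not anything Veo publishes, and the function name is hypothetical.

```python
import re

# Hypothetical list of body-focused words, assumed from the rejections
# described in this article. Veo does not document its filter vocabulary.
RISKY_TERMS = ("shirtless", "physique", "muscular", "toned", "abs", "lean")


def filter_warnings(prompt: str) -> list[str]:
    """Return the risky terms that appear as whole words in the prompt."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return [term for term in RISKY_TERMS if term in words]


if __name__ == "__main__":
    print(filter_warnings("A lean, shirtless man stretches by a window"))
    print(filter_warnings("A person stands in a sunlit room"))
```

Matching whole words keeps the check from firing on harmless substrings, such as "abs" inside "absorbed".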

The sharper finding came from testing the cheap tier. Veo Lite is half the price of Fast, and it rejected the exact prompt Fast had accepted. So I gave it a fully clothed subject instead, and it sailed through for twenty cents:

A still of a person in a knit sweater holding a mug by a window.
Clothed still — generated on GPT Image 2.
Animated on Veo Lite — 20¢, no rejection.

So the filter isn't anti-human, it's anti-physique — and Lite draws the line tighter than Fast. The rule I baked into the skill: Fast is the floor for any figure; Lite is for landscapes, objects, and clothed, ordinary subjects.
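That rule is small enough to write down as a function. This is a minimal sketch, assuming the caller already knows whether the shot contains a person and whether that person is fully clothed. It reads "Fast is the floor for any figure" as applying to figures that are not clothed, ordinary subjects. The names are hypothetical and the real skill may encode the rule differently.

```python
def pick_tier(has_figure: bool, fully_clothed: bool = False) -> str:
    """Pick the cheapest Veo 3.1 tier expected to pass the safety filter."""
    if has_figure and not fully_clothed:
        # Lite rejected the physique prompt that Fast accepted.
        return "fast"
    # Landscapes, objects, and clothed, ordinary subjects.
    return "lite"


if __name__ == "__main__":
    print(pick_tier(has_figure=True))
    print(pick_tier(has_figure=True, fully_clothed=True))
    print(pick_tier(has_figure=False))
```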

Where it ends up

Three models, each doing what it's best at: Nanobanana set the mood, GPT Image 2 made it real, Veo made it move. The hero ships on the glp3.wiki rebrand — a still I generated, animated for eighty cents, playing once and settling into a frame that holds. And it all folded into a Claude Code skill, veo-video-gen, that quotes the cost up front, defaults to play-once, and warns me when my wording is about to trip the filter.

None of it was a model breakthrough. It was knowing which model to reach for at each step, that the loop was the wrong instinct, and that a third of my failures were a filter I could talk my way around. The leverage, again, wasn't the AI. It was the judgment about how to use it.

Frequently asked

What did the moving hero actually cost?

Eighty cents: an 8-second clip at 720p on Veo 3.1 Fast, which bills $0.10 a second. The still it animates cost nothing extra, generated with my GPT Image 2 skill under my ChatGPT plan. A handful of safety-filtered attempts cost nothing; the filter doesn't charge for what it refuses.

Why three different models?

Each is best at a different job. Nanobanana (Gemini) does soft, painterly imagery cheaply. GPT Image 2 does the most convincing photoreal stills I've used. Veo 3.1 does reliable image-to-video. I'm not loyal to a vendor — I reach for whichever model wins the specific step.

Why play once instead of looping?

A hero loop has to hide its seam. Ping-pong runs the motion backwards, which looks wrong; matching the first and last frame with a crossfade works but is fiddly. superpower.com just plays its hero once and freezes on the last frame — no seam to hide. The cheapest fix was deleting the loop.

Does the cheaper Lite tier work for people?

For clothed, ordinary subjects, yes — I animated one for 20 cents. But Lite's safety filter rejects physique and shirtless content that Fast accepts, so for a fit-subject hero Fast is the floor. The savings only hold for landscapes, objects, or fully-clothed figures.