Writing a great video prompt

Video models reward specificity. Vague prompts produce vague output. The biggest single lever for quality is treating the prompt like a shot list: subject, action, scene, camera, style. Nothing more, nothing less.

The five-part formula

Structure every prompt as Subject, Action, Scene, Camera, Style, in that order. Models pay disproportionate attention to the first clause, so put your subject there. Keep the total length between 60 and 120 words: shorter prompts under-direct the model, and longer ones dilute its attention.

Subject

Who or what is in the shot. Age, build, clothing, distinguishing features.

Action

What they do, in the present continuous or simple present.

Scene

Location, time of day, weather, key props, background detail.

Camera

Shot size, angle, lens, movement. Named verbs like dolly, pan, crane.

Style

Lighting, color palette, film stock, grain, mood words.
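The five parts can be assembled mechanically, which makes it harder to forget one or to shuffle the order. The following is a minimal sketch, not part of any official tooling: `ShotPrompt`, `word_count`, and `length_ok` are hypothetical names chosen for illustration, and counting whitespace-separated words is only an approximation of how a model measures length.

```python
from dataclasses import dataclass

# Length range recommended by the five-part formula.
MIN_WORDS, MAX_WORDS = 60, 120


@dataclass
class ShotPrompt:
    """Hypothetical container holding the five parts in formula order."""
    subject: str
    action: str
    scene: str
    camera: str
    style: str

    def render(self) -> str:
        # Subject, action, and scene form the opening sentence so the
        # subject leads; camera and style follow as their own sentences.
        first = " ".join(p.strip() for p in (self.subject, self.action, self.scene))
        sentences = [first, self.camera.strip(), self.style.strip()]
        return " ".join(s.rstrip(".") + "." for s in sentences if s)


def word_count(prompt: str) -> int:
    # Whitespace-separated words; a rough proxy only.
    return len(prompt.split())


def length_ok(prompt: str) -> bool:
    return MIN_WORDS <= word_count(prompt) <= MAX_WORDS


shot = ShotPrompt(
    subject="A weathered fisherman in a yellow oilskin coat",
    action="mends a net",
    scene="on the deck of a small trawler at dawn",
    camera="Slow dolly forward",
    style="50mm lens, shallow depth of field, desaturated color palette, soft film grain",
)
prompt = shot.render()
print(prompt)
print(word_count(prompt), "words, within range:", length_ok(prompt))
```

This draft comes out well under 60 words, so the check flags it as under-directed. Adding subject and scene detail (age, distinguishing features, weather, background) is the natural way to bring it into range.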

Camera language that works

Models are trained on large numbers of clips labelled with film terminology, so they respond well to named techniques and poorly to metaphor. Use these as a starter vocabulary:

Shot size: extreme wide, wide, medium, medium close-up, close-up, extreme close-up.
Angle: low angle, eye level, high angle, overhead, Dutch tilt.
Movement: static, handheld, dolly in, dolly out, crane up, crane down, tracking shot, arc around subject, slow pan left, whip pan, orbit.
Lens: 24mm, 35mm, 50mm, 85mm, 135mm, macro, tilt-shift, anamorphic.
Depth: shallow depth of field, deep focus, rack focus, bokeh background.

One camera move per prompt. Stacking several moves in a single clause ("dolly in while panning left and craning up") tends to produce visual mush. Pick the move that matters most and let any other motion happen naturally.
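The one-move rule is easy to check automatically before submitting a prompt. The sketch below is a rough heuristic under stated assumptions: `MOVE_PATTERNS` and `camera_moves` are hypothetical names, the patterns cover only the movement vocabulary listed above, and plain keyword matching will misfire on ordinary words (a "frying pan" in the scene would count as a pan).

```python
import re

# One pattern per family of camera move from the starter vocabulary.
# "static" is left out because it describes the absence of a move.
MOVE_PATTERNS = {
    "dolly": r"\bdoll(?:y|ies|ying)\b",
    "pan": r"\bpan(?:s|ning)?\b",        # also catches "whip pan"
    "crane": r"\bcran(?:e|es|ing)\b",
    "tracking": r"\btracking\b",
    "arc": r"\barc(?:s|ing)?\b",
    "orbit": r"\borbit(?:s|ing)?\b",
    "handheld": r"\bhandheld\b",
}


def camera_moves(prompt: str) -> list[str]:
    """Return the distinct move families mentioned in the prompt."""
    text = prompt.lower()
    return [name for name, pattern in MOVE_PATTERNS.items()
            if re.search(pattern, text)]


stacked = "dolly in while panning left and craning up"
single = "Low sun breaks through coastal fog behind him. Slow dolly forward."

print(camera_moves(stacked))  # three families: too many
print(camera_moves(single))   # one family: fine
```

A result longer than one entry is a prompt to trim, not a hard error; the final call on which move matters most stays with the writer.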

Examples that land

Good: specific, cinematic, one camera move

Cinematic wide shot: a weathered fisherman in his sixties, thick
grey beard, yellow oilskin coat, mends a net on the deck of a small
trawler at dawn. The boat rolls gently on calm grey water. Low sun
breaks through coastal fog behind him. Slow dolly forward. 50mm
lens, shallow depth of field, desaturated color palette, soft film
grain.

Weak: vague subject, no camera, abstract mood

An old man doing fisherman stuff, atmospheric, beautiful lighting,
cinematic masterpiece, 4k ultra hd epic.

Style cues that matter

Four lightweight tags will do more than a paragraph of adjectives: a lens, a lighting quality, a color palette, and a film texture. Example style tail: 50mm lens, warm golden hour side light, earthy palette of rust and sage, 16mm film grain.

Avoid filler such as "cinematic masterpiece", "epic", "award-winning", "4K ultra HD", and "masterpiece quality". Phrases like these act as noise: they either do nothing or bias the result toward oversaturated, generic output.
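A small scan can catch these phrases before a prompt is sent. This is an illustrative sketch only: `FILLER_PHRASES` contains just the phrases named above, and `find_filler` is a hypothetical helper, not a feature of the product.

```python
import re

# The filler phrases called out above, lower-cased for matching.
FILLER_PHRASES = [
    "cinematic masterpiece",
    "masterpiece quality",
    "award-winning",
    "4k ultra hd",
    "epic",
]


def find_filler(prompt: str) -> list[str]:
    """Return the filler phrases present in the prompt, in list order."""
    text = prompt.lower()
    found = []
    for phrase in FILLER_PHRASES:
        # Match whole phrases only, so "epic" does not fire inside "epicenter".
        if re.search(r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", text):
            found.append(phrase)
    return found


weak = ("An old man doing fisherman stuff, atmospheric, beautiful lighting, "
        "cinematic masterpiece, 4k ultra hd epic.")
print(find_filler(weak))
```

Extend the list with whatever filler shows up in your own drafts; the matching is case-insensitive because the prompt is lower-cased first.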

Describing motion

Motion is easy to over-specify. One clear action beats three vague ones. "A woman raises a coffee cup to her lips, exhaling steam" is better than "A woman is moving around gracefully, interacting with her environment in a natural way". Concrete, single-subject actions with a clear start and end read best.

What to leave out

Do not write stage directions addressed to the model ("now zoom in", "then cut to"). Models do not interpret editing instructions; they try to render them literally.
Do not describe sound if the model is silent. On silent tiers such as Drama, audio descriptions only waste tokens.
Do not stack three synonymous adjectives ("sad melancholic mournful"). Pick one.
Do not use negation ("no text, no logos, no watermark") in the main prompt, where it inverts unpredictably. Use the dedicated Negative Prompt field instead.
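If existing prompts already contain "no ..." clauses, they can be moved out of the main text and into a separate string destined for the Negative Prompt field. The sketch below assumes negations appear as comma-separated clauses starting with "no"; `split_negatives` is a hypothetical helper, and how the second string reaches the Negative Prompt field is left open because it depends on how you submit the prompt.

```python
def split_negatives(prompt: str) -> tuple[str, str]:
    """Split a prompt into (main prompt, negative prompt).

    Only handles comma-separated clauses that begin with "no ";
    anything phrased differently is left in the main prompt.
    """
    keep, negatives = [], []
    for clause in prompt.rstrip(". ").split(","):
        clause = clause.strip()
        if clause.lower().startswith("no "):
            negatives.append(clause[3:].strip())  # drop the leading "no "
        elif clause:
            keep.append(clause)
    return ", ".join(keep), ", ".join(negatives)


main, negative = split_negatives(
    "A red bicycle leaning on a brick wall, no text, no logos, no watermark."
)
print("Prompt:         ", main)
print("Negative Prompt:", negative)
```

Review the result by hand: a clause such as "no one on the street" is a scene description rather than an exclusion, and this simple split cannot tell the difference.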

Quick checklist

Is my subject in the first ten words?
Is there exactly one camera move?
Am I between 60 and 120 words?
Have I specified a lens, a lighting quality, a color palette, and a film texture?
Have I removed filler such as "cinematic masterpiece", "epic", and "award-winning"?
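The first checklist item, subject position, is simple to verify in code. This is a minimal sketch with hypothetical helper names (`subject_position`, `subject_leads`); it assumes you can name the subject with a single noun and that the noun appears verbatim in the prompt.

```python
import re
from typing import Optional


def subject_position(prompt: str, subject_noun: str) -> Optional[int]:
    """1-based word position of the subject noun, or None if absent."""
    words = re.findall(r"[\w'-]+", prompt.lower())
    try:
        return words.index(subject_noun.lower()) + 1
    except ValueError:
        return None


def subject_leads(prompt: str, subject_noun: str, limit: int = 10) -> bool:
    """True if the subject noun appears within the first `limit` words."""
    position = subject_position(prompt, subject_noun)
    return position is not None and position <= limit


early = "Cinematic wide shot: a weathered fisherman in his sixties mends a net."
late = ("At dawn, on calm grey water under coastal fog, "
        "a small trawler carries a fisherman.")

print(subject_leads(early, "fisherman"))
print(subject_leads(late, "fisherman"))
```

When the check fails, the usual fix is to reorder the sentence so the subject comes first and the scene detail trails it.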

Where to go next

Using references for attaching images to lock subject or style.
Character consistency for keeping the same person across shots.
