Editing scenes

After the script split, the storyboard lands you in the scene grid. Each card is one scene. Click a card to edit it in the right sidebar. This page covers every control the scene editor exposes.

The scene grid

The grid fills the center of Scene Mode. Scenes render as tiles with a thumbnail (or a placeholder if no image has been generated yet), a scene number, status dots, and the first few words of the text chunk.

Status dots appear along the bottom of each tile:

Image dot: filled when a scene image exists.
Audio dot: filled when narration audio exists.
Motion dot: filled when motion video exists.

A tile with all three dots filled is complete. Grey dots mean the asset has not been generated yet.
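The dot logic above can be sketched as a small data model. This is an illustrative sketch only: the `Scene` class, its field names, and both helper functions are hypothetical and are not part of any documented Fawna API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scene:
    # Hypothetical fields: each holds an asset path, or None if not generated yet.
    image: Optional[str] = None
    audio: Optional[str] = None
    motion: Optional[str] = None


def status_dots(scene: Scene) -> dict[str, bool]:
    """Return one flag per dot; True means the dot is filled."""
    return {
        "image": scene.image is not None,
        "audio": scene.audio is not None,
        "motion": scene.motion is not None,
    }


def is_complete(scene: Scene) -> bool:
    """A scene is complete when all three dots are filled."""
    return all(status_dots(scene).values())
```

Under this model, a scene with only an image reports one filled dot and is not complete.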

Selecting scenes

Click a tile to select it. The preview player scrubs to that scene, and the sidebar switches to the scene editor. Ctrl/Cmd-click to multi-select. Shift-click to select a range.
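For readers who think in code, the three click modes can be modeled roughly as follows. This is a sketch under assumptions, not the actual implementation: it assumes Ctrl/Cmd-click toggles a tile in and out of the selection, and that Shift-click replaces the selection with the range between the last-clicked tile (the anchor) and the clicked tile. The function name and parameters are hypothetical.

```python
from typing import Optional


def apply_click(
    selected: set[int],
    anchor: Optional[int],
    clicked: int,
    ctrl: bool = False,
    shift: bool = False,
) -> tuple[set[int], Optional[int]]:
    """Return the new (selection, anchor) after clicking tile index `clicked`."""
    if shift and anchor is not None:
        # Assumed behavior: select the inclusive range from anchor to clicked.
        low, high = sorted((anchor, clicked))
        return set(range(low, high + 1)), anchor
    if ctrl:
        # Assumed behavior: toggle the clicked tile, keep the other tiles.
        return selected ^ {clicked}, clicked
    # Plain click: select only the clicked tile.
    return {clicked}, clicked
```

For example, a plain click on tile 2, a Ctrl/Cmd-click on tile 5, and a Shift-click on tile 7 would leave tiles 5 through 7 selected in this model.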

With multiple scenes selected, a bulk-action bar appears along the bottom:

Select all / Clear
Download zip (every selected scene's assets as a zip archive)
Delete

Text chunk

The text chunk is what the narrator reads for this scene. It is also the primary input to the image generation prompt. Edit it in the sidebar and click Save.

Changing the text does not automatically regenerate the image or audio. You need to explicitly regenerate those assets to apply the change.

Visual direction

Fawna drafts a short visual direction prompt for each scene, inferred from the text. The visual direction, not the raw text chunk, is what gets sent to the image model. You can edit the visual direction directly for finer control.

Example: a text chunk of "The storm was four hours out." might produce a visual direction of "A darkening sky above a pine forest, heavy grey clouds rolling in from the west, early evening light, wide landscape shot, 35mm lens, moody color palette."

Motion prompt

The motion prompt describes what should happen in the scene's video. Like visual direction, Fawna drafts one from the text chunk, and you can edit it. If the motion prompt is empty, the motion generator uses a neutral default: a slow push forward on the image.

Image settings

Image model: Imagen 4 or Gemini 3. Gemini is the default. Imagen tends to be crisper; Gemini is better when you want stylistic flexibility.
Image tier: Fast, Standard, or Ultra. Fast is for drafts, Standard is the default, Ultra is for a hero frame.
Regenerate: Button. Re-runs the model with the current settings and overwrites the scene image.

Motion settings

Motion model: Veo 3.1, Veo 3.1 Fast, or Veo 3.1 Lite. Lite is the cheapest, Fast is the middle tier, full Veo 3.1 is the highest quality.
Resolution: 720p or 1080p.
Audio mode: Video only (no audio track from the motion model) or Video + audio (native Veo audio, which is synced to the generated motion and can include dialogue, SFX, and ambient sound).
Duration: 4, 6, or 8 seconds. Usually match this to the narration audio length.
Regenerate: Button. Requires an image to exist.
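One way to pick a Duration that matches the narration is to take the shortest supported value that still covers the audio. The helper below is a hypothetical sketch of that rule of thumb; the function name is invented for illustration, and falling back to 8 seconds for long narration is an assumption.

```python
SUPPORTED_DURATIONS = (4, 6, 8)  # seconds, as listed under Motion settings


def suggest_motion_duration(narration_seconds: float) -> int:
    """Pick the shortest supported duration that covers the narration.

    If the narration is longer than the longest option, return the longest
    option; video placement then decides how the remainder is filled.
    """
    for duration in SUPPORTED_DURATIONS:
        if duration >= narration_seconds:
            return duration
    return SUPPORTED_DURATIONS[-1]
```

When the narration is longer than 8 seconds, the Video placement setting determines what the viewer sees for the remainder of the scene.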

Audio settings

The audio side is simpler: the speaker defaults to either the global narrator (single-narrator storyboards) or the character assigned by the script split (multi-narrator). You can override it from the dropdown and regenerate. Voice settings sliders are available in the storyboard's Characters panel.

Video placement

The motion video is typically shorter than the narration audio: a motion clip runs 4 to 8 seconds, while the narration runs as long as the voice takes. Video placement controls when during the scene the motion plays. For the rest of the scene, the viewer sees the static image.

Placement	Behavior
Loop	Motion loops for the whole scene duration.
Start	Motion plays at scene start, static image for the rest.
End	Static image first, then motion plays at the end of the scene.
Middle	Motion centered in the scene, static before and after.
Custom	Set start and end offsets manually.
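The placement modes in the table can be expressed as a time window within the scene. The following is a minimal sketch, assuming the scene length equals the narration length and that Custom offsets are both measured in seconds from the start of the scene. The function name, the lowercase mode strings, and the parameters are hypothetical.

```python
from typing import Optional


def motion_window(
    placement: str,
    scene_seconds: float,
    motion_seconds: float,
    custom_start: Optional[float] = None,
    custom_end: Optional[float] = None,
) -> tuple[float, float]:
    """Return (start, end) in seconds for when motion is on screen."""
    # A clip cannot occupy more time than the scene itself.
    clip = min(motion_seconds, scene_seconds)
    if placement == "loop":
        return (0.0, scene_seconds)  # motion repeats for the whole scene
    if placement == "start":
        return (0.0, clip)
    if placement == "end":
        return (scene_seconds - clip, scene_seconds)
    if placement == "middle":
        start = (scene_seconds - clip) / 2
        return (start, start + clip)
    if placement == "custom":
        if custom_start is None or custom_end is None:
            raise ValueError("custom placement needs both offsets")
        if not 0 <= custom_start < custom_end <= scene_seconds:
            raise ValueError("offsets must fall inside the scene")
        return (custom_start, custom_end)
    raise ValueError(f"unknown placement: {placement}")
```

Outside the returned window, the static image is shown. For a 10-second scene with a 4-second clip, Middle yields a window from 3 to 7 seconds in this model.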

Trimming

Per-scene trim fields (Trim in, Trim out) let you shave off the start or end of the audio without regenerating it. Useful if the TTS produces a small leading silence or trailing pause that throws off pacing.
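The effect of trimming on scene length is simple arithmetic. The sketch below assumes both fields are amounts in seconds removed from the respective end of the audio. That interpretation, and the function name, are assumptions rather than documented behavior.

```python
def trimmed_duration(
    audio_seconds: float, trim_in: float = 0.0, trim_out: float = 0.0
) -> float:
    """Return the playable audio length after trimming both ends."""
    if trim_in < 0 or trim_out < 0:
        raise ValueError("trim values must be non-negative")
    if trim_in + trim_out > audio_seconds:
        raise ValueError("trims exceed the audio length")
    return audio_seconds - trim_in - trim_out
```

For example, shaving a quarter second of silence from each end of a 7.5-second narration leaves 7 seconds of audio.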

Splitting a scene

The right-click menu on a scene card includes Split here, which divides the scene into two at its midpoint. This is useful when the split model merged two beats that should really be separate shots. The new scene inherits the text and settings, and you can then edit each half independently.

Reordering

Drag scene tiles to reorder. The script stays the same, but the visual order of scenes in the final video follows the grid. Reordering is rarely useful in linear narration, but it occasionally helps when the split model produced an awkward chunk boundary.
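Conceptually, a drag is a move within an ordered list, and the final video follows the resulting order. A minimal sketch, with a hypothetical function name and scene identifiers:

```python
def move_scene(order: list[str], from_index: int, to_index: int) -> list[str]:
    """Return a new scene order with one scene moved to a new position."""
    new_order = list(order)  # copy so the original order is left untouched
    scene = new_order.pop(from_index)
    new_order.insert(to_index, scene)
    return new_order
```

Here `to_index` is the position the scene should occupy after the move.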

Where to go next

Generation pipeline: bulk generation, progress tracking, failures.
Importing to the editor: the next step once scenes are in good shape.
