Narrators and voices

Scene Mode supports single-narrator and multi-narrator stories. Every scene is narrated by exactly one voice, but the voice can change from scene to scene depending on the character assigned to that scene. This page covers how the cast works, how to set voices, and how dialogue scripts are parsed.

Single-narrator (the default)

When the Multi-narrator toggle is off, one voice reads every scene in the storyboard. You pick the voice when you create the storyboard. All scenes inherit that voice unless you override them one-by-one in the scene editor later.

Single-narrator is the right choice for:

Explainers, tutorials, product walkthroughs.
Essay-style monologues.
Any script where a single voice carries the whole story.

Multi-narrator

Check Multi-narrator on the Create page and a Cast panel appears. Each row is a speaking character. Fawna auto-inserts a Narrator row that cannot be deleted: it owns any script line that does not have a character prefix.

For each character row, you set:

Name: What the character is called. The script prefix must match this name letter for letter, although case is ignored (maya: and Maya: both match a row named Maya).
Voice: Either Auto (Fawna picks a voice based on name and gender) or Pick (you choose the voice yourself from the provider list).
Preset: Optional. Applies a saved voice preset with specific slider settings (stability, style, speed).
Face image: Optional. A character reference used when generating scene images. Fawna attaches it as a reference when the scene is assigned to this character.

Script formatting

In multi-narrator mode, Fawna looks for Name: prefixes at the start of each line. Anything with a prefix gets assigned to that character. Anything without a prefix falls back to the Narrator.

Example Multi-speaker script

Narrator: On the edge of the pine forest, a small fire burned
low between two travellers.

Maya: We should push on. The storm is four hours out.
Jan: The horses need rest. We wait.
Maya: You will regret that.

Narrator: By dawn, the snow was waist-deep and the horses
were gone.

Every prefix must match a cast name. A misspelled prefix falls back to the Narrator, and a warning appears in the scene panel after the split. To fix it, correct either the prefix in the script or the name in the cast row so that the two match.
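
The matching rules above can be sketched in a few lines of Python. This is an illustration only, not Fawna's actual parser: the function name assign_speakers, the prefix pattern, and the return shape are all hypothetical. It also makes two assumptions the page does not state: an unprefixed line is merged into the preceding Narrator chunk when there is one, and a line with an unknown prefix keeps its full text when it falls back to the Narrator.

```python
import re

# Hypothetical prefix pattern: a short name, a colon, then the spoken text.
PREFIX = re.compile(r"^([^:]{1,40}):\s*(.*)$")

def assign_speakers(script, cast_names):
    """Return (chunks, warnings); each chunk is a (speaker, text) tuple."""
    # Case-insensitive lookup; the Narrator row is always present.
    by_key = {name.casefold(): name for name in cast_names}
    by_key.setdefault("narrator", "Narrator")
    narrator = by_key["narrator"]

    chunks, warnings = [], []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        speaker, text = narrator, line  # default: unprefixed lines go to Narrator
        match = PREFIX.match(line)
        if match:
            prefix = match.group(1).strip()
            if prefix.casefold() in by_key:
                speaker, text = by_key[prefix.casefold()], match.group(2)
            else:
                # Unknown or misspelled prefix: fall back and record a warning.
                warnings.append(f"Unknown speaker {prefix!r}; assigned to {narrator}")
        # Assumption: an unprefixed line continues the previous Narrator chunk.
        if not match and chunks and chunks[-1][0] == speaker:
            chunks[-1][1] += " " + text
        else:
            chunks.append([speaker, text])
    return [tuple(chunk) for chunk in chunks], warnings
```

Run against the example script with a cast of Maya and Jan, this sketch would produce five chunks (Narrator, Maya, Jan, Maya, Narrator) and no warnings.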

Voice providers

Fawna supports the following TTS providers, ranked by quality:

Provider	Tier	Notes
ElevenLabs	Professional	Highest quality. Full slider controls: stability, similarity boost, style, speed.
Google Cloud TTS	Free	60+ neural voices, 20+ languages. Speed control only.

Voice settings

ElevenLabs exposes the full set of knobs. Tune them per character or save them as presets.

Speed: 0.5x to 2x. 1x is natural pace.
Stability: 0 to 100. Higher means more consistent delivery. Lower introduces expressive variation (but risks drifting off-character).
Similarity boost: 0 to 100. How strictly to adhere to the base voice's identity. Higher is safer.
Style: 0 to 100. Emotional expressiveness. Zero for documentary-neutral, higher for dramatic readings.
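
If you keep voice settings in your own notes or tooling, a small check like the one below can catch values outside the ranges listed above. It is a sketch under assumptions: the names RANGES, SUPPORTED, and check_settings are hypothetical and not part of any Fawna API, and the 0.5x to 2x speed range is assumed to apply to Google Cloud TTS as well, which this page does not confirm.

```python
# Ranges as listed above; the dictionary layout and names are hypothetical.
RANGES = {
    "speed": (0.5, 2.0),
    "stability": (0, 100),
    "similarity_boost": (0, 100),
    "style": (0, 100),
}

# Which settings each provider accepts, per the provider table.
SUPPORTED = {
    "ElevenLabs": {"speed", "stability", "similarity_boost", "style"},
    "Google Cloud TTS": {"speed"},
}

def check_settings(provider, settings):
    """Return a list of human-readable problems; empty means all values fit."""
    problems = []
    for name, value in settings.items():
        if name not in RANGES:
            problems.append(f"{name}: unknown setting")
        elif name not in SUPPORTED[provider]:
            problems.append(f"{name}: not supported by {provider}")
        else:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name}: {value} is outside {low} to {high}")
    return problems
```

For example, check_settings("Google Cloud TTS", {"speed": 1.1, "style": 20}) reports that style is not supported, since that provider offers speed control only.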

Voice presets

A preset captures a voice plus its settings. Save from the voice settings panel as a named preset, and reuse it for future characters or storyboards. Presets are scoped to you (other users cannot see them).

Typical presets people save:

Documentary narrator: Mike / stability 70 / style 10 / speed 0.95.
Intimate memoir: Rachel / stability 50 / style 30 / speed 0.9.
Energetic explainer: Adam / stability 45 / style 20 / speed 1.1.
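
Written out as data, the three presets above might look like the sketch below. The VoicePreset class and find_preset helper are hypothetical, shown only to make the shape of a preset concrete (a voice plus its slider values). Similarity boost is left out because the examples above do not specify it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VoicePreset:
    """Hypothetical record: a named voice plus its slider values."""
    name: str
    voice: str
    stability: int   # 0 to 100
    style: int       # 0 to 100
    speed: float     # 0.5 to 2.0, where 1.0 is natural pace

PRESETS = [
    VoicePreset("Documentary narrator", "Mike", stability=70, style=10, speed=0.95),
    VoicePreset("Intimate memoir", "Rachel", stability=50, style=30, speed=0.9),
    VoicePreset("Energetic explainer", "Adam", stability=45, style=20, speed=1.1),
]

def find_preset(name):
    """Look a preset up by name, ignoring case; return None if absent."""
    for preset in PRESETS:
        if preset.name.casefold() == name.casefold():
            return preset
    return None
```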

Face images (character reference)

Attach a face image to a character and Fawna will use it as an image reference when generating scene images assigned to that character. This helps maintain visual identity across scenes where the character is on-screen.

The face image is applied on top of any storyboard-level reference image. It does not lock identity the way Fawna Compose's 9-ref slot does, but it biases the model toward the same person.

Strongest character consistency workflow: build the character sheet in Fawna Compose first (see Fawna Compose), then save the best frontal shot as the face image for that cast row. Scene-to-scene consistency will be noticeably tighter.

Where to go next

Editing scenes for per-scene speaker assignment after the split.
Audio Generation if you need to produce standalone voice clips outside of a storyboard.