Fawna Muse

Muse is an image-to-video specialist. It requires an input image, supports first- and last-frame keyframes, generates clips up to 15 seconds long, and has the most permissive content policy of any Fawna tier.

Tier label: Muse
Engine: WAN 2.7
Price: From 22 credits per second
Aspect ratios: 16:9, 9:16, 1:1
Resolutions: 720p, 1080p
Durations: 2, 4, 6, 8, 10, 15 seconds
Audio: Toggle
Quality tiers: Fast, Standard
Character refs: 9-grid composite (one image, up to 9 panels)
Style ref: Not supported
Keyframes: First frame required, last frame optional
Negative prompt: Supported
Magic Prompt: Supported
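
The limits in the table above can be checked before a job is submitted. The following is a minimal sketch, not a Fawna API: `MuseSettings`, `validate`, and `minimum_cost` are hypothetical names introduced for illustration. The cost figure uses the "from 22 credits per second" floor, so it is only a lower bound; rates for higher resolutions or quality tiers are not stated on this page.

```python
from dataclasses import dataclass

# Values copied from the spec table above.
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
RESOLUTIONS = {"720p", "1080p"}
DURATIONS = {2, 4, 6, 8, 10, 15}
QUALITY_TIERS = {"Fast", "Standard"}
MIN_CREDITS_PER_SECOND = 22  # "From 22 credits per second": a floor, not a fixed rate


@dataclass
class MuseSettings:
    """Hypothetical container for one Muse job; not a Fawna type."""
    aspect_ratio: str
    resolution: str
    duration: int
    quality: str
    has_first_frame: bool


def validate(settings: MuseSettings) -> list:
    """Return a list of problems; an empty list means the settings fit the table."""
    problems = []
    if not settings.has_first_frame:
        problems.append("a first-frame image is required")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {settings.aspect_ratio}")
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {settings.resolution}")
    if settings.duration not in DURATIONS:
        problems.append(f"unsupported duration: {settings.duration}s")
    if settings.quality not in QUALITY_TIERS:
        problems.append(f"unsupported quality tier: {settings.quality}")
    return problems


def minimum_cost(duration: int) -> int:
    """Lower-bound cost in credits, using the 22 credits/second floor."""
    return duration * MIN_CREDITS_PER_SECOND
```

For example, a 10-second clip costs at least 10 × 22 = 220 credits, and a 15-second clip at least 330.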

When to pick Muse

You already have an image and want to animate it.
Your content needs permissive handling: Muse has the most relaxed content policy in the lineup.
You want the longest possible clip (15s).
You have a character sheet you can compose into a 9-grid for best-in-class consistency.

The input image is required

Muse is I2V (image-to-video) only. There is no text-to-video mode. If you select Muse and send a prompt without an image, the composer shows a warning and the generation will not start. Always add the first-frame image before anything else.

For best results, the image should be clean, well-composed, and framed in the aspect ratio you want. The model closely preserves the composition of the input image.
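
One way to check framing ahead of time is to compare the image's pixel dimensions with the three supported aspect ratios. This is a sketch under stated assumptions: `closest_ratio` and `matches` are hypothetical helpers, and the 1% tolerance is an arbitrary illustrative value, not a documented Muse threshold.

```python
import math

# Supported aspect ratios from the spec table, as width / height.
SUPPORTED = {"16:9": 16 / 9, "9:16": 9 / 16, "1:1": 1.0}


def closest_ratio(width: int, height: int) -> str:
    """Name of the supported ratio nearest to the image, compared on a log scale."""
    actual = width / height
    return min(SUPPORTED, key=lambda name: abs(math.log(actual / SUPPORTED[name])))


def matches(width: int, height: int, target: str, tolerance: float = 0.01) -> bool:
    """True if the image is within `tolerance` (relative) of the target ratio.

    The default tolerance is an assumption for illustration only.
    """
    expected = SUPPORTED[target]
    return abs(width / height - expected) / expected <= tolerance
```

A 1920×1080 image matches 16:9 here, while a 1600×1200 image does not and would need cropping before upload.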

The 9-grid reference

Muse's character consistency is driven by a single composite reference image with up to 9 panels arranged in a 3x3 grid. Each panel shows the same subject from a different angle or pose. This gives the model dense information about who the character is, without needing 9 separate ref uploads.

You can build a 9-grid manually in Compose, or use the Studio Library's "Build 9-grid" action on a character you have already generated. Attach the grid to the first-frame slot, and Muse treats it as both the identity anchor and the scene input.
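
If you assemble the grid yourself, the 3x3 layout reduces to a small coordinate calculation. The sketch below is illustrative only: `grid_boxes` is a hypothetical helper, and the square panels and row-major fill order are assumptions, since this page does not specify panel size or ordering.

```python
def grid_boxes(panel_count: int, panel_size: int) -> list:
    """Return (left, top, right, bottom) paste boxes for up to 9 panels.

    Panels fill a 3x3 grid left to right, top to bottom (assumed order).
    The full canvas is 3 * panel_size pixels on each side.
    """
    if not 1 <= panel_count <= 9:
        raise ValueError("a 9-grid holds between 1 and 9 panels")
    boxes = []
    for index in range(panel_count):
        row, col = divmod(index, 3)
        left = col * panel_size
        top = row * panel_size
        boxes.append((left, top, left + panel_size, top + panel_size))
    return boxes
```

Each box can be passed to the paste or draw call of whichever image library you use; with 256-pixel panels the canvas is 768×768.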

Strengths

Longest clips in the lineup (15s).
Most permissive content handling.
Dense 9-grid character lock is excellent for consistency-critical shots.
Smooth frame interpolation when both first and last frames are set.

Where it struggles

No text-to-video mode. You must supply an image.
No style ref slot. Style has to come from the first-frame image itself.
Audio is synthesized; it is not natively synced the way it is in Cinema or the Film family.

Prompt recipe

Template I2V prompt given a loaded first frame

[What happens in the clip: subject action, change, and motion].
[Single camera move]. [Any new lighting or mood cues].

Muse is already looking at the first frame, so you do not need to redescribe the scene in detail. Focus on what changes during the clip.
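
The template can also be filled in programmatically when you generate many prompts. This is a minimal sketch, not a Fawna feature: `build_prompt` is a hypothetical helper that joins the three template slots into one prompt and leaves out the optional lighting or mood slot when it is empty.

```python
def _sentence(text: str) -> str:
    """Trim whitespace and make sure the fragment ends with punctuation."""
    text = text.strip()
    return text if text.endswith((".", "!", "?")) else text + "."


def build_prompt(action: str, camera_move: str, mood: str = "") -> str:
    """Join the template slots: what happens, one camera move, optional mood cue."""
    parts = [action, camera_move, mood]
    return " ".join(_sentence(part) for part in parts if part.strip())
```

Keeping the camera move as a single argument mirrors the template's "single camera move" slot and discourages stacking several moves in one clip.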

Example

Example 10-second clip with first-frame 9-grid

Maya stirs her tea slowly, then lifts the cup to her lips and
smiles at something off-camera. Soft handheld drift forward.
Natural golden-hour warmth deepens as a cloud passes.

Tips

If your clip feels still, bump the duration. Muse uses the extra frames well and avoids the "loop and stop" artifact that other models can produce at long durations.
Describe change, not the scene. The scene is in the image.
Use Magic Prompt for quick briefs. It will expand a short action into a full prompt.
Build your 9-grid once. Reuse it in every character scene. Consistency compounds.

Where to go next

Using references: details on the 9-grid input.
Fawna Compose: build the first frame and character grid.
