Using references

References are images you attach alongside your prompt to tell the model what a character, object, or style should look like. They are the single strongest lever for consistency, and the most common reason one generation looks better than another.

Two kinds of reference

References fall into two buckets, character refs and a style ref, and not every model accepts both. The composer shows the buckets as labeled drop zones above the prompt input.

Character refs

Photos of a specific person, creature, or object. The model locks the subject's identity (face, build, clothing, distinguishing marks) across the generation. Use them for consistency across shots.

Style ref

A single plate that defines the look: color palette, lighting quality, film texture, illustration medium. The subject of the style plate is ignored; only its visual language is used.

How many each model accepts

Model	Character refs	Style ref
Cinema	Up to 9	Yes, 1 plate
Motion	Up to 7	Yes, 1 plate
Muse	9-grid input (see note)	No
Drama	Yes (first-frame image)	No
Film family	Yes (first-frame image)	No
Audio	Yes (first-frame image)	No
Illustrate	No (text-only)	No
Compose	Up to 9	Yes, 1 plate
Spark	No (text-to-video)	No

Muse takes a 9-grid reference image: a single composite with up to 9 panels showing the same subject from different angles. The model packs all of them into one input slot. You do not need to upload nine separate files.
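
If you prepare references in a script before uploading, the limits in the table can be encoded as a small lookup and checked up front. This is a minimal sketch, not part of any official SDK: the names REF_LIMITS and check_refs are hypothetical, and counting a first-frame image or a Muse 9-grid composite as a single image is an assumption made for illustration. Only the numbers come from the table above.

```python
# Hypothetical pre-flight check. Names are illustrative; limits come from the table.
# Each entry maps a model to (max character ref images, max style plates).
REF_LIMITS = {
    "Cinema": (9, 1),
    "Motion": (7, 1),
    "Muse": (1, 0),         # assumption: the 9-grid composite counts as one image
    "Drama": (1, 0),        # assumption: a first-frame image counts as one image
    "Film family": (1, 0),  # same assumption as Drama
    "Audio": (1, 0),        # same assumption as Drama
    "Illustrate": (0, 0),   # text-only
    "Compose": (9, 1),
    "Spark": (0, 0),        # text-to-video
}


def check_refs(model: str, character_refs: int, style_refs: int = 0) -> list[str]:
    """Return a list of problems; an empty list means the counts fit the model."""
    if model not in REF_LIMITS:
        return [f"unknown model: {model}"]
    max_char, max_style = REF_LIMITS[model]
    problems = []
    if character_refs > max_char:
        problems.append(
            f"{model} accepts at most {max_char} character ref(s), got {character_refs}"
        )
    if style_refs > max_style:
        problems.append(
            f"{model} accepts at most {max_style} style plate(s), got {style_refs}"
        )
    return problems
```

For example, check_refs("Motion", 8, 1) reports one problem (eight character refs against a limit of seven), while check_refs("Cinema", 9, 1) returns an empty list.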

Picking good reference images

Clear face, neutral expression if you are locking a person. Ambiguous expressions confuse the identity lock.
Even lighting. Heavy shadow on half the face or a tinted filter bleeds into the output.
Uncluttered background. The model sometimes picks up pieces of the ref background as part of the identity (a specific wall color, a logo).
One subject per ref. Two people in a single character ref split the lock between them.
Match aspect ratio loosely. A vertical ref for a vertical output. A landscape ref for a landscape output. Mismatches still work, but the model crops and sometimes trims features.
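
The loose aspect match in the last tip can be checked from pixel dimensions alone. The sketch below is illustrative: the function names are hypothetical, and treating square images as their own orientation is an assumption, since the tip only mentions vertical and landscape.

```python
def orientation(width: int, height: int) -> str:
    """Classify an image as 'vertical', 'landscape', or 'square' from its pixel size."""
    if width == height:
        return "square"  # assumption: squares are treated as a separate orientation
    return "landscape" if width > height else "vertical"


def aspect_matches(ref_size: tuple[int, int], output_size: tuple[int, int]) -> bool:
    """True when ref and output share an orientation (a loose match, not equal ratios)."""
    return orientation(*ref_size) == orientation(*output_size)
```

A 1080x1920 ref passes for a 720x1280 output because both are vertical, even though the pixel sizes differ. A 1920x1080 ref against a 1080x1920 output fails, which is the case where cropping is likely.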

Multiple character refs

When you attach several refs for the same person, the model composites the identity across them. Upload a front-on shot, a three-quarter view, and a profile, and the model has enough angular coverage to synthesize novel angles faithfully. One-view locks sometimes drift when the subject turns.

For two different characters in the same shot, attach refs for both and refer to each one in the prompt with its own descriptive noun phrase ("the woman in the mustard cardigan", "the bearded man in the yellow oilskin").

Using a style plate

A style plate is a single reference that defines the look of the output without carrying over its subject. Attach a film still, a painting, or a photo with the exact palette and lighting you want, then describe your own subject in the prompt. The model transfers the aesthetic and applies it to the new subject.

Rule of thumb: if you can write a one-sentence caption for the look of the plate ("warm tungsten interior, deep shadows, 16mm grain"), it will work as a style ref. If the plate's subject is too distinctive, it can bleed into the output.

Common problems

The model ignores my reference. Usually the prompt overrides it: you asked for "a blonde surfer" but your reference is dark-haired. The model obeys the prompt first. Remove the conflicting adjective or update the ref.
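
One way to catch this before generating is to compare the prompt against your own notes about the reference. The sketch below assumes you keep such notes yourself as attribute/value pairs. The ATTRIBUTE_WORDS vocabulary and the find_conflicts function are hypothetical and are not a product feature.

```python
import re

# Hypothetical vocabulary: attribute -> words that describe it in a prompt.
ATTRIBUTE_WORDS = {
    "hair": {"blonde", "dark-haired", "red-haired", "grey-haired"},
}


def find_conflicts(prompt: str, ref_attributes: dict[str, str]) -> list[str]:
    """Return prompt words that describe an attribute differently from the ref notes."""
    # Lowercase the prompt and split it into words, keeping hyphenated terms whole.
    words = set(re.findall(r"[a-z][a-z-]*", prompt.lower()))
    conflicts = []
    for attribute, ref_value in ref_attributes.items():
        known = ATTRIBUTE_WORDS.get(attribute, set())
        for word in sorted(known & words):
            if word != ref_value:
                conflicts.append(word)
    return conflicts
```

With the example above, find_conflicts("a blonde surfer at dawn", {"hair": "dark-haired"}) flags "blonde", which is the adjective to remove or to resolve by swapping the ref.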

The face is close but not right. Add a second and third character ref from different angles. One-view locks regress on head turns.

The style ref bled into the subject. Use a less distinctive plate, or describe your subject with stronger concrete nouns so the model has more signal to anchor to.

Where to go next

Character consistency for the full playbook on locking a face across many shots.
Fawna Compose for the image-gen model built specifically for reference-driven work.
