Make short clips that stay consistent. With SkyReels-V3, you can animate a single reference image, extend a 5-second take into a longer moment, or drive a talking avatar with audio—while keeping identity, framing, and motion looking steady. Start with subtle actions, get a clean 5–8s result, then iterate.

| Prompt | Generated Clip |
|---|---|
| Reference: a golden retriever sitting by a front porch. Prompt: The dog stands up, looks around, tail wagging gently, morning sunlight, handheld camera feel, natural motion, clean background. | (generated clip) |
| Use 2–4 references: (1) a person portrait, (2) a cafe interior, (3) a close-up of a cup. Prompt: The person picks up the cup, steam rises, soft window light, subtle camera push-in, consistent outfit and face. | (generated clip) |
| Input: a 5s clip of a model walking in a garden. Prompt: extend to 10s, maintain dress pattern and lighting, add a gentle breeze, slow pan left, keep the same scene and pace. | (generated clip) |
| Prompt: Two people sit in a cozy café and chat naturally at a small table. 5-second clip with gentle camera variation: start in a medium-wide shot showing both people from the waist up plus the table (cups visible), then do a subtle push-in toward the main speaker for a slightly tighter framing. Soft side window light, warm interior, gentle background bokeh, subtle hand gestures and head nods, realistic mouth movement (no exaggerated lip sync), steady camera, smooth transitions (no abrupt cuts), cinematic color, shallow depth of field. | (generated clip) |

Write actions you can “see” in one take: “stands up,” “looks left then back,” “blinks once,” “smiles slightly,” “hand lifts the cup,” “steam rises,” “camera slow push-in.” If you ask for too many actions at once, motion often turns rubbery—pick 1–2 primary actions and keep everything else stable.
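If you build prompts in a script, the "1–2 primary actions" rule is easy to enforce before you generate. The sketch below is only an illustration: `compose_actions` and `MAX_PRIMARY_ACTIONS` are hypothetical names, not part of any SkyReels-V3 API, and the output is plain prompt text you would paste or send yourself.

```python
# Hypothetical helper (not a SkyReels-V3 API): keeps a prompt to 1-2 primary actions.
MAX_PRIMARY_ACTIONS = 2


def compose_actions(primary: list[str], camera: str = "camera slow push-in") -> str:
    """Join 1-2 visible actions with a single camera cue and a stability note."""
    if not 1 <= len(primary) <= MAX_PRIMARY_ACTIONS:
        raise ValueError(
            f"use 1 to {MAX_PRIMARY_ACTIONS} primary actions, got {len(primary)}"
        )
    return ", ".join(primary + [camera, "everything else stays stable"])


if __name__ == "__main__":
    print(compose_actions(["stands up", "looks left then back"]))
```

Raising an error on a third action is a deliberate choice here: it forces you to decide which motion matters most instead of letting the prompt grow.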
Starter preset (most stable): Duration 5s • Aspect 16:9 or 9:16 • Camera: locked or slow push-in • Motion: low to medium • Background: unchanged.
Extension preset: Start from a 5s clip and extend to 10s first (not 30s) • Keep lighting and style the same • Add only one new motion cue (e.g., “gentle breeze”).
If you see drift, reduce duration, lower motion, and simplify camera movement.
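The presets and the drift fix can be written down as a small settings object, which helps if you track your iterations in a notebook or script. This is a minimal sketch under stated assumptions: `ClipSettings`, `extension_target`, and `reduce_drift` are hypothetical names, the field names are not SkyReels-V3 parameters, and the 5-second step size is an assumption rather than a documented value.

```python
from dataclasses import dataclass, replace

# Hypothetical motion scale; the tool's real options may differ.
MOTION_LEVELS = ["low", "medium", "high"]


@dataclass(frozen=True)
class ClipSettings:
    """Starter preset values taken from the text; field names are illustrative."""
    duration_s: int = 5
    aspect: str = "16:9"        # or "9:16"
    camera: str = "locked"      # or "slow push-in"
    motion: str = "low"
    background: str = "unchanged"


STARTER = ClipSettings()


def extension_target(current_s: int, step_s: int = 5) -> int:
    """Extend in small steps (5s to 10s first) instead of jumping to 30s."""
    return current_s + step_s


def reduce_drift(s: ClipSettings, min_duration_s: int = 5) -> ClipSettings:
    """Apply the three drift fixes: shorter clip, lower motion, simpler camera."""
    idx = MOTION_LEVELS.index(s.motion)
    return replace(
        s,
        duration_s=max(min_duration_s, s.duration_s - 5),
        motion=MOTION_LEVELS[max(0, idx - 1)],
        camera="locked",
    )


if __name__ == "__main__":
    drifting = ClipSettings(duration_s=10, camera="slow push-in", motion="medium")
    print(reduce_drift(drifting))
```

Keeping the settings immutable means each retry produces a new record, so you can compare what changed between a drifting clip and a stable one.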
If consistency matters, say it plainly: “keep face, outfit, and background unchanged; preserve framing and colors.” For multi-reference, assign roles so the model doesn’t mix them: “Ref1 controls the person/face, Ref2 controls the room/background, Ref3 controls the cup/prop.” Then add a single line: “Do not swap roles between references.”
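For multi-reference prompts, it can help to generate the role sentences from a mapping so that no reference is left unassigned. The sketch below only assembles prompt text. `build_prompt` is a hypothetical helper, not a SkyReels-V3 function, and the `Ref1`/`Ref2`/`Ref3` labels simply follow the wording used above.

```python
# Hypothetical prompt builder (plain text only, no SkyReels-V3 API calls).
def build_prompt(action: str, roles: dict[str, str], keep: list[str]) -> str:
    """Combine one action, explicit reference roles, and consistency constraints."""
    role_text = ", ".join(f"{ref} controls the {what}" for ref, what in roles.items())
    parts = [
        action.rstrip(".") + ".",
        role_text + ".",
        "Do not swap roles between references.",
        "Keep " + ", ".join(keep) + " unchanged.",
    ]
    return " ".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        "The person picks up the cup",
        {"Ref1": "person/face", "Ref2": "room/background", "Ref3": "cup/prop"},
        ["face", "outfit", "background"],
    )
    print(prompt)
```

Each reference gets exactly one role, and the "do not swap" line is always included, so the constraint is never forgotten when you change the action.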
(1) Image-to-video: “Keep the subject’s face and outfit identical. Subtle breathing and one natural blink. Soft daylight. Gentle handheld feel. Background unchanged.”
(2) Video extension: “Extend smoothly to 10s. Preserve lighting, dress pattern, and background. Add a gentle breeze. Slow pan left. No sudden cuts.”
(3) Talking avatar: “Accurate lip sync. Natural blinking. Small head nods. Clean background. Keep facial details stable across frames.”
Pick the workflow you need: reference image-to-video or video extension. This keeps your setup simple and avoids mismatched inputs.
Upload your reference image(s) or a starter video. Then write a prompt that states the action, camera feel, lighting, and what must stay consistent (identity, outfit, background).
Set duration, aspect ratio, and FPS. Start conservative (shorter length, moderate motion), generate, then iterate by adjusting motion intensity and scene constraints.
Create stable, story-ready clips with SkyReels-V3—animate a reference, extend a take, or build a talking avatar in minutes. Start with a clean 5-second clip and scale up once it looks right.