Wan 2.7 Video: How to Use Instruction Editing & First/Last Frame Control
Step-by-step guide to Wan 2.7's new video capabilities: instruction-based editing, first/last frame control, 9-grid input, and subject+voice cloning.
Wan AI Team
Wan AI

Wan 2.7 Video launched in late March 2026 with several features that don't exist in any competitor. Here's how to actually use them.
Instruction-Based Video Editing
Upload an existing video, then describe what you want changed in natural language. Examples that work well: 'Change the background to a night scene with city lights' or 'Make the character's jacket red instead of blue' or 'Add rain falling in the scene'. The model modifies only what you specify while keeping everything else intact. Think of it like Photoshop's generative fill, but for video.
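The article doesn't document Wan 2.7's API, so here is a hypothetical sketch of how an edit request might be packaged. The task name and field names are invented for illustration; only the input shape (a source clip plus a plain-language instruction) comes from the description above.

```python
# Hypothetical sketch: Wan 2.7's editing API is not documented here,
# so "video_edit" and the field names below are invented for illustration.
import json

def build_edit_request(video_path: str, instruction: str) -> dict:
    """Package a natural-language edit instruction for an existing video."""
    if not instruction.strip():
        raise ValueError("Instruction must be non-empty and specific")
    return {
        "task": "video_edit",        # hypothetical task name
        "video": video_path,         # source clip to modify
        "instruction": instruction,  # what to change, in plain language
    }

payload = build_edit_request(
    "clip.mp4",
    "Change the background to a night scene with city lights",
)
print(json.dumps(payload, indent=2))
```

The instruction string is the whole interface: everything you don't mention is left intact, so the request carries no mask or region data.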
First/Last Frame Control
This is one of Wan 2.7's killer features. Provide two images: one for the video's opening frame, one for its closing frame. The model generates all motion in between with consistent subject identity. Use cases: product reveals (box → product), time transitions (day → night), emotional arcs (sad → happy expression). The key to good results: make sure both frames share the same subject/setting, but differ in one clear dimension.
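As a sketch of the input shape (field names invented, since the real API isn't documented here): two keyframes plus an optional motion hint, with a guard for the most common mistake, passing the same image twice.

```python
# Hypothetical sketch: field names are invented; the point is the input shape --
# two keyframes plus a prompt, with the model filling in the motion between them.
def build_keyframe_request(first_frame: str, last_frame: str, prompt: str = "") -> dict:
    if first_frame == last_frame:
        raise ValueError("First and last frames should differ in one clear dimension")
    return {
        "task": "first_last_frame",  # hypothetical task name
        "first_frame": first_frame,  # opening image
        "last_frame": last_frame,    # closing image
        "prompt": prompt,            # optional hint for the in-between motion
    }

req = build_keyframe_request("box_closed.jpg", "product_revealed.jpg",
                             "slow push-in as the box opens")
```

A product-reveal pair like this works because both frames share the same subject and setting, while the single changed dimension (closed box → revealed product) gives the model a clear arc to animate.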
9-Grid Image-to-Video
Arrange 9 reference images in a 3×3 grid. The model reads them left to right, top to bottom, and generates a video that transitions through all referenced scenes. This gives you precise multi-scene control without needing to edit multiple clips together. Best for: storyboard-to-video workflows, multi-angle product showcases, sequential action scenes.
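The reading order above can be sketched as simple index math, which is handy when assembling the grid image yourself. The helper name and the uniform-cell-size assumption are mine; the left-to-right, top-to-bottom layout is from the description.

```python
# Minimal sketch of the 3x3 reading order: image i goes to row i // 3, column i % 3,
# so scenes are consumed left to right, top to bottom. Pixel offsets assume
# uniformly sized reference images (an assumption, not a documented requirement).
def grid_position(index: int, cell_w: int, cell_h: int) -> tuple:
    """Return (row, col, x_offset, y_offset) for image `index` in a 3x3 grid."""
    if not 0 <= index < 9:
        raise ValueError("9-grid input takes exactly 9 images (indices 0-8)")
    row, col = divmod(index, 3)
    return row, col, col * cell_w, row * cell_h

# Scene 5 (index 4) sits at the center of the grid:
print(grid_position(4, 512, 512))  # (1, 1, 512, 512)
```

Because the model reads the grid in this fixed order, the index you place a scene at directly controls where it lands in the video's timeline.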
Subject + Voice Cloning
Upload a reference photo of your character plus an audio clip of their voice. The model generates video where both the visual appearance and vocal characteristics match your references. This is a game-changer for consistent character content across multiple videos.
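A hypothetical sketch of how the two reference files might be packaged. Base64-embedding media is a common pattern for multimodal APIs, but the task and field names here are invented, not Wan's documented interface.

```python
import base64

# Hypothetical sketch: how reference media might be packaged for the request.
# "subject_voice_clone" and the field names are invented for illustration.
def build_character_request(photo_bytes: bytes, voice_bytes: bytes, script: str) -> dict:
    return {
        "task": "subject_voice_clone",  # hypothetical task name
        "reference_image": base64.b64encode(photo_bytes).decode("ascii"),
        "reference_audio": base64.b64encode(voice_bytes).decode("ascii"),
        "script": script,  # what the cloned character should say
    }

req = build_character_request(b"\x89PNG...", b"RIFF...",
                              "Welcome back to the channel!")
```

Once the photo and voice clip are captured in a reusable payload like this, the same pair of references can anchor every video featuring that character.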
Video Recreation
Provide a reference video and describe how you want it changed. The model preserves the original motion structure and camera movements while rebuilding the visual layer. 'Same movement but in anime style' or 'Same camera path but in a forest instead of a city' — that kind of thing.
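Sketched as a request shape (names invented), the split is easy to see: the reference video carries what's preserved, the prompt carries what's rebuilt.

```python
# Hypothetical sketch separating what recreation keeps from what it rebuilds.
# "video_recreation" and the field names are invented; the preserved/rebuilt
# split mirrors the description above.
def build_recreation_request(reference_video: str, style_prompt: str) -> dict:
    return {
        "task": "video_recreation",          # hypothetical task name
        "reference_video": reference_video,  # motion structure + camera path are kept
        "prompt": style_prompt,              # visual layer is rebuilt from this
    }

req = build_recreation_request("city_walk.mp4",
                               "same camera path but in a forest instead of a city")
```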
Native Audio
Unlike previous versions, where audio had to be added in post-production, Wan 2.7 generates audio alongside the video. Background music, ambient sounds, and speech are synchronized from the start.
Tips for Best Results
Be specific in your instructions: 'slightly warmer lighting' works better than 'make it look nice'. For first/last frame generation, keep the difference focused: one major change per generation. For 9-grid input, maintain consistent lighting across your reference images.
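The specificity tip can even be checked before you submit. This is a toy heuristic of my own, not anything Wan provides, but it catches the 'make it look nice' class of prompt:

```python
# Toy heuristic (my own, not from Wan): flag vague edit instructions before
# submitting. Very short prompts or filler adjectives usually mean a rewrite.
VAGUE_WORDS = {"nice", "better", "good", "cool", "awesome"}

def looks_vague(instruction: str) -> bool:
    words = {w.strip(".,!").lower() for w in instruction.split()}
    return bool(words & VAGUE_WORDS) or len(words) < 3

print(looks_vague("make it look nice"))         # True
print(looks_vague("slightly warmer lighting"))  # False
```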
The model is available on Alibaba Cloud now; an open-source release is expected in Q2 2026.


