AI August 22, 2025 2 min read

AI Techniques for Advanced Workflow Management

In this guide, we’ll break down a workflow that combines Florence2, WAN 2.1 VACE, and smart masking techniques to composite someone onto an AI-generated body inspired by The Witcher 3.

What This Workflow Does

Download workflow v1 (without arm tracking) here.

Download workflow v2 (with arm tracking) here.

Link to Reddit thread here.

Arm movement tracking.

The reference video provides the motion data (head turns, expressions, movement), which we extract and pass to Florence2.

Florence2 detects and masks the person’s head from the source video
- Florence2 is run in caption-to-phrase grounding mode with the prompt "head".
Remove Background and Refine Mask
- BiRefNet-RMBG (background remover) is applied.
- The mask is expanded and cleaned with GrowMask and MaskComposite.
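
To make the masking steps above more concrete, here is a minimal sketch in plain Python of what they amount to: turning a detected "head" box into a mask, growing it, and combining masks. It is not the implementation used by the ComfyUI nodes. The function names (`boxes_to_mask`, `grow_mask`, `composite`) are hypothetical, the result dictionary is assumed to follow the post-processed shape shown in the Florence-2 model card, the masks are simple binary grids rather than the soft mattes a background remover produces, and the 4-neighbour dilation is a simplification of whatever kernel GrowMask actually uses.

```python
from typing import Dict, List

Mask = List[List[int]]  # rows of 0/1 values; 1 marks a masked pixel

# Assumed result key for Florence-2 caption-to-phrase grounding output.
TASK = "<CAPTION_TO_PHRASE_GROUNDING>"


def boxes_to_mask(result: Dict, label: str, width: int, height: int) -> Mask:
    """Rasterize every bounding box whose label matches into a binary mask."""
    mask = [[0] * width for _ in range(height)]
    found = result[TASK]
    for (x1, y1, x2, y2), name in zip(found["bboxes"], found["labels"]):
        if name != label:
            continue
        # Clamp the box to the frame so out-of-range detections are safe.
        for y in range(max(0, int(y1)), min(height, int(y2))):
            for x in range(max(0, int(x1)), min(width, int(x2))):
                mask[y][x] = 1
    return mask


def grow_mask(mask: Mask, pixels: int) -> Mask:
    """Expand the mask by `pixels` using repeated 4-neighbour dilation."""
    height, width = len(mask), len(mask[0])
    current = [row[:] for row in mask]
    for _ in range(pixels):
        grown = [row[:] for row in current]
        for y in range(height):
            for x in range(width):
                if current[y][x]:
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            grown[ny][nx] = 1
        current = grown
    return current


def composite(a: Mask, b: Mask, operation: str) -> Mask:
    """Combine two same-sized binary masks pixel by pixel."""
    ops = {
        "and": lambda p, q: p & q,
        "or": lambda p, q: p | q,
        "subtract": lambda p, q: p & (1 - q),
    }
    op = ops[operation]
    return [[op(p, q) for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Made-up detection for a tiny 6x5 frame, purely for illustration.
result = {TASK: {"bboxes": [[2.0, 1.0, 4.0, 3.0]], "labels": ["head"]}}
head = boxes_to_mask(result, "head", width=6, height=5)
grown = grow_mask(head, 1)                  # head plus a 1-pixel margin
margin = composite(grown, head, "subtract")  # only the added margin
```

Growing the mask before compositing gives the generator a little extra room around the head, which tends to hide hard edges where the original footage meets the generated body.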

Apply your reference photo
- A still image from The Witcher 3 is loaded.
- WAN VACE encodes this as the reference style and clothing.

Combine WAN text and VACE embeds, then sit back and wait for generation
- On an RTX 4090, generation took about 40 minutes.
- On an RTX PRO 6000, it took 4 minutes.

Save and review!

Creative Applications

Fan Edits
- Place yourself into anything!
Virtual Production
- Generate quick previews of how an actor would look in full costume without expensive shoots.
Cosplay Visualization
- See your cosplay concept in motion before you build it.

Final Thoughts

Whether you’re a fan editor, digital artist, or just experimenting, this is a glimpse into the future of video generation.