Fine-Tuning Qwen-Image-Edit and Using Wan 2.2 to Generate Multiple Actors
Links + Notes: https://www.oxen.ai/blog
Join Fine-Tune Fridays: https://oxen.ai/community
Discord: https://discord.com/invite/s3tBEn7Ptg
Use Oxen AI: https://oxen.ai/
Oxen.ai offers one-click fine-tuning, or will fine-tune models for you! Built on top of the world's best data versioning tool, we offer tools to automate model evals, generate synthetic data, and effortlessly fine-tune models.
--
Chapters
0:00 The Task: Generating Conan O'Brien interviewing Will Smith
2:04 Base Model Results and Early Fine-Tunes
3:13 The Problem: Video models aren't good at multi-person generations
6:50 Can we just prompt Nano Banana instead of fine-tuning?
9:43 Why fine-tune?
11:17 What could a higher-quality production pipeline look like?
14:50 Step 1: Masking
16:04 Enter DINOv3
21:28 Fine-tuning Qwen-Image-Edit to fill in masked images
26:12 Implementing our Wan 2.2 ComfyUI Workflow
28:13 Questions
31:40 Tweaking our ComfyUI flow
36:05 Moment of truth! Final generation
36:54 Question
38:15 Implementing our Qwen-Image-Edit LoRA in ComfyUI
43:24 Conclusion
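The pipeline the chapters walk through starts by masking out each actor (here, a segmentation model like DINOv3 produces the mask) and then asking a fine-tuned Qwen-Image-Edit model to fill the masked region back in. A minimal sketch of that first masking step, assuming images are numpy arrays; the `apply_mask` helper and the gray fill value are illustrative, not the workflow's actual code:

```python
import numpy as np

def apply_mask(image: np.ndarray, mask: np.ndarray, fill_value: int = 127) -> np.ndarray:
    """Gray out the masked region of an H x W x 3 image.

    `mask` is an H x W boolean array (True = region to regenerate),
    e.g. a person segmentation from a model like DINOv3. The result is
    the kind of input an inpainting/edit model is asked to fill in.
    """
    masked = image.copy()
    masked[mask] = fill_value  # boolean indexing broadcasts across RGB channels
    return masked

# Toy example: a 4x4 RGB image with the top-left 2x2 block masked out.
img = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
out = apply_mask(img, mask)
```

The masked image and the binary mask are then passed together to the edit model, so only the grayed-out region is regenerated while the rest of the frame stays pixel-identical.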