You Only Need Audio To Deepfake Now! Might look slightly cursed tho [PCAVS]
Skills:
Generative CV 90%
Key Takeaways
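This video introduces PC-AVS, a model that lip-syncs a target face to an audio clip and can also copy head movements from a separate pose video. It takes three inputs: an audio clip, a target face, and a pose source.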
Full Transcript
Deepfakes have come a pretty long way. It started with stitching one person's face onto another person's face, then came animating a still face with just a video, and now you can simply use audio to create basic lip sync. I have a pretty horrific video about the lip-sync AI Wav2Lip, but I'd thank you not to check it out because it's so cringe.

To make the above evolution chart easier to understand, the Baka Mitai meme sits around the middle. It was hilarious in its own way because of how badly the faces were animated. But what if it's decently animated? I'll probably ruin the fun, but I'll show you later in the video.

So, in this episode of beating a dead meme, let me introduce you to PC-AVS, short for Pose-Controllable Audio-Visual System. It's like First Order Motion Model and Wav2Lip had a baby. Not only can it lip-sync any video with just audio ("Listen, I will be more ready than I was in 2012, because I will have done my job."), it can also copy the head movements from another video.

The flexibility of this AI is just incredible, and that's because the model is built upon three main parts.

First is the input identity, the part that focuses on generating and manipulating the face you choose as an input. It controls how the face looks and how consistent it stays when animated and moved around. To put it in perspective, FOM has worse identity consistency when the face moves too much, which is what caused the faces to deform in the meme, while PC-AVS generates new facial details from the limited information in the input. It's a pretty good solution, but the regenerated faces still have a consistency problem.

The second part is the information about the input pose you want to use as a reference. Whether the face looks up or down, this part makes sure to transfer the head movements, aka the pose, onto the input identity.

The last part takes the audio spectrogram and synchronizes it with the visual features, which are the lips. Combining all three parts, a much more consistent and flexible face animation is born.

Matching the three parts, three files are required as input: an audio clip, a target face, and a pose. You can use it the intended way, like in the official demo, or you could turn something into a speaking image by making the target face and the pose the same thing, or add just slight movements to make the speech look a bit more natural. This offers a lot more possibilities and functionality than Wav2Lip, which can only make the lip movements sync with the audio, or FOM, which can only move the facial features around, which may make it look slightly awkward.

The only functionality-wise downside is that currently you cannot use a video without audio to generate a talking face, the way FOM can animate other faces with just a driving video. Since audio is the basis of this AI, the lip movements are based on the audio spectrograms, so you need clear speech audio to make the lips move as intended. This means I will need to manually sing Baka Mitai myself if I want to reproduce the meme.

This is some really fun stuff, even though the output quality is pretty low. Unfortunately, we won't be able to increase it for now, as it depends on the identity generation part, so basically the resolution is stuck here because of the architecture of the AI model.

If you want to try it yourself, I'll link my tutorial down in the description. It's a pretty messy setup, as the original code was all meant for Linux, but if you are patient enough you can give it a try. Or you can just work with today's sponsor, 27 Stars, to set up and run these AIs for you or for your business. 27 Stars is a London-based development company that creates custom-tailored web and mobile applications for individuals or businesses of all sizes. They are really experienced, nice, and friendly people to work with, and they are also providing an exclusive 10% discount for all of you guys if you choose to work with them. All you have to do is include my name in the initial email to receive the discount. By working with them you are also indirectly supporting me, which allows me to dedicate more of my time to these fun videos.

Thank you so much for watching. Also, a big shout-out to Emdy and the many other patrons who support my work through Patreon. You can share your generated results over on my Discord channel, or if you have any questions, feel free to put them there. Follow my Twitter if you haven't, and I'll see you in the next one.
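
The transcript says the lip movements are driven by the audio spectrogram, which is why clear speech matters. As a rough illustration of what a spectrogram is, here is a minimal standard-library Python sketch. It is not PC-AVS's actual preprocessing, which the video does not describe. The function name `spectrogram`, the frame and hop sizes, the sample rate, and the 440 Hz test tone are all assumptions chosen for the example.

```python
import cmath
import math


def spectrogram(samples, frame_len=200, hop=100):
    """Magnitude spectrogram via a plain short-time DFT.

    Returns a list of time frames; each frame is a list of magnitudes
    for frequency bins 0 .. frame_len // 2.
    """
    # A Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    # Precompute the DFT twiddle factors once and index them with (k * n) % frame_len.
    twiddle = [cmath.exp(-2j * math.pi * m / frame_len) for m in range(frame_len)]

    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            abs(sum(frame[n] * twiddle[(k * n) % frame_len] for n in range(frame_len)))
            for k in range(n_bins)
        ])
    return frames


# Hypothetical test signal: 0.1 s of a 440 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 200
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]

spec = spectrogram(tone, frame_len=FRAME_LEN, hop=100)

# Average each frequency bin over time and find the strongest one.
avg = [sum(frame[k] for frame in spec) / len(spec) for k in range(len(spec[0]))]
peak_bin = max(range(len(avg)), key=avg.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN

print(len(spec), "frames x", len(spec[0]), "bins, peak near", peak_hz, "Hz")
```

Each row of the result describes which frequencies are present in one short slice of the audio. A model that syncs lips to audio has to pick speech sounds out of rows like these, so noisy or muffled input gives it less to work with. In practice an audio library would compute this far faster than the pure-Python loop above.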
Original Description
Today's Sponsor is 27 Stars,
Develop your own AI software right now!
Check it out here: https://27stars.co.uk/bycloud
Include "bycloud" in the initial email to receive a 10% discount for all your purchases.
You Only Need Audio To Deepfake, but that doesn't mean it'll look good 👀 hahaha-
Anyways, high hopes for this though. They only need to mostly improve the identity regeneration. Otherwise, this technique looks very promising!!
PC-AVS
[Paper] https://arxiv.org/abs/2104.11116
[Official GitHub] https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS
[Tutorial GitHub] https://github.com/bycloudai/PCAVS-Windows
[Installation Tutorial] https://youtu.be/4O3EqIiEzKQ
Wav2lip
[My video] https://youtu.be/dQw4w9WgXcQ
[GitHub] https://github.com/Rudrabha/Wav2Lip
[Paper] http://arxiv.org/abs/2008.10010
First Order Motion Model
[My video] https://youtu.be/B_qWUVi52yY
[GitHub] https://github.com/AliaksandrSiarohin/first-order-model
[Paper] https://arxiv.org/pdf/2104.11280.pdf
This video is supported by the kind Patrons:
🙏Emdy, Mazen Alotaibi, Sascha Henrichs, Jake Disco, Demilson Quintao, Martin Schmitt, Zeldus Zumbain
Support me on Patreon if you hope to see more:
https://www.patreon.com/bycloud
or by becoming a member instead (same perks!):
https://www.youtube.com/channel/UCgfe2ooZD3VJPB6aJAnuQng/join
video credits:
Ctrl Shift Face
[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Music] Steaminwaffles - The Walk Home
[Profile Art] https://twitter.com/pygm7