You Only Need Audio To Deepfake Now! Might look slightly cursed tho [PCAVS]

bycloud · Beginner · 📄 Research Papers Explained · 5y ago

Key Takeaways

This video explains PC-AVS, a talking-face model that lip-syncs a target face to an audio clip and can also copy head movements from a separate pose video.

Full Transcript

Deepfakes have come a pretty long way. It started with stitching one person's face onto another person's, then came animating a still face with just a video, and now you can simply use audio to create basic lip sync. I have a pretty horrific video about the lip-sync AI Wav2Lip, but I'd thank you not to check it out because it's so cringe. To make the above evolution chart easier to understand, the Baka Mitai meme sits around the middle. It was hilarious in its own way because of how badly the faces were animated, but what if it's decently animated? I'll probably ruin the fun, but I'll show you later in the video.

So in this episode of beating a dead meme, let me introduce you to PC-AVS, short for Pose-Controllable Audio-Visual System. It's like First Order Motion Model and Wav2Lip had a baby. Not only can it lip sync any video with just audio (demo clip: "Listen, I will be more ready than I was in 2012, because I will have done my job."), it can also copy the head movements from another video. The flexibility of this AI is just incredible, and that is because the model is built upon three main parts.

First is the input identity, the part that focuses on generating and manipulating the face you choose as an input. It controls what the face looks like and how consistent it stays when animated and moved around. To put it in perspective, FOM has worse identity consistency when the face moves too much, which is what caused the faces to deform in the meme, while PC-AVS generates new facial details from the limited information in the input. It's a pretty good solution, even though regenerating the face brings a consistency problem of its own.

The second part is the information about the input pose you want to use as a reference. Whether the face looks up or down, this part makes sure to transfer the head movements, aka the pose, onto the input identity.

The last part uses the audio spectrogram and synchronizes it with the visual features, which are the lips. Combining all three of these parts, a much more consistent and flexible face animation is born.

Matching the three parts, three files are required as input: an audio clip, a target face, and a pose. You can use it the intended way, like in the official demo, or you can turn something into a speaking image by making the target face and the pose the same thing, or add just slight movements to make the speech look a bit more natural. This offers a lot more possibilities than Wav2Lip, which can only sync the lip movements with the audio, or FOM, which can only move the facial features around in a way that may look slightly awkward.

The only functionality-wise downside is that currently you cannot use a video without audio to generate a talking face, the way FOM can animate other faces with just its driving video. Audio is the basis of this AI: the lip movements are based on the audio spectrograms, so you need clear speaking audio to make the lips move as intended. This means I will need to sing Baka Mitai myself if I want to reproduce the meme. This is some really fun stuff, even though the output quality is pretty low. Unfortunately we won't be able to increase it for now, as it depends on the identity generation part, so basically the resolution is stuck here because of the architecture of the AI model.

If you want to try it yourself, I'll link my tutorial down in the description. It's a pretty messy setup, as the original code was all meant for Linux, but if you are patient enough you can give it a try. Or you can just work with today's sponsor, 27 Stars, to set up and run these AIs for you or for your business. 27 Stars is a London-based development company that creates custom-tailored web and mobile applications for individuals and businesses of all sizes. They are really experienced, nice, and friendly people to work with, and they are also providing an exclusive 10% discount for all of you guys if you choose to work with them. All you have to do is include my name in the initial email to receive the discount. By working with them you are also indirectly supporting me, which allows me to dedicate more of my time to these fun videos.

Thank you so much for watching. Also, a big shout-out to Emdy and the many other patrons who support my work through Patreon. You can share your generated results over on my Discord channel, and if you have any questions, feel free to put them there. Follow my Twitter if you haven't, and I'll see you in the next one.
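
Editor's sketch: the transcript says the lip movements are driven by an audio spectrogram, which is why clear speech matters. The snippet below is a minimal, standard-library illustration of what a spectrogram is (a short-time Fourier transform of windowed audio frames). It is not PC-AVS's actual pre-processing, which the video does not describe; the function name `spectrogram`, the frame size, the hop length, and the synthetic 1 kHz tone are all assumptions chosen for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per audio frame.

    Hypothetical helper for illustration only, not taken from PC-AVS.
    """
    # Hann window: tapers each frame so its edges do not smear energy
    # across neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Naive DFT over the non-negative frequency bins of this frame.
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic stand-in for speech: a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate)
           for t in range(512)]

spec = spectrogram(samples)

# The strongest bin of the first frame should sit at the tone's frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), "frames,", len(spec[0]), "bins, peak at", peak_hz, "Hz")
```

Each row of `spec` describes which frequencies are present during one short slice of time. A model that syncs lips to audio consumes a sequence like this rather than raw samples, so noisy or non-speech audio gives it less usable structure to work with.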

Original Description

Today's Sponsor is 27 Stars, Develop your own AI software right now! Check it out here: https://27stars.co.uk/bycloud
Include "bycloud" in the initial email to receive a 10% discount for all your purchases.

You Only Need Audio To Deepfake, but that doesn't mean it'll look good 👀 hahaha- Anyways, high hopes for this though. They only need to mostly improve the identity regeneration. Otherwise, this technique looks very promising!!

PC-AVS
[Paper] https://arxiv.org/abs/2104.11116
[Official GitHub] https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS
[Tutorial GitHub] https://github.com/bycloudai/PCAVS-Windows
[Installation Tutorial] https://youtu.be/4O3EqIiEzKQ

Wav2lip
[My video] https://youtu.be/dQw4w9WgXcQ
[GitHub] https://github.com/Rudrabha/Wav2Lip
[Paper] http://arxiv.org/abs/2008.10010

First Order Motion Model
[My video] https://youtu.be/B_qWUVi52yY
[GitHub] https://github.com/AliaksandrSiarohin/first-order-model
[Paper] https://arxiv.org/pdf/2104.11280.pdf

This video is supported by the kind Patrons: 🙏Emdy, Mazen Alotaibi, Sascha Henrichs, Jake Disco, Demilson Quintao, Martin Schmitt, Zeldus Zumbain

Support me on Patreon if you hope to see more: https://www.patreon.com/bycloud
or by becoming a member instead (same perks!): https://www.youtube.com/channel/UCgfe2ooZD3VJPB6aJAnuQng/join

video credits: Ctrl Shift Face

[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Music] Steaminwaffles - The Walk Home
[Profile Art] https://twitter.com/pygm7

