You Only Need Audio To Deepfake Now! Might look slightly cursed tho [PCAVS]

bycloud · Beginner ·📄 Research Papers Explained ·5y ago

Skills: Generative CV 90%

Key Takeaways

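This video explains how PC-AVS (Pose-Controllable Audio-Visual System) generates talking-face deepfakes from three inputs (an audio clip, a target face, and a pose source), and how it compares with Wav2Lip and the First Order Motion Model.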

Full Transcript

Deepfakes have come a pretty long way. It started with stitching people's faces onto another person's face, then came animating a still face with just a video, and now you can simply use audio to create basic lip sync. I have a pretty horrific video about the lip-sync AI Wav2Lip, but I'd thank you not to check it out because it's so cringe. To make the above evolution chart easier to understand, the Baka Mitai meme sits around the middle. It was hilarious in its own way because of how badly the faces were animated, but what if it's decently animated? I'll probably ruin the fun, but I'll show you later in the video.

So in this episode of beating a dead meme, let me introduce you to PC-AVS, short for Pose-Controllable Audio-Visual System. It's like First Order Motion Model and Wav2Lip had a baby. Not only can it lip-sync any video with just audio [demo clip: "Listen, I will be more ready than I was in 2012 because I will have done my job."], it can also copy the head movements from another video. The flexibility of this AI is just incredible, and that's because the model is built upon three main parts.

First is the input identity, the part that focuses on generating and manipulating the face you choose as an input. This part controls how the face looks and how consistent it stays when it is animated and moved around. To put it in perspective, FOM has worse identity consistency when the face moves too much, which is what caused the face to deform in the meme. PC-AVS instead generates new facial details from the limited information in the input, so there is still a consistency problem when regenerating the face, even though it's a pretty good solution.

The second part is the information about the input pose you want to use as a reference. Whether the face looks up or down, this part makes sure to transfer the head movements, aka the pose, onto the input identity.

The last part uses the audio spectrogram and synchronizes it with the visual features, which are the lips. Combining all three of these parts, a much more consistent and flexible face animation is born.

Matching the three parts, three files are required as input: an audio clip, a target face, and a pose. You can use it in the intended way, like in their official demo, or you can turn something into a speaking image by making the target face and the pose the same thing, or add slight movements to make the speech look a bit more natural. This offers a lot of possibilities and functionality, unlike Wav2Lip, which can only sync the lip movements with the audio, or FOM, which can only move the facial features around and may look slightly awkward.

The only functionality-wise downside is that you currently cannot use a video without audio to generate a talking face, the way FOM can animate other faces with just its driving video. Audio is the basis of this AI: the lip movements are based on the audio spectrograms, so you need clear speaking audio to make the lips move as intended. This means I will need to sing Baka Mitai myself if I want to reproduce the meme.

This is some really fun stuff, even though the output quality is pretty low. Unfortunately we won't be able to increase it for now, as it depends on the identity generation part, so the resolution is stuck here because of the architecture of the AI model.

If you want to try it yourself, I'll link my tutorial down in the description. It's a pretty messy setup, as the original code was meant for Linux, but if you are patient enough you can give it a try. Or you can just work with today's sponsor, 27 Stars, to set up and run these AIs for you or for your business. 27 Stars is a London-based development company that creates custom-tailored web and mobile applications for individuals or businesses of all sizes. They are really experienced, nice, and friendly people to work with, and they are also providing an exclusive 10% discount for all of you guys if you choose to work with them. All you have to do is include my name in the initial email to receive the discount. By working with them you are also indirectly supporting me, which allows me to dedicate more of my time to these fun videos.

Thank you so much for watching. Also, a big shout-out to Emdy and many other patrons who support my work through Patreon. You can share your generated results over on my Discord channel, or if you have any questions, feel free to put them there. Follow my Twitter if you haven't, and I'll see you in the next one.
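The transcript says the lip movements are driven by the audio spectrogram, synchronized with the visual features. A rough way to picture that pairing is to cut the audio into one window per video frame and take the frequency content of each window. The sketch below does only that, under stated assumptions: the sample rate, frame rate, function names, and the naive DFT are all illustrative choices and are not taken from the PC-AVS code, whose actual preprocessing lives in the official repository and may differ.

```python
import cmath
import math


def samples_per_frame(sample_rate, fps):
    """Number of audio samples that play during one video frame.

    Assumes sample_rate is an integer multiple of fps (e.g. 16000 Hz at 25 fps).
    """
    return sample_rate // fps


def magnitude_spectrum(window):
    """Naive DFT magnitudes for bins 0..n/2 of one audio window.

    Illustrative only: a real pipeline would use an FFT library.
    """
    n = len(window)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(window))
        mags.append(abs(acc))
    return mags


def spectrogram_per_frame(audio, sample_rate, fps):
    """Return one magnitude spectrum per video frame (hypothetical helper)."""
    hop = samples_per_frame(sample_rate, fps)
    return [magnitude_spectrum(audio[start:start + hop])
            for start in range(0, len(audio) - hop + 1, hop)]


if __name__ == "__main__":
    # Toy input: one second of a 100 Hz tone at a deliberately low sample rate
    # so the naive DFT stays fast.
    rate, fps = 800, 25
    tone = [math.sin(2 * math.pi * 100 * t / rate) for t in range(rate)]
    spec = spectrogram_per_frame(tone, rate, fps)
    loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
    print(len(spec), "frames; loudest bin:", loudest_bin)
```

Each row of the result lines up with exactly one video frame, which is the kind of alignment a lip-sync model needs. It also shows why the video asks for clear speaking audio: the lips can only follow what shows up in those per-frame slices.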

Original Description

Today's Sponsor is 27 Stars, Develop your own AI software right now! Check it out here: https://27stars.co.uk/bycloud
Include "bycloud" in the initial email to receive a 10% discount for all your purchases.

You Only Need Audio To Deepfake, but that doesn't mean it'll look good 👀 hahaha- Anyways, high hopes for this though. They only need to mostly improve the identity regeneration. Otherwise, this technique looks very promising!!

PC-AVS
[Paper] https://arxiv.org/abs/2104.11116
[Official GitHub] https://github.com/Hangz-nju-cuhk/Talking-Face_PC-AVS
[Tutorial GitHub] https://github.com/bycloudai/PCAVS-Windows
[Installation Tutorial] https://youtu.be/4O3EqIiEzKQ

Wav2lip
[My video] https://youtu.be/dQw4w9WgXcQ
[GitHub] https://github.com/Rudrabha/Wav2Lip
[Paper] http://arxiv.org/abs/2008.10010

First Order Motion Model
[My video] https://youtu.be/B_qWUVi52yY
[GitHub] https://github.com/AliaksandrSiarohin/first-order-model
[Paper] https://arxiv.org/pdf/2104.11280.pdf

This video is supported by the kind Patrons: 🙏Emdy, Mazen Alotaibi, Sascha Henrichs, Jake Disco, Demilson Quintao, Martin Schmitt, Zeldus Zumbain
Support me on Patreon if you hope to see more: https://www.patreon.com/bycloud
or by becoming a member instead (same perks!): https://www.youtube.com/channel/UCgfe2ooZD3VJPB6aJAnuQng/join

video credits: Ctrl Shift Face
[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Music] Steaminwaffles - The Walk Home
[Profile Art] https://twitter.com/pygm7


