This Video's Voice Is Entirely Made From Audio Deepfake

bycloud · Intermediate · 📄 Research Papers Explained · 4y ago

Key Takeaways

The video is narrated entirely by an AI clone of the creator's voice, built with TalkNet, a speech synthesis model from a 2020 research paper. With a reference audio clip, TalkNet can also reproduce singing by mimicking another person's tones. The video touches on the potential and implications of AI voice synthesis, and on a new speech synthesis AI released by Facebook AI that trains without transcripts.

Full Transcript

A while ago, Adobe had a really promising voice manipulation software in the making, and it was definitely way ahead of its time. Just take a look at this:

"And, uh, I kissed my dogs and my wife." "Okay, so let's, uh, let's do something here. Okay, copy, paste. Let's do it. Copy, paste. Oh yeah, it's done. Let's listen to it." "And, uh, I kissed my wife and my wife." "Oops."

But for some odd reason, there were no official news releases or even announcements made after this extremely concerning public reveal.

Fast forward to now: you are listening to a video that's entirely made from AI vocal synthesis. For reference, this is my normal voice. I simply cloned my voice by training with my audio and transcripts from a few of my older videos, and now this AI can actually speak normally and even sing for me, which would typically be impossible to achieve.

"i can become an m some of them do assuming i'm a human i gotta do together through your superhuman innovative and i made a problem so that anything you say is saying of immediately to you and that's anything more than after demonstrating under given [ __ ] he's a feeling by of a dating never evading and i hope the haters are forever waiting for the day that they can afford to be celebrated because another way to get them out of it make up evaded music you make elevator music"

Sing some slick Japanese, or just be some dumb copypasta: "What the [ __ ] did you just [ __ ] say about me, little [ __ ]?" Which is also pretty fun and scary at the same time.

And these are all done through this research paper called TalkNet. TalkNet came out in mid-2020, and it is able to reproduce singing by mimicking tones from another person with a reference audio. It can still perform without a reference audio too, but it doesn't sound so good. So my AI vocal cord is capable of singing without damaging any ears now.

But this singing thing isn't perfect either. Notably, when some parts of the singing require dragging on vocals, my voice goes everywhere. This could be because there isn't any singing in the dataset, or just simply because the audio reference for the AI to copy the style of was not clean enough and I don't have enough training data. So just imagine how well it would do if we had a really well-trained AI model. It would be crazy.

I definitely struggle the most when it comes to making the AI pronounce well or drag on well. This is because the current AI, TalkNet, uses a type of pronunciation notation called ARPABET. ARPABET is a set of phonetic transcription codes, so it tells you how an English word is spoken in American English. Since this AI model is trained in English, as long as I make the ARPABET similar to the pronunciation of other languages, I can make the AI sing in Japanese too. So this is how Harumachi Clover is possible. This actually also applies to singing, as some syllables are separated instead of spoken like a word, so you will see some really odd combinations of inputs to produce the actual singing-like pronunciation. You can see it in the comparison here.

And yes, I'm dropping my first ever song covered by my AI vocal cord. Check out the full song link in the description. I also put the input lyrics and the actual lyrics side by side too, as it is sometimes really funny what input I choose in order to get the AI to sing some words accurately.

Well, these voice cloning AIs were already really believable if you played them through a phone call a few years ago. Having this clear audio is definitely going to break the internet, and it can definitely be clearer than what you are hearing right now, so I'm pretty worried for my grandparents. Meanwhile, singing still has a long way to go.

Due to some actual ethical concerns with the peers I collaborated with for this video, I would not be publicly teaching people how to use this AI. But shout out to Justin John for helping me with making this video, and if you really want to learn it, you can join my Discord and ask there.

On the other hand, as of September 9th, Facebook AI just released a jaw-dropping speech synthesis AI that does not require a transcript to train, and it sounds even more real than this current voice. It has some really impressive results, but the code is not released yet. Here, have a listen:

"when an aristocracy carries on the public affairs it's verification there may be in science"

And in the last video, I talked about how you can deepfake videos in real time, and it has definitely troubled a lot of people. The current voice cloning can't really perform live, but if you look closely at the Adobe video, it's literally a real-time voice editing demo performed on a stage in 2016. 2016 was five years ago, by the way. Let that sink in. It's literally way ahead of its time. But let's not fear the technology. Instead, we should be prepared and strengthened against it early on, because the future will always inevitably come, and sometimes there is no point struggling against it.

So if you want to learn more about AI, today's sponsor Skillshare has the right place for you. Skillshare is an online learning community with thousands of inspiring classes for creators. You can freely explore new skills, deepen existing passions, and have fun with your creativity. Right now you probably have a lot of questions on what exactly is AI, machine learning, or whatever this mystical and futuristic technology is. Fear not: there is actually this class called Demystifying Artificial Intelligence: Understanding Machine Learning by Christian Heilmann, which provides a really great introduction to the topic of AI. The lessons aren't that long either, so you can easily go through them during your free time. What's even better is that they are currently also providing a limited-time offer of a one-month free premium trial instead of the usual two weeks, which gives you plenty of time to go through these short lessons. And even if you are done with that class, you can also check out their other more in-depth machine learning and AI courses, or other amazing ad-free and high-quality creative classes like photography, illustration, and video editing. The first 1,000 people to click the link in the description will get a one-month free trial of Skillshare, so you can start exploring your creativity today.

Lastly, thank you for watching. A big shout out to Andrew and many other patrons and members that support my work through Patreon and YouTube. If you have any questions, feel free to join my Discord too. Follow my Twitter if you haven't, and I'll see you all in the next one. [Music]
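The ARPABET trick mentioned in the transcript (spelling out sounds phoneme by phoneme so an English-trained model can approximate Japanese syllables) can be illustrated with a minimal sketch. Everything below is hypothetical: the tiny lexicon, the hand-picked Japanese approximations, and the `to_arpabet` helper are assumptions for illustration and are not taken from the video or from TalkNet's actual input pipeline.

```python
# Hypothetical mini-lexicon: word -> ARPABET phonemes.
# The digit on a vowel marks its stress (0 = unstressed, 1 = primary).
LEXICON = {
    "hello": ["HH", "AH0", "L", "OW1"],
    "voice": ["V", "OY1", "S"],
    "clone": ["K", "L", "OW1", "N"],
}

# Illustrative, hand-picked approximations of Japanese syllables using
# American English phonemes. These are assumptions, not values from the video.
JAPANESE_APPROX = {
    "ha": ["HH", "AA1"],
    "ru": ["R", "UW1"],
    "ma": ["M", "AA1"],
    "chi": ["CH", "IY1"],
}


def to_arpabet(tokens, table):
    """Look up each token in the table; unknown tokens are kept as plain text."""
    out = []
    for token in tokens:
        phonemes = table.get(token.lower())
        out.append(" ".join(phonemes) if phonemes else token)
    return out


english = to_arpabet("hello voice clone".split(), LEXICON)
japanese = to_arpabet(["ha", "ru", "ma", "chi"], JAPANESE_APPROX)
print(english)
print(japanese)
```

Splitting a word into separate syllable entries, as in the second table, mirrors the transcript's remark that sung syllables are sometimes entered separately instead of as a whole word.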

Original Description

The first 1,000 people to use this link will get a 1 month free trial of Skillshare: https://skl.sh/bycloud09212

This video is entirely made from my AI vocal cord. Pretty cool ain't it? It's kind of a shame that the dataset is small and didn't sound as good as expected. But something even more impressive is sneaking up from Facebook AI and I can feel it. This research paper is slightly older, but it's still super impressive that I gotta make a video about it, so here it is!

Sorry I would not be including a video on how to run this AI, because the people that I worked with in this video had a condition that they'll help me as long as I don't publicly demonstrate how to run this. But if you really want to know how, feel free to join my discord at: https://dsc.gg/bycloud

Check out my AI vocal cords here:
[Goodbye Sengen Cover] https://youtu.be/5rEQfzds-WY
[Harumachi Clover Cover] https://youtu.be/Ht0-fqzrMHA

TalkNET Fully-Convolutional Non-Autoregressive Speech Synthesis Model
[Paper] https://arxiv.org/abs/2005.05514
[GitHub] https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/tts/models/talknet.py

This video is supported by the kind Patrons: 🙏Andrew Lescelius, Sascha Henrichs, Jake Disco, Demilson Quintao, Tony Jimenez, dicefist

Support me on Patreon if you hope to see more: https://www.patreon.com/bycloud
or by becoming a member instead (same perks!): https://www.youtube.com/channel/UCgfe2ooZD3VJPB6aJAnuQng/join

[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Music] Steaminwaffles - The Walk Home
[Profile Art] https://twitter.com/pygm7

The video demonstrates an AI-generated voice built with TalkNet and discusses the potential and implications of AI voice synthesis. It does not walk through how to run the model; the creator withheld that at the request of the people who helped with the video.

Key Takeaways
  1. Choose a research paper on AI voice synthesis, such as TalkNet
  2. Train a model using the paper's methodology and a dataset of audio and transcripts
  3. Test and refine the model to improve its performance
  4. Consider the ethical implications of AI voice synthesis and develop strategies for responsible development
💡 AI voice synthesis has the potential to revolutionize the way we interact with technology, but it also raises important ethical concerns that must be addressed.
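Step 2 above depends on pairing each audio clip with its transcript. A minimal sketch of that bookkeeping, assuming a JSON-lines manifest with one record per clip, might look like this. The field names `audio_filepath` and `text`, the file paths, and the `build_manifest` helper are assumptions for illustration; the video does not show the format that was actually used.

```python
import json


def build_manifest(pairs):
    """Turn (audio_path, transcript) pairs into JSON-lines text.

    Transcripts are lowercased and whitespace-normalized; clips whose
    transcript is empty after cleaning are skipped.
    """
    lines = []
    for audio_path, transcript in pairs:
        text = " ".join(transcript.lower().split())
        if not text:
            continue  # nothing to train on for this clip
        # Field names here are hypothetical, chosen for illustration.
        lines.append(json.dumps({"audio_filepath": audio_path, "text": text}))
    return "\n".join(lines)


# Hypothetical clips cut from older videos.
manifest = build_manifest([
    ("clips/001.wav", "  Hello   World "),
    ("clips/002.wav", "   "),
])
print(manifest)
```

Keeping one record per line makes it easy to drop or re-cut individual clips when a transcript turns out to be wrong.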
