El Meme que entrenó a una IA | BITS

Dot CSV · Intermediate · 📰 AI News & Updates · 7y ago

Skills: CV Basics 80% · ML Pipelines 60%

Key Takeaways

The video discusses Google's solution to predict the depth of a 2D image using a deep learning model trained on data from the 'mannequin challenge' trend, which provides a unique dataset for inferring three-dimensionality in scenes with moving cameras and people.

Full Transcript

[Music] Problem solving. Imagine you're looking at the following scene and I ask you, "Hey, can you tell me which elements are closer to you and which are farther away?" Surely you won't have any problem telling me that this person here is closer than this building here, and that in between there's a path that recedes from the person to the building. No problem. In fact, I could ask you to color each point in the scene, assigning a different intensity to each point depending on its distance, resulting in an image like this: a depth map.

Okay, how did you manage to do this? Or better yet, how can we get a computer to learn to do this? To infer the three-dimensionality of a scene, we can opt for the biological option: the separation between your eyes. Stereoscopic vision allows you to observe a scene from two different viewpoints, whose images can be combined to triangulate the position of each of the objects. This way we obtain an interpretation of the scene's depth. Easy.

And what if we didn't have two cameras and were working with a monocular vision system? "How many miles away do you think it would be?" "A trillion." Luckily, our friend Einstein would agree that we could use the temporal dimension as if it were another spatial dimension. That is, if we need two different perspectives of the same object and we only have one camera, we could move the camera over time to obtain captures from different angles. This is interesting because it's actually a simpler setup, similar to what we encounter in our daily lives when we use, for example, our mobile phone camera. So, perfect, problem solved. That's all for today's video. Subscribe!

But wait, there's a problem: the method we just discussed has to meet one more restriction. The objects in the scene we're observing must remain static over time as we move the camera. In other words, the points observed from here must be the same points observed from there in order to correctly infer the three-dimensionality of the scene. Otherwise, the whole thing falls apart. This is a problem because, for example, how could we use such a system to infer depth in a shot where the protagonist is a person who, well, is usually in motion?

Let's go back to the starting point, because it's true that the two methods I've explained can be very useful for inferring three-dimensionality, but when I initially showed you this image, you were able to solve it without stereoscopic vision. Even though you have two eyes, what you see on your monitor is a flat, two-dimensional image, and the camera isn't moving. So how did you do it? The reality is that your life experience of observing the world with stereoscopic vision and learning the three-dimensional shape of each object allows you to build a mental model that you can work with even in situations with little information. In other words, you already have prior knowledge encoded in your mind about how the world is structured in three dimensions and how objects are distributed in a scene, information that will be useful for solving this problem. So what we can do is train a deep learning model to encode this prior knowledge of how to estimate the depth of a scene with a person.

Okay, okay, I know what you're thinking: "Carlos, you're confusing me; we're going around in circles. It's a vicious cycle." Because how do we train this deep learning system? We'll need data pairs so that for each video we have its equivalent depth map. And how do we get that map if the subject and the camera are moving and we can't triangulate? Do we need to use three-dimensional cameras like the Kinect? Well, it's an option, but these are usually limited to indoor use and would offer very little variety of environments to train our system. Oh my god, everything's wrong! At this point, I hope you understand the whole context surrounding the problem we're dealing with.
We're missing data, data that in this case would be scenes of people in varied environments from which we can infer their three-dimensionality to train our deep learning model. Well, pay attention, because the answer given by a Google Research team is simply brilliant, accurate, and ingenious. So much so that last week it earned an honorable mention at the prestigious CVPR 2019 conference. What Google has done is connect the need of the problem with the answer, an answer they found by going back to the year 2016. Remember the 'mannequin challenge'?

If we jog our memories, we recall a time when the trend was to film oneself in increasingly absurd situations, with the camera moving around the three-dimensional space of a scene while everyone remained motionless, as if waiting to complete the desired dataset. Well, for them it was just about hanging around playing mannequin while the camera moved slowly. But fast forward to 2019: this trend has been key to creating the perfect dataset to solve the problem we were discussing. It's available online, waiting to be used. You see Google's offices in this image? Well, up here is the office of the person who came up with this simply brilliant idea.

With this data, a deep learning model has been trained, which, in combination with other techniques like optical flow computation, is capable of accurately predicting depth maps with moving cameras and people. This could have direct applications in augmented reality tools, similar to what Apple has achieved with the third version of ARKit, which also manages to estimate depth and segment people in real time, and which is leaving us with numerous examples of quite spectacular augmented reality applications. Both the dataset of 2,000 videos of people performing the mannequin challenge and the trained models have been made available to the public (both links are in the description).
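The transcript mentions optical flow only in passing. As a rough illustration of the concept, and not of Google's actual method, the sketch below estimates how far a small patch moved between two frames by exhaustive block matching, one of the simplest ways to approximate optical flow. The function names, the 7×7 frames and the 3×3 pattern are hypothetical and exist only to make the example self-contained.

```python
def block_matching_flow(prev, curr, y, x, block=1, search=2):
    """Estimate the (dy, dx) motion of the patch centred at (y, x) in `prev`.

    Tries every shift within `search` pixels and keeps the one whose patch in
    `curr` has the smallest sum of absolute differences (SAD).
    """
    height, width = len(prev), len(prev[0])

    def sad(dy, dx):
        total = 0
        for j in range(-block, block + 1):
            for i in range(-block, block + 1):
                total += abs(prev[y + j][x + i] - curr[y + dy + j][x + dx + i])
        return total

    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip shifts whose patch would fall outside the frame.
            if not (block <= y + dy < height - block and block <= x + dx < width - block):
                continue
            cost = sad(dy, dx)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def make_frame(center_y, center_x, size=7):
    """Build a blank frame with a distinctive 3x3 pattern at the given centre."""
    pattern = [[9, 5, 1], [4, 8, 2], [7, 3, 6]]
    frame = [[0] * size for _ in range(size)]
    for j in range(3):
        for i in range(3):
            frame[center_y - 1 + j][center_x - 1 + i] = pattern[j][i]
    return frame


# The pattern shifts two pixels to the right between the two frames.
prev_frame = make_frame(3, 2)
curr_frame = make_frame(3, 4)
flow = block_matching_flow(prev_frame, curr_frame, y=3, x=2)
print(flow)
```

Here the estimated motion is two pixels to the right and none vertically. Production systems compute a dense flow field with far more robust methods; this sketch only conveys what "optical flow" measures.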
However, the moral I want you to take away from this video is this: in a sector where data collection is one of the most costly phases in terms of time and resources, ingenuity is the element that can give you the ideal solution to your problem, and that is something worth valuing. Unless, of course, all this was orchestrated by Google, and the mannequin challenge was launched by the company in 2016 with the sole purpose of getting users, unaware of their actual task, to work on creating this dataset. Do you really think this is possible? In that case, I recommend you go directly to this video here, where we talk about challenges and data and where I explain some of the ingenious ways in which companies like Google take advantage of them to obtain your data without your knowledge. For my part, the only information I'm going to ask for is your feedback on whether you liked this video, and that you eagerly await the next video about artificial intelligence, which, as you know, you'll find here.

Original Description

Do you know what Google's ingenious solution was for predicting the depth of a 2D image?

--- DATA AND TRAINED MODELS ---
https://google.github.io/mannequinchallenge/www/index.html

--- MORE DOTCSV! ---
💸 Patreon: https://www.patreon.com/dotcsv
👓 Facebook: https://www.facebook.com/AI.dotCSV/
👾 Twitch!!!: https://www.twitch.tv/dotcsv
🐥 Twitter: https://twitter.com/dotCSV
📸 Instagram: https://www.instagram.com/dotcsv/

--- MY TECH! ---
** This isn't all of my tech, only what I genuinely recommend. If you use these Amazon links, I get a commission on your purchase :) **
[Basic tech for YouTube]
💻 Laptop - MSI GP72 7RDX Leopard: https://amzn.to/2CDwvgY
📸 Camera - Canon EOS 750D: https://amzn.to/2CDPqbi
👁‍🗨 Lens 1 - EF 50 mm, F/1.8: https://amzn.to/2CH7npx
👁‍🗨 Lens 2 - EF-S 18-135mm: https://amzn.to/2DuhL5t
👁‍🗨 Lens 3 - EF 24 mm, F/2.8: https://amzn.to/2AYAFQm
🎤 Microphone - Blue Yeti Micro: https://amzn.to/2RItA0I
💡 Spotlight - Foco LED Neewer: https://amzn.to/2AYCM6K
🌈 Color light - Tira ALED Light: https://amzn.to/2B2iY2l
[My other gadgets]
📱 Smartphone - Google Pixel 2 XL: https://amzn.to/2RMuY2v

--- MORE SCIENCE! ---
🔬 This channel is part of the SCENIO science outreach network. If you want to discover other fantastic outreach projects, go here: http://scenio.es/colaboradores #Scenio

The video teaches how Google used the 'mannequin challenge' trend to create a dataset for training a deep learning model to predict depth in 2D images, and how this solution can be applied to augmented reality tools. The key takeaway is the importance of ingenuity in problem-solving, especially in data collection for AI research.

Key Takeaways

Understand the problem of predicting depth in 2D images
Learn about stereoscopic and monocular vision
Explore the 'mannequin challenge' dataset and its application to deep learning
Apply computer vision techniques to real-world problems
Stay updated on latest AI research and trends

💡 Ingenuity in problem-solving can lead to innovative solutions, such as using the 'mannequin challenge' trend to create a unique dataset for training a deep learning model.
