Exploring alternative interactions in JavaScript

Chrome for Developers · Intermediate · 🧠 Large Language Models · 1y ago

Key Takeaways

The video explores alternative interactions in JavaScript using Web AI, including motion-controlled applications, eye movement interfaces, and brain-computer interfaces, leveraging tools like TensorFlow.js, Electron, and Web APIs such as getUserMedia.

Full Transcript

[Music] Hi everyone, thank you so much for coming back after the break. I hope that you enjoyed lunch. Let's get started and talk about exploring alternative interactions in JavaScript. My name is Charlie Gerard, I'm a senior research engineer, and what I'm going to talk about today is not at all what I do in my day-to-day job. I work in cybersecurity, but I love exploring what can be done with the web and with JavaScript on the side.

I know that with the current advancements in AI we're talking a lot about LLMs, usually with text as input and text as output, but I want to talk about the different types of inputs that we can work with on the web. If you're a front-end engineer you might know some of the inputs I'm going to talk about, but I want to think about how we can interact with the web and with web applications in a different way than just using the keyboard and the mouse.

The first input that I want to talk about is webcam data. In general, if you're a JavaScript engineer, you know that you can access the webcam on the laptop, on your phone, or an external webcam plugged into your device, using the getUserMedia web API. Usually you get the live stream data from the webcam and display it, so you can create a web application for calls and things like that. But you can also use that data with models from TensorFlow.js to create motion-controlled applications. I'm going to show a few different examples of that.

First of all, we can use models that allow you to do pose detection: using the live feed from the webcam, you feed that into a machine learning model and get access to certain key points around the body. The first example that I'm going to show you in a minute, and I'm actually glad that you can't see what the video is yet,
because I created a clone of a Beat Saber game. If you've never played Beat Saber, it's usually something that you play on a VR headset, which can be quite expensive, and you have joysticks in your hands and you smash some beats. The thing is, I didn't want to have to buy a VR headset, so I was thinking: okay, I know that you can do pose detection with TensorFlow.js, I can write JavaScript, I know that you can do 3D in the browser. So I wanted to recreate a game where instead of having joysticks I would play with my hands. If we could just roll the video, it's about a minute, and you'll see what I'm talking about. [Music] [Applause]

If we could maybe pause the video. Yeah, thank you. In this example, it was a version that I built in 2019, so it's quite a while ago now, and it was with one of the first versions of the pose detection model. Since then I think it's been improved a little bit. In this demo I was going kind of slow, but when I updated it to one of the new models (I don't have a video for that) I was able to play the game in hard mode, going a lot faster, and it was super fun. So I was able to create a web experiment where, instead of having to have thousands of dollars to buy a VR headset, you can have the same experience in the browser. It was just projected on a wall for effect, but it was just running from my laptop.

For this particular example I was using the key points for my wrists, but you can also use key points for the rest of the body using the exact same model. This one is not a video, just an image, but some of you might have watched Squid Game on Netflix. In the series, a lot of people are in this big room and there's this doll that turns its head, and when it's not looking you're
supposed to run to the other side of the room, and if the doll turns its head and you're moving, you die, basically. I was thinking it would have been awesome for Netflix to create an experiment where you can do that at home a little bit. I didn't build the interface itself. The person who built it, I forgot his name, I apologize, but the interaction was just, I think, moving the mouse, and I thought, well, you can integrate that with the PoseNet model and do it in your living room. What was interesting is that you then start thinking: if I want to build something like this, what kind of key points do I want to use? Is it just the wrists and the ankles? And what does it mean to not move? Is there a threshold, like a delta between my movements before and after? So it's fun even in terms of engineering. I don't have a video, but you can imagine I was running in my living room and, as soon as the doll was turning, I was like... and trying not to.

So this is pose detection, with key points around the entire body, but you can go a bit more granular and go into hand detection. If you only want to focus on the movements of the hand or the fingers, there are a lot of key points that you have access to as well. For this particular example, if we could play it in the background, I think I can talk over it. I was thinking about what it would be like to design interfaces with my hand movements. When I was a kid I really liked Minority Report, but in the movie they had gloves, it wasn't really using cameras. Here it was just an experiment using, again, TensorFlow.js and the hand detection model. I hooked that into a Figma plugin and I was basically just dragging different
layers and experimenting with different gestures: okay, what would it be like to zoom, or to drag layers? Obviously it was a minimal prototype, and I'm not pretending that this would replace using the mouse and the keyboard. But you can start to think about how to merge different types of interactions: if you could create interfaces with just movements, what would it look like? Would different fingers be different actions? What kind of movements? That's what I like to think about when I use these machine learning models.

So that's hand detection, but we can go even more granular and think about gaze detection. There's a very good machine learning model available with TensorFlow.js as well that can track the movements of your eyes. If we could also play this in the background, please, because I think I can explain. Here it was a prototype of a gaze-controlled keyboard, a web-based prototype that was replicating an experiment built by Google for Android. It was an application for people who are restricted in their ability to move, so they have an interface that they can control with their eyes. In this example I just use the directions left and right with the eyes, so you only have access to two different types of inputs. I could have added looking up or looking down, but I thought, okay, I just want to use left and right, what would that look like? You end up having a keyboard interface that's split in two, and you end up with almost a binary search: the letter that you want to type is in one section, and then it just splits in two until you have one letter left. So that's an example of something that you can do with gaze detection. But then, with what I learned building this, I ended up thinking,
what would it be like to write code with my eyes? If we could also play it in the background, thank you. So then I moved on, and because it's JavaScript, there's an Electron app with the keyboard interface and the camera, and it's talking to a VS Code plugin to write code. What's interesting as well is that it's not only the code that you write to have this working: you also start thinking, okay, when I write code there's only a certain number of things that I can do, right? Here it was to create a very basic React app. If you've used React before, there's the concept of creating a variable or a component or things like that, and it's very limited; you can't just write free text. I mean, maybe with LLMs now you can. But in this particular example, VS Code already has plugins that have access to snippets, and instead of typing I would select the snippets with my eyes. Again, I'm not pretending that in a few years we'll all write code with our eyes, but you could be thinking: on top of being able to type and scroll, maybe your eyes would be doing something else, something else that you would trigger in VS Code.

So this is only using the getUserMedia web API with webcam data, and you can already do all of that. Those are only the things I've built; there are a lot more examples out there. But there's something else that you have access to with the getUserMedia API, and that's audio data. We don't really think that much about audio data when we think about machine learning, but again, if you're a front-end engineer, you have access to the microphone of the device that you're using: you call getUserMedia with the audio boolean set to true. You can also feed that into TensorFlow.js models and have sound
controlled apps. What do I mean by that? There was a research paper on acoustic activity recognition that was, I think, by Carnegie Mellon University. Again, that prototype was from 2019 as well, so quite a few years ago now. It's basically using sound data to classify different activities. That research was originally meant to improve home IoT devices or systems: instead of having to buy an IoT coffee machine, an IoT fridge, an IoT toaster, and all the things that are very expensive, you could have one device that listens to the sounds that these appliances make. For example, when the toaster is done, I'm sure if you have a toaster you can imagine what it sounds like, or the sound of the fridge door when you open and close it; I can hear it in my head. So these appliances make certain noises that you can train a machine learning model to recognize, and then you can have a single system that knows a little bit about your house.

But it can be used for other things. In this particular GIF, it was just the difference between me speaking at the beginning and then the sound of claps. It was me experimenting: if you get this raw data from the microphone and you visualize it as a spectrogram, I was thinking, if I can visually see the difference in the signature between these sounds, then a machine learning model will probably also see the difference. Here it was my specific prototype built with web technologies, but this technology is actually also used, I think, by a Google project that tries to stop illegal deforestation, where there are Android phones in trees that listen for the sound of chainsaws and alert rangers when illegal deforestation is happening. So we never really think about audio data that much, but there's actually a lot that you can do. Another example, then I moved on to... oh, I forgot I
had this one. Okay, I'm going to explain and then I'm going to play the video, because it has sound. This was built in 2020, when I was watching an Apple keynote where they released the model on the watch that had a 20-second counter for people washing their hands. As I was watching the conference I realized, well, actually, I think I know how to build that. So in just two hours on my couch, watching the conference, I rebuilt a web-based version. Maybe if we could just play a few seconds, I don't think it's super long, and I'm just going to talk over it. Here you can see the sink. In just a couple of hours I built an interface on the web that listens to the sound of water and then starts and stops a countdown. Because it's built in JavaScript, my laptop is right there, but it can also work on a phone or a tablet, anything that can run JavaScript. So again, instead of buying a very expensive watch, you can use the laptop you already have and JavaScript. I mean, that's pretty cool. And I still don't have an Apple Watch, because now I have that.

So that's just one example where, to me, using machine learning in JavaScript means you can build so many things, even on your own. I don't have a team and I don't have expensive stuff. Just one last example with audio, which was more recent: I was reading another research paper that was looking at on-face interactions. That is basically using your earphones, if they have a microphone in them, to record the sound that your fingers make when you touch your head. You can do it now if you want: when you tap or when you slide it makes a different noise, you can hear it. These gestures have a different signature in audio data, and then you can use that to train a
machine learning model. If you have earphones that don't have touch controls on them but you still want to be able to have interactions, then you could be touching your face instead. I'm not saying that this would actually happen, but it's possible, and that's the whole point. Here I had a tap gesture to scroll, and I think if I was swiping then it was going back up, or things like that. It's just experimenting. Even if you don't work in academia and don't have access to a team, you can still experiment with different interactions with the knowledge that you have.

Okay, so we talked about video data, we talked about audio data, and there's a third one that I want to talk about: hardware data. What I mean by that is, again, if you work on the web, there are a lot of devices that you can plug into a web interface. You have the Web Bluetooth API, the WebUSB API, and the Web Serial API, and if you get live data from these devices you can use it. You can build your own model, or there's also, I think, the platform Teachable Machine, or no, it wasn't, I forgot, but there was one platform that was built to interact with Arduino data. Out of that you can build motion-controlled applications.

One thing that I built, again in 2019 (I feel like my life stopped in 2019), was an experiment to create my own model, where I was holding a piece of hardware in my hands and I recreated an air Street Fighter game. This does not use a camera; I could have been anywhere in my living room. Over multiple weekends I recorded live data of me doing three different gestures, and I then created a machine learning model using TensorFlow.js to recognize, I think it was a punch, a hadouken, and a shoryuken (I probably say it terribly). And then it was basically using WebSockets,
streaming that data back to the browser with this game here. It was a prototype, the second character is not implemented, but I was able to build this prototype three different ways. There was live data from the phone using the Generic Sensor web API; I built a version using the Daydream controller, which probably doesn't exist anymore (it was the VR headset by Google); and then another version with a custom-made Arduino controller with an accelerometer and gyroscope. So even if you don't have an Arduino, or you don't know how to do your own electronics, if you have a phone there is a web API where you have access to live data from the gyroscope and accelerometer, and using that live data you can create your own model and have some fun.

And another, I think it's the last demo that I'm going to show. This particular demo does not use TensorFlow.js; I just couldn't find the video where I did build something using TensorFlow.js. But you can use brain sensor devices that you buy online, they're commercially available, and you can have access to live data from your brain activity and then use that with TensorFlow.js as well to create your own models and applications. If we could just play the video, it's about 30 seconds. Basically, what's happening here is that (this was not using TensorFlow.js, but it could be rebuilt using TensorFlow.js) I trained a model to recognize the patterns in my head when I was thinking certain thoughts. This one was jumping: training that model to be able to recognize that from live data, then streaming it back to the browser and being able to play the Dino game. I looked like I was frozen because I was thinking so much. It's like thinking about tapping my foot on the floor and then making the dino jump. And the video that I
couldn't find, of the prototype that I built, was using live raw data to build a machine learning model to recognize when I was blinking my eyes. A lot of the models that we have use the camera, but with the camera you have to have good lighting and you have to be at a certain distance. Using a brain sensor, you can be anywhere in a room and have the sensors across the forehead. You can get live data from that, and when you're blinking there's a spike in the data, and you can create your own machine learning model with that. There are more experiments that I wanted to build, but I just didn't have the time.

That was basically everything I wanted to show today. My point was just that there's a lot more that you can do with TensorFlow.js and with machine learning on the web than what we're really talking about right now, and it's been available for years. There's probably a lot more that you can do. I hope that you've learned something and that maybe you'll have fun experimenting with it as well. Thank you so much. [Music]
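
The gaze-controlled keyboard in the talk narrows the alphabet "almost like a binary search" using only left and right eye movements. The sketch below illustrates that idea only; it is not the speaker's code. The names `ALPHABET`, `splitKeys`, and `selectLetter` are hypothetical, and the gaze directions are assumed to arrive as the strings "left" or "right" from whatever gaze model is in use.

```javascript
// Hypothetical sketch: pick a letter using only "left"/"right" gaze inputs.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz".split("");

// Split the remaining keys into the two halves shown on screen.
function splitKeys(keys) {
  const mid = Math.ceil(keys.length / 2);
  return { left: keys.slice(0, mid), right: keys.slice(mid) };
}

// Apply each gaze direction in turn, keeping the half that was looked at.
// Returns the selected letter, or null if more inputs are still needed.
function selectLetter(gazeDirections, keys = ALPHABET) {
  let candidates = keys;
  for (const direction of gazeDirections) {
    if (candidates.length === 1) break; // already down to one letter
    const { left, right } = splitKeys(candidates);
    candidates = direction === "left" ? left : right;
  }
  return candidates.length === 1 ? candidates[0] : null;
}
```

With 26 letters, at most five left/right decisions are needed to reach a single letter, which is why two gaze directions are enough to drive a full keyboard.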

Original Description

Meet Charlie Gerard as she showcases a bunch of amazing projects she has made using Web AI, including reading your own brain waves! The latest advancements in AI have mainly focused on large language models and new ways of creating and consuming content. However, AI also offers the opportunity to rethink the way we interact with interfaces. Using JavaScript and models focused on body tracking or audio classification, web developers have a unique opportunity to experiment with alternative interactions to create more innovative web experiences.

Connect with Charlie → https://goo.gle/3Vh74ej
Watch more Web AI talks → https://goo.gle/web-ai

Speaker: Charlie Gerard

AI for the web, Google Chrome Browser, Chrome Browser Automation, Chrome Extensions, Chrome, Chrome Web Platform, Web AI, Web apps, Web Assembly (Wasm), Web Platform in Chrome, WebAssembly for Chrome, WebGPU, Generative AI, AI, Google AI, Google AI Edge, Responsible AI, TensorFlow, Hugging Face Models

The video showcases alternative interactions in JavaScript using Web AI, including motion-controlled applications, eye movement interfaces, and brain-computer interfaces. It covers tools and technologies such as TensorFlow.js, Electron, and Web APIs (getUserMedia, Web Bluetooth, WebUSB, Web Serial, and Generic Sensor), and shows that a single developer can prototype these interactions with a laptop webcam, a microphone, or phone sensors.
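
For motion-controlled applications, the talk's Squid Game example raises the question of what it means "to not move" and whether to use a threshold on the delta between frames. The following is a minimal sketch of one possible answer, not the speaker's implementation. It assumes each frame is an array of key points shaped like `{ name, x, y, score }` (in pixels); the function names, the 10-pixel threshold, and the 0.5 minimum score are illustrative assumptions.

```javascript
// Hypothetical sketch: decide whether a player moved between two pose frames.
// Assumes key points look like { name, x, y, score }, with x/y in pixels.

// Average distance travelled by key points that are confidently detected
// in both frames. Returns 0 when no key point qualifies.
function averageDisplacement(previous, current, minScore = 0.5) {
  const previousByName = new Map(previous.map((kp) => [kp.name, kp]));
  let total = 0;
  let count = 0;
  for (const kp of current) {
    const before = previousByName.get(kp.name);
    if (!before || kp.score < minScore || before.score < minScore) continue;
    total += Math.hypot(kp.x - before.x, kp.y - before.y);
    count += 1;
  }
  return count === 0 ? 0 : total / count;
}

// "Moving" means the average displacement exceeds the threshold (assumed value).
function hasMoved(previous, current, thresholdPx = 10) {
  return averageDisplacement(previous, current) > thresholdPx;
}
```

The threshold absorbs the small jitter that pose models produce even for a still subject; restricting the input to wrists and ankles, as discussed in the talk, only requires filtering the arrays before calling `hasMoved`.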

Key Takeaways
  1. Build a web application using webcam data and machine learning models
  2. Configure pose detection models for motion-controlled applications
  3. Deploy a web experiment for a VR-like experience
  4. Train a machine learning model to recognize sounds from appliances
  5. Use a spectrogram to visualize audio data
  6. Build a web-based version of a 20-second hand-washing counter using JavaScript
  7. Train a model to recognize brain activity using a brain sensor and TensorFlow.js
💡 The video demonstrates the potential of Web AI in creating innovative and interactive applications, highlighting the importance of exploring alternative interactions in JavaScript.
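
The hand-washing counter from takeaway 6 starts and stops a countdown depending on whether the sound of water is detected. As a rough sketch of just the countdown logic (the audio classifier itself is out of scope here), the code below assumes some model emits a label per audio frame together with a timestamp in milliseconds. The label "water", the `createWashTimer` name, and the 20-second default are assumptions for illustration.

```javascript
// Hypothetical sketch: countdown that only advances while "water" is heard.
// Assumes a classifier supplies (label, timestampMs) for each audio frame.
function createWashTimer(targetSeconds = 20) {
  let elapsedMs = 0;        // total time spent hearing water
  let lastTimestamp = null; // timestamp of the previous "water" frame, if any

  return {
    // Feed one classification result; returns the seconds remaining.
    update(label, timestampMs) {
      if (label === "water" && lastTimestamp !== null) {
        elapsedMs += timestampMs - lastTimestamp;
      }
      // Any other label pauses the countdown until water is heard again.
      lastTimestamp = label === "water" ? timestampMs : null;
      return Math.max(0, targetSeconds - elapsedMs / 1000);
    },
  };
}
```

A caller would invoke `update` from the classifier's callback and render the returned value; when it reaches 0 the 20 seconds of washing are complete.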