Machine learning for Accessibility | Session
Key Takeaways
Explores three case studies on machine learning for accessibility: Voice Access, Lookout, and Live Transcribe.
Full Transcript
[Music]
Hi everyone, my name is Tom Hume. I'm a product manager in Google Research. I'm here today with my colleagues Scott and Sagar to talk about the intersection of machine learning and accessibility. Around the world, one billion people currently experience some form of disability, which means that people with disabilities are using your app right now, and in larger numbers than you might expect. Thanks to advances in machine learning, today we can help people in ways that just weren't possible a few years ago. So today we'll show you a few examples of this help, tell you how we're thinking about accessibility and machine learning, and give you a few hints for how you might apply it in your own apps.
As it happens, I'm the product manager for Voice Access, so let's start there. What's Voice Access? It's an Android app that lets you control your phone using your voice. We designed it for and with people with manual dexterity challenges, but we think it would benefit everyone who wants a hands-free experience. Once you've downloaded Voice Access from the Play Store and set it up, you can give your phone simple commands, like "open Maps", "search for Stinson Beach", "tap Directions". We launched a big redesign of Voice Access last year with a huge set of improvements: we rethought the user interface and we added some new capabilities. But today I'd like to talk to you about how we use machine learning to make Voice Access easier and faster.
So let's step back for a second and remind ourselves how accessibility works on Android. When you're writing an app, all the different components of your user interface have some text associated with them. This text is used in different ways: a screen reader for the blind might read it out; Voice Access uses it to provide labels on screen and to understand commands. If you're using a good user interface toolkit, this text and some other important information is automatically set up, but for some elements, like photos and icons, developers need to add the labels
themselves. And I'm sad to say that many developers don't add these labels, or use toolkits that don't support accessibility well, which means that important parts of their apps are just invisible to their users. There have been some studies into this; if you google "inaccessible button disease" you'll find a good one. And sometimes, even when developers put in the work, they focus on screen readers, and text that works there isn't so good for Voice Access.
But what if Voice Access could look at images on screen in the same way a sighted person would, recognize icons, and give them labels? Well, this would have two benefits. Firstly, where an app developer hasn't given an icon a label, Voice Access could add one. But also, it would mean that users could refer to the same icon with the same name consistently across apps. A classic three-dot overflow icon, for instance, might be labeled "menu" by some apps and "overflow" or "options" by others. Now, all these apps are trying to do the right thing by their users, but we shouldn't expect users to learn different names for the same icon in different apps.
So that's exactly what we did. Android R added a new screenshot API for accessibility services to use. Voice Access uses this new API to take a screenshot and passes that screenshot into a machine learning model we call IconNet. IconNet gives precise information about which icons are on screen and where, giving them labels, and then Voice Access takes those labels, plus the ones an app has provided, and uses them. And here's my favorite part: it does all of this locally, without your screen ever leaving your Android device. We think this is a great use for machine learning: to fill in some of the gaps in Android apps for users with accessibility needs, and to do so quickly and privately. Voice Access detected about 30 icons when we launched, we added another 40 in February, and there's more intelligence to come.
So I'll end with a plea to developers: please download Voice Access from the Play Store and use it to test your
applications. There are literally millions of people worldwide who have manual dexterity issues. Voice Access gives them full use of their Android device and your apps. Testing your app with Voice Access is good for these people, and it's good for you. You can download it at g.co/voiceaccess. And now my colleague Scott is going to talk to you about Lookout. Scott?
Thanks, Tom. I'm Scott Adams, and I work in Research as the product manager on Lookout. I'm here to talk to you about a case study in ML applications for people with visual impairments. Lookout is an app that uses the smartphone's camera to recognize objects and text for users who are blind or low vision. We use some cool technology that's available to you through ML Kit APIs, but there's more to it than connecting a camera to a classifier. Especially in accessibility, it's critical to design with your users and not just for them. There are two questions to ask: one, how well do I understand what the user needs, and two, how can I fit the technology to those needs? At Google, this is codified in our AI Principles, such as "be socially responsible" and "be built and tested for safety". For example, Lookout went through proactive adversarial testing to guard against unfair bias. This kind of sophisticated evaluation is critical, but the path is much easier if you build with your users from the very beginning.
For instance, Lookout uses several different image classifiers. Here's how we used user feedback to tune for precision and recall. As a refresher, imagine we have a mix of apples and oranges and an apple detector. If we have a high-precision apple detector, then the detector is taking no chances: it's only going to say "apple" if it's certain I have an apple in my hand. On the negative side, if I have an apple that's, let's say, oddly shaped or a different color than usual, it's going to say nothing. So we'll have false negatives, where I do have an apple in my hand but the detector is silent. Contrast that with high recall. In this case,
the detector may be so sensitive that anything that is round and about the size of my hand is an apple. So every time I show an apple, it's saying "apple" every time. That's terrific. The downside is I might show it an orange and it's going to say "apple", and that's a false positive, where it's saying "apple" but there is no apple in my hand. Ideally we have both high precision and high recall, but that's often not possible. So how do we tune that? Well, it may come down to the user: if I really prefer an apple to an orange, I may prefer a high-precision detector.
So with that in mind, here's what we learned from our users on Lookout. I'll talk about two cases, one on currency and one on objects. For currency, some types, like US dollars, are the same size, color, and texture regardless of their value, which makes it impossible to distinguish a $1 bill from a $100 bill if I can't see the bill. So our goal is to identify the value of the bill. Now for objects: imagine that I can't see and I'm going into a room that I'm unfamiliar with. I might want to know what's in that room, what kind of furniture, so I could understand what kind of room it is, like a living room versus an office.
So with that in mind, what did users tell us? For currency, it was fairly intuitive. They said: listen, never confuse a $1 bill for a $100 bill. Don't guess; if you're not certain, say nothing. So in this case, this is a pretty clear signal for precision. Now, the easy thing to do at first glance is to take that lesson and carry it over to objects, but there's some nuance here. Imagine we have an object, and it could be an armchair or it could be a couch. If we're very high precision, we may say neither, because we're not sure. And from users we learned that actually either answer has some value, even if it's imprecise. Whether it's an armchair or a couch, I know that this is a pretty big object and this room might be a living room. So we actually want to balance between the two here.
So, taking a big step back, getting this
kind of user feedback throughout development is critical. Do it during design, during development, during testing, not after release. It takes longer, but you make a better product. Okay, if you're excited to start writing your own accessibility apps with computer vision, then dive into ML Kit. We have barcode scanning, OCR, and object detection. And if you or someone you know is blind or low vision, please consider Lookout. Next, I'm pleased to introduce my colleague Sagar.
Hi, I'm Sagar. I work in the Machine Perception team in Google Research. My team's mission is to help machines understand the world like humans do. We work on Live Transcribe and Sound Notifications. Live Transcribe is an app for deaf and hard of hearing people to get captions for real-time conversations in over 80 different languages. Live Transcribe is a great example of using ML technologies together with good UX research to provide meaningful experiences. It uses automatic speech recognition, of course, as well as models to detect speech versus other audio, and also a sound event detection model to provide sound chips to the user.
I want to share the story of how Live Transcribe came about. Meet Dimitri. He's a computer science researcher who has been working on speech recognition for over three decades. Dimitri typically relied on lip reading and a professional captioner for communications. As good as he is with lip reading, we would often struggle to have impromptu conversations around the water cooler. Lip reading alone is only about 50% accurate, and professional captioners are not always present. As ASR technology improved and became more pervasive, Dimitri launched an experiment with the team to create an app for real-time captioning. That Android app uses Google's Cloud Speech-to-Text API to turn that water cooler chat into captions on the screen of his phone. Together with Dimitri's decades of experience in speech technology, the team iterated on the app and improved it to a point where he started using it daily for personal and
professional conversations. Dimitri was excited to try more things in the app. Motivated by his insights as well as enthusiasm, we started looking at what else we could do to make the app better. What if the app could interpret more than just spoken conversation? A few years ago we released a dataset called AudioSet, which allows developers to recognize over 600 different sound event classes in their applications. The easiest way to get started with AudioSet is to start with a pre-trained model from that dataset. For that, you can directly integrate our open-source pre-trained model called YAMNet to understand different sound events.
Here is a use case demonstrating how critical this non-speech information can be. Dimitri recalls a story of how one day he was asleep and the smoke alarm was active. He could not hear it, but luckily his neighbor came in and woke him up. As our researchers talked to more potential users, we heard countless stories like this, and this led us to use haptic alerts, vibrations delivered to your smartwatch, to notify users about important sounds in their home. So a few months ago we launched Sound Notifications. It tells you when it hears sounds around your home, like sirens, dogs barking, babies crying, water running, or if somebody knocks on your door. We also open-sourced Live Transcribe's Android app engine, so developers can customize it for their own use cases by integrating it within a bigger existing app or porting it to other platforms.
Another such experiment of using ML for accessibility is Project Shuwa. Shuwa, which means sign language in Japanese, is a project centered around sign language detection and understanding. Together with the Nippon Foundation and the Chinese University of Hong Kong, we created a web game for people to learn a bit of Japanese and Hong Kong sign languages. And since there are over 150 different sign languages in the world, including American Sign Language, Indian Sign Language, and many, many more, to better help developers create their
own gesture and sign language understanding systems, we have open-sourced a gesture detection toolkit for developers. You can check out demos of Project Shuwa at the I/O sandbox.
Thank you for watching. Just as Dimitri and our team prototyped Live Transcribe and its extensions with many existing bits and pieces, we welcome developers to experiment with such different machine learning technologies to help the world become more accessible. Please reach out if you'd like to learn more.
[Music]
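The precision and recall trade-off described in the Lookout case study can be illustrated with a small sketch. Everything here is hypothetical: the function name, the confidence scores, the labels, and the thresholds are made up for illustration and are not Lookout's data or implementation. The sketch assumes a detector that outputs a confidence score and says "apple" only when the score reaches a threshold, so raising the threshold makes it more cautious and lowering it makes it more sensitive.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) for a detector that says "apple"
    whenever its confidence score is at or above `threshold`.

    scores: hypothetical confidence scores, one per item shown
    labels: True if the item really is an apple, False otherwise
    """
    tp = fp = fn = 0
    for score, is_apple in zip(scores, labels):
        says_apple = score >= threshold
        if says_apple and is_apple:
            tp += 1          # correct "apple"
        elif says_apple and not is_apple:
            fp += 1          # false positive: said "apple" for an orange
        elif not says_apple and is_apple:
            fn += 1          # false negative: stayed silent on an apple
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Made-up example data: four apples and two oranges.
scores = [0.95, 0.90, 0.70, 0.60, 0.55, 0.40]
labels = [True, True, True, False, True, False]

# Cautious detector: never wrong when it speaks, but misses two apples.
print(precision_recall(scores, labels, threshold=0.9))  # (1.0, 0.5)

# Sensitive detector: finds every apple, but also calls one orange an apple.
print(precision_recall(scores, labels, threshold=0.5))  # (0.8, 1.0)
```

In the talk's terms, the currency case would sit near the cautious setting (say nothing rather than confuse bills), while the furniture case would sit somewhere in between, since an imprecise answer still has value to the user.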
Original Description
Explore three case studies covering Voice Access, Lookout, and Live Transcribe along with Sound Notifications. We also look at the intersection between Google’s machine learning research and Accessibility with takeaways for all Android developers.
Resources:
Voice Access → https://g.co/voiceaccess
Lookout → https://goo.gle/lookout
Google Developers ML Kit → https://goo.gle/3xyC3F6
Speakers: Tom Hume, Scott Adams, Sagar Savla
Watch more:
Google Developers at Google I/O 2021 Playlist → https://goo.gle/io21-GoogleDevelopers
All Google I/O 2021 Technical Sessions → https://goo.gle/io21-technicalsessions
All Google I/O 2021 Sessions → https://goo.gle/io21-allsessions
Subscribe to Google Developers → https://goo.gle/developers
#GoogleIO #Accessibility #ML/AI
product: Cloud - AI and Machine Learning - AI Platform; event: Google I/O 2021; fullname: Tom Hume, Scott Adams, Sagar Savla; re_ty: Premiere;