Google’s Accessibility Initiatives
Google held its annual developer conference, Google I/O, yesterday, and made a flurry of announcements that I will touch upon in later updates. Today, I want to focus on two of the initiatives Google announced around accessibility.
Google has introduced a new feature in its upcoming Android operating system, Q, called Live Caption (I think they were using the name “Live Relay” interchangeably too). The feature is due out later this year, and according to CEO Sundar Pichai, “Live Caption makes all content, no matter its origin, more accessible to everyone. You can turn on captions for a web video, podcast, even on a video shot on your phone.”
Being able to caption virtually any video on an Android phone running Q will be hugely valuable to the Deaf and hard-of-hearing community. It’s also really convenient for anyone who wants to watch a video without playing the audio. A shout out to KR Liu for her cameo in the video and her collaboration with Google in bringing this feature to life! She and the folks at Doppler Labs were pioneers in the hearables space, and it should come as no surprise when Doppler alumni pop up here and there with contributions like this. Amazing stuff.
Project Euphonia is another initiative in which Google is using its machine learning technology to train its speech recognition systems for people who have speech impairments. Google is training this particular speech recognition model with recordings from people who have had strokes, have Multiple Sclerosis or stutters, or have other impairments, such as the individual in the video, Dimitri Kanevsky, a research scientist at Google who has a speech impairment of his own.
Dimitri alone has recorded 15,000 phrases to help the model better understand speech that isn’t traditionally represented in training data. According to Dimitri, his goal, and Google’s, with Euphonia is to “make all voice interactive devices be able to understand any person speaking to it.” This is really important work, as it will be crucial to ensuring that the #VoiceFirst world we’re trending toward is as inclusive of as many people as possible.
In addition, this project aims to bring those who cannot speak into the fold as well. It is creating models that can be trained by people with ALS themselves to recognize facial cues or non-speech utterances (like grunts and hums), which then trigger sounds from companion computers, such as a cheer or a boo. As Dimitri points out, to understand and be understood is absolutely unbelievable.
This is tech for good. Apple’s been doing a lot of great work around accessibility too, and in light of all the tech-backlash, if these companies want to compete for positive PR by re-purposing their technology to empower those who need it most…well, then that’s fine by me!
-Thanks for Reading-