
The Alexa Conference 2019: From Phase One to Phase Two


A Meeting of the Minds

Last week, I made my annual trek to Chattanooga, Tennessee to gather with a wide variety of Voice technology enthusiasts at the Alexa Conference. Along with the seismic growth of smart speaker and voice assistant adoption, attendance grew dramatically too, from roughly 200 people last year to more than 600 this year. We outgrew last year’s venue, the very endearing Chattanooga Public Library, and moved to the city’s Marriott convention center. The conference’s growth was accompanied by an exhibit hall and sponsorships from entities as large as Amazon itself. We even had a startup competition among five startups, which my guest, Larry Guterman, won with his amazing Sonic Cloud technology.

In other words, this year felt like the Alexa Conference took a huge step forward. Cheers to Bradley Metrock and his team for building this conference from scratch into what it has become today and for bringing the community together. That’s what makes this conference so cool; it has a very communal feel to it. My favorite part is just getting to know all the different attendees and understanding what everyone is working on.

This Year’s Theme

Phase One

Bret Kinsella, editor of Voicebot.ai, the de facto news source for all things Voice, presented the idea that we’ve moved into phase two of the technology. Phase one of Voice was all about introducing the technology to the masses and then increasing adoption and overall access to the technology. You could argue that this phase started in 2011 when Siri was introduced, but the bulk of the progress of phase one came post-2014, when Amazon rolled out the first Echo and introduced Alexa.

Smart Speaker Adoption Rate by Activate Research

Since then, we’ve seen Google enter the arena in a very considerable way, culminating in the recent announcement that Google Assistant would be enabled on one billion devices. We also saw smart speaker sales soar, ultimately representing the fastest adoption of any consumer technology product ever. If the name of the game for phase one was introducing the technology and growing the user base, then I’d say mission accomplished. On to the next phase of Voice.

Phase Two

According to Bret, phase two is about a wider variety of access (new devices), new segments that smart assistants are moving into, and increasing the frequency with which people use the technology. This next phase will revolve around habituation and specialization.

Voice assistant share, from Bret Kinsella’s talk at the Alexa Conference 2019

In a lot of different ways, the car is the embodiment of phase two. The car already represents the second most frequently accessed type of device, behind only the smartphone, but it offers a massive pool of untapped access points through integrations and newer-model cars with smart assistants built into the console. It’s a perfect environment for a voice interface, as we need to be hands- and eyes-free while driving. Finally, from a habituation standpoint, the car, like the smart speaker, will serve as “training wheels” for people to get used to the technology as they build the habit.

A number of panelists in the breakout sessions and general attendees helped open my eyes to some of the unique ways that education, healthcare, business, and hospitality (among other areas) are all going to yield interesting integrations and contributions during this second phase. All of these segments offer new areas for specialization and opportunities for people to increasingly build the habit and get comfortable using smart assistants.

The Communal Phase Two

Metaphorically speaking, this year’s show felt like a transition from phase one to phase two as well. As I already mentioned, the conference itself grew up, but so have all of the companies and concepts that were first emerging last year. Last year, we saw the first Alexa-driven, interactive content companies like Select a Story and Tellables starting to surface, which helped shine a light on what the future of storytelling might look like in this new medium.

This year we had the founder of Atari, Nolan Bushnell, delivering a keynote talk on the projects he and his colleague, Zai Ortiz, are building at their company, X2 Games. One of the main projects, St. Noire, is an interactive murder-mystery board game that fuses Netflix-quality video content for your character (through an app on a TV) with decision points that the players resolve through a smart speaker. The players’ decisions ultimately shape the trajectory of the game and determine whether they progress far enough to solve the mystery. It was a phenomenal demo of a product that certainly made me think, “wow, this interactive storytelling concept sure is maturing fast.”

Witlingo now has a serious product on its hands with Castlingo (micro-Alexa content generated by the user). While podcasts represent long-form audio content akin to blogging, there seems to be a gap to fill for micro-form audio content creation more akin to tweeting. I’m not sure if this gap will ultimately be filled by something like Castlingo or Flash Briefings, but it would be awesome if a company like Witlingo emerged as the Twitter for audio.

Companies like SoundHound continue to give me hope that white-label assistant offerings will thrive in the future, especially as brands will want their assistants to carry their own identity rather than something bland and generic. Katie McMahon‘s demos of Hound never cease to amaze me either, and its newest feature, Query Glue, demonstrates the most advanced conversational AI I’ve seen to date.

Magic + Co’s presence at the show indicated that digital agencies are beginning to take Voice very seriously and will be at the forefront of the creative ways brands and retailers integrate and use smart assistants and VUI. We also had folks from Vayner Media at this year’s conference, which was just another example that some of the most cutting-edge agencies are thinking deeply about Voice.

Finally, there seemed to be a transition to a higher phase on an individual level too. Brian Roemmele, the man who coined the term #VoiceFirst, continues to peel back the curtain on what he believes the long-term future of Voice looks like (check out his podcast interview with Bret Kinsella). Teri Fisher seemed to be on just about every panel and was teaching everyone how to produce different types of audio content. For example, he provided a workshop on how to create a Flash Briefing, which makes me believe we’ll see a lot of people from the show begin making their own audio content (myself included!).

The role of hearables, from my presentation at the Alexa Conference 2019

From a personal standpoint, I guess I’ve entered into my own phase two as well. Last year I attended the conference on a hunch that this technology would eventually impact my company and the industry I work in, and after realizing my hunch was right, I decided that I needed to start contributing in the area of expertise that I know best: hearables.

This year, I was really fortunate to have the opportunity to present the research I’ve been compiling and writing about on why I believe hearables play a critical role in a VoiceFirst future. I went from sitting in a chair last year, watching and admiring people like Brian, Bret and Katie McMahon share their expertise, to sharing some of my own knowledge this year with those same people, which was one of the coolest moments of my professional career. (Stay tuned, as I will be releasing my 45-minute talk as a series of blog posts in which I break down each aspect of my presentation.)

For those of you reading this piece who haven’t been able to make it to this show but suspect it might be valuable, my advice is to just go. You’ll be amazed at how inclusive and communal the vibe is, and I bet you’ll walk away thinking differently about your own role and your business’s role as we enter the 2020s. If you do decide to go, be sure to reach out, as I will certainly be in attendance next year and in the years beyond.

-Thanks for Reading-

Dave

 

 


The State of Smart Assistants + Healthcare

 

Last week, I was fortunate to travel to Boston to attend the Voice of Healthcare Summit at Harvard Medical School. My motivation for attending this conference was to better understand how smart assistants are currently being implemented across the various segments of our healthcare system and to learn what’s on the horizon in the coming years. If you’ve been following my blog or Twitter feed, then you’ll know that I envision a near-term future where smart assistants become integrated into our in-the-ear devices (both hearables and Bluetooth hearing aids). Once that integration becomes commonplace, I imagine we’ll see a number of really interesting and unique health-specific use cases that leverage the combination of the smartphone, sensors embedded in the in-the-ear device, and smart assistants.

 

Bradley Metrock, Matt Cybulsky and the rest of the summit team that put on this event truly knocked it out of the park, as the speakers and attendees represented a wide array of backgrounds and perspectives, which resulted in some very interesting talks and discussions. Based on what I gathered from the summit, smart assistants will yield different types of value to three groups: patients, remote caregivers, and clinicians and their staff.

Patients

At this point in time, none of our mainstream smart assistants are HIPAA-compliant, which limits the types of skills and actions that can be developed specifically for healthcare. Companies like Orbita are working around this limitation by essentially taking the same building blocks required to create voice skills and then building secure voice skills from scratch on their own platforms. Developers who want to create skills/actions for Alexa or Google that use HIPAA data, however, will have to wait until the smart assistant platforms become HIPAA-compliant, which could happen this year or next.

It’s easy to imagine the upside that will come with HIPAA-compliant assistants, as that would allow the smart assistant to retrieve one’s medical data. If I had a chronic condition that required me to take five separate medications, Alexa could audibly remind me to take each of the five, by name, and respond to any questions I might have about any of them. If I tell Alexa about a side effect I’m having, Alexa might even be able to identify which of the five medications could be causing that side effect and loop in my physician for her input. As Brian Roemmele has pointed out repeatedly, the future of our smart assistants runs through each of our own personalized, contextual information, and until these assistants are HIPAA-compliant, they have to operate at a general level rather than a personalized one.
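
To make that concrete, here’s a minimal sketch (in Python, with invented medication names, schedules and side effects) of the kind of personalized logic a HIPAA-compliant skill could run once it can actually see a patient’s prescription list. It only illustrates the idea; it’s not any platform’s real API.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Medication:
    name: str
    schedule: str                  # e.g. "8:00 AM daily"
    known_side_effects: List[str]

# Invented example data -- a real skill would pull this from the patient's record.
PATIENT_MEDS = [
    Medication("Medication A", "8:00 AM daily", ["dizziness", "nausea"]),
    Medication("Medication B", "9:00 PM daily", ["headache"]),
]

def reminder_prompts(meds):
    """Build one spoken reminder per medication, by name."""
    return [f"It's time to take {m.name}, scheduled for {m.schedule}." for m in meds]

def meds_matching_side_effect(symptom, meds):
    """Return medications whose known side effects include the reported symptom,
    so the assistant can flag them and loop in the physician."""
    return [m.name for m in meds if symptom.lower() in m.known_side_effects]

if __name__ == "__main__":
    for prompt in reminder_prompts(PATIENT_MEDS):
        print(prompt)
    print(meds_matching_side_effect("Dizziness", PATIENT_MEDS))  # ['Medication A']
```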

That’s not to say there isn’t value in generalized skills, or in skills that rely only on data outside the HIPAA umbrella and can therefore still be personalized. Devin Nadar from Boston Children’s Hospital walked us through their KidsMD skill, which allows parents to ask general questions about their children’s illness, recovery, symptoms, etc., and have the peace of mind that the answers they’re receiving are sourced and vetted by Boston Children’s Hospital; it’s not just random responses retrieved from the internet. Cigna’s Rowena Track showed how their skill allows you to check things such as your HSA balance or urgent care wait times.

Caregivers and “Care Assistants”

By 2029, 18% of Americans will be over the age of 65, and the average US life expectancy is already climbing above 80. Those numbers will likely continue to climb, which brings us to the question, “how are we going to take care of our aging population?” As Laurie Orlov, industry analyst and writer of the popular Aging In Place blog, so eloquently stated during her talk, “The beneficiaries of smart assistants will be disabled and elderly people…and everyone else.” So, based on that sentiment and the rising demand to support our aging population, enter what John Loughnane of CCA described as “care assistants.”

From Laurie Orlov’s “Technology for Older Adults: 2018 Voice First — What’s Now and Next” presentation at the VOH Summit 2018

As Laurie’s slide above illustrates, smart assistants, or “care assistants” in this scenario, help to triangulate the relationship between the doctor, the patient and those who are taking care of the patient, whether caregivers or family. These “care assistants” can effectively be programmed with helpful responses around medication cadence, what the patient can or can’t do and for how long they’re restricted, what they can eat, and when and how to change bandages. In essence, the “care assistant” serves as an extension of the caregiver and the trust they provide, allowing for more self-sufficiency and therefore less of a burden on the caregiver.

As I have written about before, the beauty of smart assistants is that even today, in their infancy and primitive state, they can empower disabled and elderly people in ways that no previous interface has. This matters from a fiscal standpoint too, as Nate Treloar, President of Orbita, pointed out that social isolation costs Medicare $6.7 billion per year. Smart assistants act as a tether to our collective social fabric for these groups, and multiple doctors at the summit cited disabled or elderly patients who described their experience of using a smart assistant as “life changing.” What might seem trivial to you or me, like being able to send a message with your voice, might be truly groundbreaking to someone who has never had that type of control.

The Clinician and the System

The last group that stands to gain from this integration is the doctor and those working in the healthcare system. According to the Annals of Internal Medicine, for every hour that a physician spends with a patient, they must spend two hours on related administrative work. That’s terribly inefficient and something that I’m sure drives physicians insane. The drudgery of clerical work seems ripe for smart assistants to streamline: dictating notes, quickly retrieving past medical information, sharing that information across systems, and so on. Less time doing clerical work, more time helping people.

Boston Children’s Hospital uses an internal system called ALICE, and by layering voice onto it, admins, nurses and other staff can very quickly retrieve vital information such as:

  • “Who is the respiratory therapist for bed 5?”
  • “Which beds are free on the unit?”
  • “What’s the phone number of the MSICU Pharmacist?”
  • “Who is the Neuro-surgery attending?”

And boom, you quickly get the answer to any of these. That’s removing friction in a setting where time might really be of the essence. As Dr. Teri Fisher, host of the VoiceFirst Health podcast, pointed out during his presentation, our smart assistants can be used to reduce the strain on the overall system by playing the role of triage nurse, admin assistant, healthcare guide and so on.
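
The underlying pattern is simple: a recognized intent plus a couple of slots mapped onto an internal directory lookup. Here’s a hypothetical Python sketch of that lookup step; the intent names, slots and data are all invented for illustration, since the real ALICE interfaces aren’t public.

```python
# Invented example data -- stand-ins for an internal staffing directory.
UNIT_DIRECTORY = {
    ("respiratory_therapist", "bed 5"): "Jordan Lee",
    ("pharmacist", "MSICU"): "ext. 4127",
    ("attending", "neurosurgery"): "Dr. Alvarez",
}
FREE_BEDS = {"bed 3", "bed 7"}

def handle_intent(intent, slots):
    """Map a recognized voice intent plus slots to the answer a nurse or admin needs."""
    if intent == "WhoIsOnCall":
        person = UNIT_DIRECTORY.get((slots["role"], slots["location"]))
        return (f"{slots['role'].replace('_', ' ').title()} for {slots['location']}: {person}"
                if person else "I couldn't find that assignment.")
    if intent == "FreeBeds":
        return "Free beds on the unit: " + ", ".join(sorted(FREE_BEDS))
    return "Sorry, I can't help with that yet."

print(handle_intent("WhoIsOnCall", {"role": "respiratory_therapist", "location": "bed 5"}))
# Respiratory Therapist for bed 5: Jordan Lee
print(handle_intent("FreeBeds", {}))
# Free beds on the unit: bed 3, bed 7
```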

What Lies Ahead

It’s always important with smart assistants and Voice to simultaneously temper current expectations while remaining optimistic about the future. Jeff Bezos joked in 2016 that “not only are we in the first inning of this technology, we might even be at the first batter.” It’s early, but as Bret Kinsella of Voicebot displayed during his talk, smart speakers represent the fastest adoption of any consumer technology product ever:

Fastest adoption chart, from Bret Kinsella’s “Voice Assistant Market Adoption” presentation at the VOH Summit 2018

The same goes for how smart assistants are being integrated into our healthcare system. Much like Bezos’ joke, very little of this is even HIPAA-compliant yet. With that being said, you still have companies and hospitals the size of Cigna and Boston Children’s Hospital putting forth resources to start building out their offerings for an impending VoiceFirst world. We might not be able to offer true, personalized engagement with the assistant yet, but there’s still a lot of value that can be derived at the general level.

As this space matures, so too will the degree to which we can unlock efficiencies within our healthcare system across the board. Patients of all ages and medical conditions will be more empowered to receive information, prompts and reminders to better manage their conditions. That means those taking care of the patients are less burdened too, as they can offload the information aspect of their caregiving to the “care assistant.” This in turn frees up the system as a whole, as there are fewer general inquiries (and down the line, fewer personal ones), meaning fewer patients need to come in because more can be served at home. Finally, clinicians can be more efficient too, as they can offload clerical work to the assistant, better retrieve data and information on a patient-to-patient basis, and communicate more efficiently with their patients, even remotely.

As smart assistants become more integral to our healthcare system, my belief is that on-body access to the assistant will be desired. Patients, caregivers, clinicians and medical staff all have their own reasons for wanting their assistant right there with them at all times. What better place than a discreet in-the-ear device that allows for one-to-one communication with the assistant?

-Thanks for Reading-

Dave


The Alexa Conference Blew my Mind


Last Thursday, I was fortunate to have the opportunity to travel to Chattanooga, TN to attend the second annual Alexa Conference and join a group of some of the smartest people working on Voice technology. The cool thing about the Alexa Conference is that it’s not sponsored by Amazon (or Google or any other major tech company); it’s fully independent, sponsored by third parties, and therefore it truly feels objective and unbiased. The attendees and speakers ranged from third-party “skill” agencies, skill developers (domestic and international), certified Alexa champions, skill analytics and diagnostics providers, a representative from the FTC, and insurance and healthcare reps, to futurists, Internet of Things specialists, digital transformation experts, behavioral economists, doctors, PhD scientists, former NASA employees, and a random dude from the Audiology industry who writes a blog called FuturEar.

For the past few years, I have been following the #VoiceFirst movement, which includes progress in the voice user interface (VUI), the devices that house our smart assistants (smart speakers, smartphones and wearable technology), the devices that work in conjunction with and respond to smart assistants (the Internet of Things), and our smart assistants as a whole. I think I may have learned more in the 48 hours that I attended this conference than I have in the thousands of hours leading up to it. Ok, that’s probably some hyperbole, but there was a ton of insight, and these were my favorite takeaways from the show:

Context is King

One of the big questions that I had heading into Chattanooga was, “how do we take this all to the next level?” I now have the answer and it all derives from context. Deep, personalized contextual awareness. What does that mean? Well, for starters, let’s establish that smart assistants feed and grow stronger on “personal data.” The only way that these assistants ever get any more useful or “smarter” is by learning more about us.

Bradley Metrock interviewing Brian Roemmele

A really good way to think about this is through the lens of ordering your favorite pizza. My favorite pizza (shoutout Imo’s) is probably pretty different from your favorite pizza. The web, as we know it, is built on a pay-per-click model, so when I search Google for pizza options around me, the results show advertised options at the top. Those aren’t personalized to me in any way, and therein lies the big difference. If I’ve ordered pizza 20 times through my smart assistant, 15 of which have been Imo’s, and then I’m in Chattanooga for work (where Imo’s does not exist) and I want to order a pizza, my smart assistant will offer me options in Chattanooga that are similar to Imo’s. The assistant knows my preferences and will actively distill the options to cater to them.
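
Here’s a rough Python sketch of the kind of preference-weighted ranking I’m describing. The order history, restaurant names and the “similarity” rule are all invented for illustration; a real assistant would obviously use far richer signals.

```python
from collections import Counter

# Invented history and options -- just enough to show the ranking idea.
ORDER_HISTORY = ["Imo's"] * 15 + ["Other Pizza Co."] * 5
STYLE_OF = {"Imo's": "st_louis_thin_crust", "Other Pizza Co.": "deep_dish"}

CHATTANOOGA_OPTIONS = [
    {"name": "Generic Chain", "style": "hand_tossed"},
    {"name": "Lookout Deep Dish", "style": "deep_dish"},
    {"name": "River City Thin Crust", "style": "st_louis_thin_crust"},
]

def preferred_style(history):
    """Infer the pizza style ordered most often."""
    return Counter(STYLE_OF[name] for name in history).most_common(1)[0][0]

def rank_options(options, history):
    """Put options matching the inferred preference first,
    rather than whoever paid for the top ad slot."""
    fav = preferred_style(history)
    return sorted(options, key=lambda o: o["style"] != fav)

print([o["name"] for o in rank_options(CHATTANOOGA_OPTIONS, ORDER_HISTORY)])
# ['River City Thin Crust', 'Generic Chain', 'Lookout Deep Dish']
```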

Taking it one step further, think about all the other personal information that you share, or that is shared with you, which could broaden the assistant’s contextual awareness. If your friends have been to Chattanooga and raved about a pizza spot on Instagram months or years ago, your smart assistant could retrieve that and factor it into your results. So now the results aren’t based solely on your own pizza preferences, but also on other variables such as your friends’ experiences and preferences.

Dominic Meissner and Tim Kahle of 169 Labs

This begins to bring privacy and security front and center. One of the really interesting presentations came from the German team at 169 Labs. While the attitude around privacy here in the US is pretty lax and apathetic, it’s important to understand that many Europeans feel quite differently. They take their privacy far more seriously, and it’s a top-of-mind issue that permeates any tech discussion. Privacy will continue to be a topic of discussion as our smart assistants evolve and we become increasingly aware of just how much data we are sharing. I believe the pros outweigh the cons when it comes to sharing your personal data with your smart assistant(s), but the key is going to be feeling safe that it is all encrypted and protected from being hacked.

The Beginnings of Conversational Interfaces

One of the more frustrating aspects of smart speakers and smart assistants is the lack of continuity. Currently, our smart assistants function more or less by executing single commands or answering single questions. There isn’t really any dialogue; it’s typically “Alexa, shuffle my playlist” or “Alexa, set a timer for 15 minutes” or “Alexa, what’s the weather?” or “Alexa, how many ounces are in a gallon?” Asking a question or issuing a command and having the device turn off afterward is not the goal for our smart assistants. Brian Roemmele compared this level of sophistication to the command line in the PC era. It’s super primitive, and we’re in the first inning of a double-header with this technology.

Instead, what we need is, again, contextual awareness in order to have a dialogue. Katie McMahon of SoundHound did an awesome job demoing SoundHound’s own smart assistant, Hound, with some real contextual awareness:

She starts off by saying, “Show me Asian restaurants, excluding Japanese and Chinese, that are open right now.” It’s an accomplishment in itself that Hound answered that so quickly and accurately. Then she goes on to further refine the search: “OK Hound, show those with outdoor seating.” The key word there is “those,” as the assistant is now aware of the context because it recognizes that “those” is a demonstrative pronoun referring to the Asian restaurants from the previous query. This is HUGE! In a dialogue with another person, you’re constantly using pronouns and language that reference context from earlier in the conversation or from conversations prior. It’s an essential part of how we communicate, and assistants like Hound are already demonstrating that they’re more than capable of this type of complex contextual awareness. Without this ability, I doubt smart assistants will ever be taken that seriously.

Next, she goes one step further. Katie says, “I need an Uber to the first one.” Not only does the assistant recognize that “the first one” refers to the first result from the previous search, but it’s capable of using multiple “domains,” or skills, in conjunction. This is a significant step forward on something we’re pretty limited with today. How many people would have been able to accomplish all of that in one app on their phone? Most likely, you’d use multiple apps: Yelp to find the spot, then grab the address and pop it into Uber. And if your assistant is factoring in more data for a more personalized result list, say by retrieving your friends’ Instagram and/or Swarm posts, compare that to you going into each app and hunting for the relevant posts yourself. This is clearly an improvement in time and efficiency.
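
Conceptually, what makes “those” and “the first one” work is that the assistant keeps the previous result set around as dialogue state and resolves follow-up references against it before handing off to another domain. Hound’s actual implementation isn’t public, so here’s only a toy Python sketch of that idea, with made-up restaurants and a stand-in ride request.

```python
class DialogueState:
    """Toy dialogue state that remembers the last result set between turns."""

    def __init__(self):
        self.last_results = []

    def search_restaurants(self, cuisine_in, cuisine_out):
        # Stand-in for a real restaurant search domain.
        catalog = [
            {"name": "Thai Basil", "cuisine": "thai", "outdoor_seating": True},
            {"name": "Seoul Kitchen", "cuisine": "korean", "outdoor_seating": False},
            {"name": "Tokyo Sushi", "cuisine": "japanese", "outdoor_seating": True},
        ]
        self.last_results = [r for r in catalog
                             if r["cuisine"] in cuisine_in and r["cuisine"] not in cuisine_out]
        return self.last_results

    def refine_those(self, **filters):
        # "Show those with outdoor seating" -> filter the remembered results.
        self.last_results = [r for r in self.last_results
                             if all(r.get(k) == v for k, v in filters.items())]
        return self.last_results

    def request_ride(self, ordinal):
        # "I need an Uber to the first one" -> resolve the ordinal, then switch domains.
        destination = self.last_results[ordinal]
        return f"Requesting a ride to {destination['name']}"

state = DialogueState()
state.search_restaurants(cuisine_in={"thai", "korean"}, cuisine_out={"japanese", "chinese"})
state.refine_those(outdoor_seating=True)
print(state.request_ride(0))  # Requesting a ride to Thai Basil
```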

It’s honestly amazing what’s going on over at SoundHound and the ability Hound has to retain information throughout a dialogue. It was the first time I really saw this level of continuity and contextual awareness in a smart assistant, and it made me very optimistic about the potential of smart assistants once they’re able to have this type of dialogue. When I wrote about the key to the adoption of a new user interface being the reduction of friction, this is ultimately what I was referring to. Even a primitive conversational interface would dramatically reduce the time we currently spend toggling between apps, tapping and searching on our phones for many things. We’re moving far, far beyond just using assistants to set timers.

(Here’s a link to another video of Katie asking Hound for hypothetical monthly mortgage payments based on home values, interest rates and down payment percentages. I was blown away by these demos and have been using Hound since I got back. It’s encouraging to see that all of this innovation is not limited to just Amazon, Google, Facebook and Apple.)

Proactive & Personalized

Another huge advantage of feeding your smart assistant personal data is that it can begin to proactively engage you with what it’s finding. This is where the Internet of Things (IoT) begins to get really interesting. If you have a smart fridge and your smart assistant is accessing that data, it knows things like when your fruit has spoiled. Couple that with the possibility that you’ve given your assistant access to all of your calendar data, so it knows that you have a dinner party that night and that you’ve, in some way or another, logged that you need the fruit for your recipe for said dinner party. So, what we’re moving toward is a scenario where the smart assistant in your car or hearable pings you and says something along the lines of, “Hey Dave, it looks like your blackberries have gone bad and you need them for your dessert recipe for Julie’s dinner party tonight. It looks like they’re currently on sale at Mariano’s down the street, would you like to go?” and then navigates you to the grocery store.
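
Strung together, that flow is just a few data sources cross-referenced by one rule: only speak up when a spoiled item is actually needed for an upcoming event. Here’s a hypothetical Python sketch of that logic; the device data, calendar entries and grocery lookup are all invented for illustration.

```python
from datetime import date
from typing import Optional

# Invented device and calendar data -- just enough to illustrate the flow.
FRIDGE_INVENTORY = [
    {"item": "blackberries", "spoiled": True},
    {"item": "milk", "spoiled": False},
]
CALENDAR = [
    {"event": "Julie's dinner party", "date": date.today(),
     "recipe_ingredients": ["blackberries", "cream"]},
]

def nearby_availability(item: str) -> Optional[str]:
    # Stand-in for a real grocery/price lookup.
    return "Mariano's down the street" if item == "blackberries" else None

def proactive_suggestions():
    """Only speak up when a spoiled item is actually needed for a same-day event."""
    suggestions = []
    spoiled = {f["item"] for f in FRIDGE_INVENTORY if f["spoiled"]}
    for event in CALENDAR:
        if event["date"] != date.today():
            continue
        for ingredient in event["recipe_ingredients"]:
            if ingredient in spoiled:
                store = nearby_availability(ingredient) or "a nearby store"
                suggestions.append(
                    f"Hey, your {ingredient} have gone bad and you need them for "
                    f"{event['event']} tonight. They're available at {store}. Want directions?"
                )
    return suggestions

for suggestion in proactive_suggestions():
    print(suggestion)
```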

This was a big aha! moment for me. So much of #VoiceFirst is happening in disparate areas that it’s hard to bring all of it together into one specific use case like the one I just described. When these devices are siloed off on their own, you hear “smart fridge” and you think, “Really? What’s the point of that?” But when you start looking at all of these IoT devices as data entry points for your smart assistant, data which your assistant can actively retrieve, assess and turn into actionable insight (all very quickly), it becomes a whole lot more compelling. This is “small data” as opposed to big data. It’s personal to each of us and therefore invaluable to each of us. This opens the door to serendipity, with assistants proactively providing you with suggestions and reminders from data that you likely aren’t even aware of.

Some Other Takeaways

  • Brian Roemmele harped a lot on the idea of “enabling creatives to enter into the fold.” He used the analogy of Steve Jobs empowering graphic designers with the iPhone, as the iPhone created a massive abundance of opportunity for that profession. The same will be done with Voice for many more creative types, including comedians, poets, psychologists, storytellers, artists, historians, writers, etc. Therefore we need a set of tools that is easy enough for anyone to use and create with.
  • VoiceXP demonstrated a number of unique skills built specifically for the Echo Show. I also appreciated that Bob Stolzberg really emphasized that the Echo Show is version one of Amazon’s multi-modal strategy. We’re only scratching the surface of what’s possible when you add screens and mixed modality into the #VoiceFirst equation. There are some really exciting opportunities around this.
Mark Tucker and Bob Stolzberg of VoiceXP
  • Keynote speaker Ahmed Bouzid presented a plethora of fascinating facts and charts, but what stood out to me most were two demographics that can benefit the most from a #VoiceFirst world: seniors and the physically incapacitated. This is at the heart of why I’m so passionate about spreading awareness of #VoiceFirst to the #audpeeps and the audiology and hearing aid industry. Smart assistant integration is coming to hearing aids, and this new use case for hearing aids, hearables and all our ear-computers stands to really benefit those who struggle in a mobile world. Mobile computing is not conducive to these two demographics, and I’m ecstatic about the possibility that they will soon be empowered in a way they never have been before. It’s an awesome value-add that you can advocate for at no cost, and it will dramatically improve the patient experience over time.

This was an amazing two days, and I’ll definitely be back next year to continue to gather everything I can about what’s happening in the #VoiceFirst world and how it will ultimately impact those of you who work with the little computers that go in the ear.

-Thanks for reading-

Dave