hearables, Smart assistants, VoiceFirst

10 Years after the App Store, what’s Next?

What’s Next?

As we celebrate the 10-year anniversary of the App Store this week, it seems only natural to begin wondering what the next 10 years will look like. What modalities, devices, interfaces and platforms will rise to the top of our collective preferences? Plenty of buzzwords thrown around these days hint at where things may go, but the area that I want to focus on is the Voice interface. This includes smart assistants and all the devices they’re housed in.

Gartner’s L2 recently published the chart below, which might seem to pour cold water on the momentum that has been touted around the whole Voice movement:

Before I go into why this chart probably doesn’t matter in the grand scheme of things, there were some solid responses as to why these trend lines are so disparate. Here’s what Katie McMahon, the VP of SoundHound, had to say:

Katie McMahon Tweet

One of the primary reasons the app economy took off was two-sided network effects, predicated on developer buy-in that was driven by a huge monetary incentive. Of course there was an explosion of new applications and things you could do with your smartphone, as there was a ton of money to be made developing those apps. This was a modern-day gold rush. The same financial incentive around developing voice skills doesn’t yet exist.

Here’s a good point Chris Geison, senior Voice UX researcher at AnswerLab, made around one of the current, major pitfalls of Voice skill/action discovery:

Chris Geison Tweet

So, based on Chris and AnswerLab’s research, a third of users don’t really know that an “app-like economy” exists for their smart speakers. That’s rather startling, given that Voicebot reported at the end of June that there are now 50 million smart speaker users in the US. Is it really possible that tens of millions of people don’t fully understand the capabilities and the companion ecosystem that come with the smart speaker they own? It would appear so, as the majority of users are using their smart speakers for native functionality that doesn’t require a downloaded skill, as illustrated by this awesome chart from Voicebot’s Smart Speaker Consumer Adoption Report 2018:

As you can see from the chart above, only 46.5% of respondents from this survey have used a skill/action.

Jobs to be Done

In order to understand how we move forward and what’s necessary to do so, it’s important to look at how we use our phones today. As I wrote about in a previous post, each computer interface evolution has been a progression of reducing friction, or time spent doing a mechanical task. Today’s dominant consumer interface – mobile – is driven by apps. Apps represent jobs that need doing, whether that be getting us from A to B (maps), filling time when we’re bored (games/social media/video), exercising or relaxing the mind (sudoku/chess/books/music/podcasts), etc. Every single app on your phone is a tool for you to execute the job you’re trying to accomplish.

User Interface Shift
From Brian Roemmele’s Read Multiplex 9/27/17

So, if we’re looking to reduce friction as we enter into a new era of computing interaction, we should note that most of the friction with mobile sits in the mechanical process of pulling out your phone, then digging through and toggling between your apps to get the job done. That mechanical process is the friction that needs to be removed.

Workflow + Siri Shortcuts

I was initially underwhelmed by Apple’s WWDC this year because I felt that Siri had once again been relegated to the backseat of Apple’s agenda, which would be increasingly negligent given how aggressively Amazon, Google and the others have been moving into this area. What I didn’t fully understand was how crucial Apple’s Workflow acquisition back in 2017 was and how it might apply to Siri.

Siri Shortcuts ultimately represent a way in which users can program “shortcuts” between apps, so that they can execute a string of commands together into a “workflow” via a voice command. The real beauty of this is that each shortcut can be made public (hello, developers) and Siri will proactively suggest shortcuts for you based on what Siri learns about your preferences and contextual behavior. Power-users empowering mainstream-users with their shortcuts, as suggested by Siri. Remember, context is king with our smart assistants.

Brian Roemmele expanded on this acquisition and the announced integration of Workflow with Siri on Rene Ritchie’s Vector podcast this week. Brian said something in this podcast that really jumped out at me (~38 min mark):

“Imagine every single app on the app store. Now deconstruct all those apps into Jobs to be Done, or intents, or taxonomies. And then imagine, with something like crayons, you can start connecting these things artistically any way you want… Imagine you can do that without mechanically doing it.”

This cuts right to the core of what I think the foreseeable future looks like. Siri Shortcuts powered by Workflow take the role of those crayons. If we’ve extracted all the utility and jobs that each app represents and put them together into one big pile, we can start to combine elements of different apps to gain efficiency. This to me really screams “removing mechanical friction.” When I can speak one command and have my smart assistant knock out the work I currently do by digging, tapping and toggling through my apps, that’s a significant increase in efficiency.

  • “Start my morning routine” – starts my morning playlist, compares Lyft and Uber and displays the cheaper (or quicker, depending on what I prefer) commute, orders my coffee from Starbucks, and queues up three stories I want to read on my way to work.
  • “When’s a good time to go to DC” – pulls together things like airfare, Airbnb listings, events that might be going on at the time like concerts or sports games surfaced from Ticketmaster/SeatGeek/Songkick, weather trends, etc.
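To make the idea of chaining jobs behind one phrase a little more concrete, here is a minimal Python sketch. It is not how Siri Shortcuts or Workflow are actually implemented; the `Shortcut` and `Assistant` classes, the step functions and the fares are all hypothetical stand-ins for jobs that real apps would expose.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical "jobs to be done"; each stands in for one thing an app does.
def play_morning_playlist() -> str:
    return "Playing morning playlist"


def pick_cheaper_ride(quotes: Dict[str, float]) -> str:
    # Choose the service with the lowest quoted fare.
    service = min(quotes, key=quotes.get)
    return f"Cheapest ride: {service} (${quotes[service]:.2f})"


def order_coffee() -> str:
    return "Coffee ordered"


@dataclass
class Shortcut:
    """One spoken phrase mapped to an ordered list of steps."""
    phrase: str
    steps: List[Callable[[], str]] = field(default_factory=list)

    def run(self) -> List[str]:
        # Run every step in order and collect what each one reports.
        return [step() for step in self.steps]


class Assistant:
    def __init__(self) -> None:
        self._shortcuts: Dict[str, Shortcut] = {}

    def register(self, shortcut: Shortcut) -> None:
        self._shortcuts[shortcut.phrase.lower()] = shortcut

    def hear(self, utterance: str) -> List[str]:
        # Exact phrase match only; a real assistant would do far more here.
        shortcut = self._shortcuts.get(utterance.strip().lower())
        if shortcut is None:
            return ["Sorry, I don't know that shortcut"]
        return shortcut.run()


quotes = {"Lyft": 11.40, "Uber": 12.75}  # made-up fares for illustration
morning = Shortcut(
    phrase="Start my morning routine",
    steps=[play_morning_playlist, lambda: pick_cheaper_ride(quotes), order_coffee],
)

assistant = Assistant()
assistant.register(morning)
results = assistant.hear("start my morning routine")
for line in results:
    print(line)
```

The point of the sketch is the shape: the user says one thing, and the digging, tapping and toggling collapses into a list of steps that runs on their behalf.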

The options are limited only by one’s imagination, and this interface really does begin to resemble a conversational dialogue as the jobs that need to be done become increasingly self-programmed by the smart assistant over time.

All Together Now

Apple isn’t the only one deploying this strategy; Google’s developer conference featured a strikingly similar approach to unbundling apps called Slices and App Actions. It would appear that the theme heading into the next 10 years is to find ways to create efficiencies by leveraging our smart assistants to do the grunt work for us. Amazon’s skill ecosystem is currently plagued by discovery issues as highlighted above, but the recent deployment of CanFulfillIntentRequest for developers will hopefully allow for easier discovery of skills and functionality for mainstream users. The hope is that all the new voice skills and the jobs that they do can be surfaced much more proactively. That’s why I don’t fixate on the number of skills created to this point, because the way in which we effectively access those skills hasn’t really matured yet.

What’s totally uncertain is whether the companies that sit behind the assistants will play nice with each other. In an ideal world, our assistants would specialize in their own domains and work together. It would be nice to be able to use Siri on my phone and have it work with Alexa when I need something from the Amazon empire or want to control an Alexa-based IoT device. It would be great if Siri and Google Assistant communicated in the background so that all my Gmail and calendar context was available for Siri to access.

Access Point

It’s possible that we’ll continue to have “silos” of skills and apps, and therefore silos of contextual data, if the platforms aren’t playing nice together. Regardless, within each platform the great unbundling seems to be underway. As we move towards a truly conversational interface where we’re conversing with our assistants to accomplish our jobs to be done, we then should think about where we’re accessing the assistant.

I’m of the mind that as we depend on our smart assistants more and more, we’ll want access to our assistants at all times. Therefore, I believe that we’ll engage with smart assistants across multiple different devices, but with continuity, all throughout the day. I may be conversing with my assistants in my home via smart speakers or IoT devices, in my car on the way to work, and in my smart-assistant integrated hearables or hearing aids throughout the course of my day while I’m on-the-go.

While the past 10 years were all about consolidating and porting the web to our mobile devices via apps, the next 10 might be about unlocking new efficiencies and further reducing friction by unbundling the apps and allowing our smart assistants to operate in the background, doing our grunt work and accomplishing the jobs we need done. Smartphones and tablets are not going away. On the contrary, it’s how we use them and derive utility from them that will fundamentally change.

-Thanks for Reading-


hearables, Smart assistants, VoiceFirst

The Unexpected #VoiceFirst Power Users

Smart Assistant Power Users

In-The-Ear Assistants

There was a really good post published last week in the #VoiceFirst world by Cathy Pearl, the author of the book, “Designing Voice User Interfaces.” In her post, she goes through some of the positive effects that smart assistants are having on the disabled and elderly communities. The unique and awesome thing about the Voice user interface is that it enables demographic groups that had been left behind by past user interfaces. Due to physical limitations or the deterioration of one’s senses and dexterity, mobile computing (and all prior generations of computing) is not well suited to these groups of people. Additionally, Voice is being adopted by all ages, from young children to elderly folks, in large part due to the fact that there is virtually no learning curve. “Just tell Alexa what you want her to do.”

Cathy’s article dovetails nicely with what I see as being the single biggest value-add that hearing aids and hearables have yet to offer – smart assistant integration. As I wrote about back in January, one of the most exciting announcements this year was Amazon’s Alexa Mobile Accessory Kit (AMAK). This software kit makes it dramatically easier for OEMs, such as hearable and hearing aid manufacturers, to integrate Alexa into their devices.

(I should note that, as of now, “integration” represents a pass-through connection from the phone to the audio device. In the future, as our mini ear-computers become more independent from our phones, so too should we see full smart assistant integration as our audio devices further mature and become more capable as standalone devices.)

AMAK will help accelerate the smart assistant integration that’s already taking place in the hearables market, which now includes AirPods (Siri), Bose QC35 (Google Assistant), Bragi Dash Pro (Siri/Google/Alexa), Jabra Elite 65t (Siri/Google/Alexa), Nuheara IQbuds (Siri/Google) and a handful of others. Hearing aids will soon see this type of integration too. Starkey CTO Achin Bhowmik alluded to being able to activate and engage smart assistants with taps on the hearing aids, verbal cues and head gestures. Given the partnerships between hearing aid and hearable companies (e.g. Starkey and Bragi) and shared ownership (e.g. GN, the parent company of ReSound, also owning Jabra), it seems that we’ll see this integration with all of our new “connected” hearing aids too.

A Convergence of Needs

Convergence Of Needs

For our aging population, there tends to be a convergence of needs. For starters, one out of every three US adults 65 and older has some degree of hearing loss. Add in the fact that since January 2011, 10,000 baby boomers have turned 65 every day. By 2029, 18% of America will be over the age of 65. Our population is living longer, the baby boomers are all surpassing the age of 65, and we’re all being exposed to levels of sound pollution not seen before. Mix that all together and we’re looking at an increasing number of people who could benefit from a hearing aid.

Next, it’s important to consider what happens to our day-to-day tasks that we depend on technology for when a new interface arrives. I mentioned this in a previous post, in which I wrote:

“Just as we unloaded our various tasks from PCs to mobile phones and apps, so too will we unload more and more of what we currently depend on our phones for, to our smart assistants. This shift from typing to talking implies that as we increase our dependency on our smart assistants, so too will we increase our demand for an always-available assistant(s).”

My point was that just about everything you now depend on your phone for – messaging, maps, social media, email, ordering food, ridesharing, checking weather, stock prices, scores, fantasy sports, etc. – will likely manifest itself in some way via Voice. This is a big deal in general, but for our aging and disabled population, this can be truly life-changing.

That was the aha! moment for me reading Cathy’s post. The value proposition for smart assistants is much more compelling at this point for these communities of people compared to someone like myself who has no problem computing via a smartphone. I certainly enjoy using my Alexa devices, and in some instances it might cut down on friction, but there’s nothing that it currently offers that I can’t otherwise do on my phone.

It’s similar to why mobile banking is growing like crazy in places like Kenya and India. For a large portion of people in those countries, there is no legacy, incumbent system from which people need to migrate, unlike here in the US where the vast majority of people have traditional bank accounts. In the same vein, many elderly people and those with physical limitations would not be migrating from existing systems, but rather adopting a new system from scratch that yields entirely new value.

If I’m already a hearing aid candidate or considering a hearable, smart assistant integration makes owning this type of device that much more compelling. Even in their current crude, primitive state, smart assistants provide brand new functionality and value for those who struggle to use a smartphone. There’s an unmet need in these communities to connect and empower oneself via the internet, and smart assistants supply a solution.

The Use Cases of Today


Building off this idea that we’re just shifting tasks to a new interface, let’s consider messaging. As Cathy highlighted in her post, we’re already seeing some really cool use cases being deployed by assisted living facilities like Front Porch in California, where the facility is outfitting residents with Amazon Echos. The infrastructure is being built out to facilitate audio messaging between residents, staff and residents’ families.

Taking it one step further, if the resident has their smart assistant integrated in their hearing aid, they can seamlessly communicate with fellow residents, staff members and their family members anywhere they want in the assisted living facility. They can also actually hear and understand the assistant responding, since it’s housed directly in the ear. Whereas I prefer to text, audio messaging mediated by Alexa or Siri provides a much more suitable messaging system for these groups.

The Alexa skill, My Life Story, is built specifically for those suffering from Alzheimer’s. It allows for the user’s family members to program “memories” for their loved one, so that Alexa reads back the memories to help trigger their memory. Again, putting this directly in the hearing aid allows for this type of functionality to exist anywhere the user is with their hearing aid, empowering them to be more mobile while remaining tethered to something they may become dependent on. (Reminds me of this scene from the movie “50 First Dates.”)

Another great example of how smart assistants can provide a level of independence for the user is this story describing how a stroke victim uses smart assistants. The victim’s family created routines, such as saying “Alexa, good morning,” which triggers the connected devices in her room to open the blinds, turn the lights to 50%, and turn on the TV. “Alexa, use the bathroom” turns her room’s lights yellow to notify the staff that she needs to use the bathroom. So, while connected light bulbs and TVs might seem excessive or unnecessary for you or me, they serve as tools to help restore another’s dignity.

These are just a few of the many use cases tailored to these communities. They tie in with the broader ones that already exist for the masses, such as ordering an Uber, streaming audio content, checking into flights, answering questions, checking the weather, or the other 30,000+ skills that can be accessed via Alexa.

The Use Cases of Tomorrow

We’re already seeing network effects broadly take hold with smart assistants, and I think it’s fair to say that we’ll see pockets of network effects within specific segments of the total user base too. If there are disproportionately high numbers of users or higher engagement levels within a certain segment of the user base, you can expect software developers to migrate toward creating more functionality for these pockets of users. There’s more money and incentive to cater to the power users.

Where the software development will get really interesting is when the accompanying hardware, in this instance the hearing aids and hearables, matures too. CNET’s Roger Cheng and Shara Tibken dove into what a more technologically mature hearing aid might look like with Starkey’s Achin Bhowmik. In this excerpt, Bhowmik describes the hearing aid’s transformation into a multipurpose device:

“Using AI to turn these into real-time health monitoring devices is an enormous opportunity for the hearing aid to transform itself,” he says.

By the end of this year, Starkey also will be bringing natural language translation capabilities to its hearing aids. And it plans to eventually integrate optical sensing to detect heart rate and oxygen saturation levels in the blood. It’s also looking at ways to noninvasively monitor the glucose levels in a user’s blood and include a thermometer to detect body temperature.

So, the hardware will supply a whole host of new capabilities rife with opportunities for developers to overlay smart assistant functionality on top of. Going back to the idea of there being convergence of needs – if I am 80 years old, have hearing loss, diabetes, and dexterity issues – a hearing aid that provides amplification, monitors glucose levels, and houses a smart assistant that interprets those glucose readings for me and gives me the functionality I currently derive from my iPhone, then that’s a very compelling device.

A single device that serves multiple roles and meets a number of unmet needs simultaneously. Empower these communities with something like that and these groups of people are going to adopt smart assistants en masse. Finally, an all-inclusive tool to connect those on the sidelines to the digital age.

-Thanks for Reading-


News Updates, Smart assistants, VoiceFirst

Smart Assistants Continue to Incrementally Improve


This week we saw a few developments in the #VoiceFirst space that might seem trivial on the surface, but are landmark improvements toward more fully-functional voice assistants. The first was from Amazon with the introduction of “Follow-up mode.” As you can see from Dr. Ahmed Bouzid’s video below, this removes the need to say “Alexa” before each subsequent question that you ask in succession, as there is now a five-second window where the mic stays on (when this setting is activated). I know it seems minor, but this is an important step for making communication with our assistants feel more natural.
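The five-second window is easy to picture as a small piece of state. The Python sketch below is only an illustration under my own assumptions, not Amazon’s implementation: the `FollowUpListener` class and its `accepts` method are hypothetical, and the assistant’s answer is treated as instantaneous to keep things short.

```python
from typing import Optional

FOLLOW_UP_WINDOW_SECONDS = 5.0  # window length as described in the post
WAKE_WORD = "alexa"


class FollowUpListener:
    """Hypothetical model of a wake word plus an optional follow-up window."""

    def __init__(self, follow_up_enabled: bool = True) -> None:
        self.follow_up_enabled = follow_up_enabled
        # Time of the last accepted request; None until the first one.
        self._last_response_at: Optional[float] = None

    def accepts(self, utterance: str, now: float) -> bool:
        """Return True if the utterance should be treated as a request."""
        words = utterance.lower().split()
        has_wake_word = bool(words) and words[0].strip(",.!?") == WAKE_WORD
        in_window = (
            self.follow_up_enabled
            and self._last_response_at is not None
            and now - self._last_response_at <= FOLLOW_UP_WINDOW_SECONDS
        )
        if has_wake_word or in_window:
            # Simplification: the answer is instantaneous, so the window
            # restarts at the moment the request is accepted.
            self._last_response_at = now
            return True
        return False


listener = FollowUpListener()
first = listener.accepts("Alexa, what's the weather?", now=0.0)
second = listener.accepts("And tomorrow?", now=3.0)        # inside the window
third = listener.accepts("What about Friday?", now=10.0)   # window has lapsed
print(first, second, third)
```

With the setting switched off (`FollowUpListener(follow_up_enabled=False)`), only utterances that start with the wake word are accepted, which is the behavior we had before.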

The second, as reported by The Verge, was the introduction of Google’s multi-step smart home routines. These routines are an incremental improvement on the smart home as they allow you to link multiple actions into one command. If I had a bunch of smart appliances all synced to my Google Assistant, I could create a routine built around the voice command, “Honey, I’m homeeee” and have that trigger my Google Assistant to start playing music, turn on my lights, adjust my thermostat, etc. In the morning, I might say, “Rise and shine,” which then starts brewing a cup of coffee and reading my morning briefing, the weather and the traffic report.

This will roll out in waves in terms of which accessories and Android functionality are compatible with routines. Routines make smart home devices that much more compelling for those interested in this type of home setting.

The last piece of news pertains to the extent to which Amazon is investing in its Alexa division. If you recall, Jeff Bezos said in Amazon’s Q4 earnings release in February that Amazon would “double down” on Alexa. Here’s one example of what “doubling down” might entail as Amazon continues to aggressively scale the Alexa division within the company:

As our smart assistants keep taking baby steps in their progression toward being true, personal assistants, it’s becoming increasingly clear that this is one of the biggest arms races among the tech giants.

-Thanks for Reading-



Biometrics, hearables, News Updates, Smart assistants, VoiceFirst

Pondering Apple’s Healthcare Move

Apple red cross

Outside Disruption

There have been a number of recent developments that involve impending moves from non-healthcare companies intending to venture into the healthcare space in some capacity. First, there was the joint announcement from Berkshire Hathaway, JP Morgan and Amazon that they intend to team up to “disrupt healthcare” by creating an independent healthcare company specifically for their collective employees. You have to take notice anytime you have three companies of that magnitude, led by Buffett, Bezos and Dimon, announcing an upcoming joint venture.

Not to be outdone, Apple released a very similar announcement last week stating that, “Apple is launching medical clinics to deliver the world’s best health care experience to its employees.” The new venture, AC Wellness, will start as two clinics near the new “spaceship” corporate office (the one where Apple employees keep walking into the glass walls). Here’s an example of what one of the AC Wellness job postings looks like:

AC Wellness Job Posting
Per Apple’s Job Postings

So in a matter of weeks, we have Amazon, Berkshire Hathaway, JP Morgan and now Apple publicly announcing that they plan to create distinct healthcare offerings for their employees. I don’t know what the three-headed joint venture will ultimately look like, or if either of these two ventures will extend beyond their employees, but I think that there is a trail of crumbs to follow to try and discern what Apple might ultimately be aspiring to.

Using the Past to Predict the Future

If you go back and look at the timeline of some of Apple’s moves over the past four years, this potential move into healthcare seems less and less surprising. Let’s take a look at some of the software and hardware developments over the past few years, and how they might factor into Apple’s healthcare play:

The Software Developer Kits – The Roads and Repositories

Apple Health SDKs

The first major revelation that Apple might be planning something around healthcare was the introduction of the software development kit (SDK), HealthKit, back in 2014. HealthKit allows third-party developers to gather data from various apps on users’ iPhones and then feed that health-based data into Apple’s Health app (a pre-loaded app that comes standard on all iPhones running iOS 8 and above). For example, if you use a third-party fitness app (e.g. Nike+ Run Club), its developers could feed data from that app into Apple’s Health app, so that the user can see it alongside any other health-related data that was gathered. In other words, Apple leveraged third-party developers to make its Health app more robust.

When HealthKit debuted in 2014, it was a bit of a head-scratcher because the type of biometric data you can gather from your phone is very limited and inaccurate. Then Apple introduced its first wearable, the Apple Watch, in 2015, and suddenly HealthKit made a lot more sense as the Apple Watch represented a much more accurate data collector. If your phone is in your pocket all day, you might be able to get a decent pedometer reading of how many steps you’ve taken, but if you’re wearing an Apple Watch, you’ll record much more precise and actionable data, such as your heart rate.

Apple followed up on HealthKit a year later, in 2015, with the introduction of a second SDK, ResearchKit. ResearchKit allowed Apple users to opt into sharing their data with researchers for studies being conducted, providing a massive influx of new participants and data which in turn could yield more comprehensive research. For example, researchers studying asthma developed an app to help track Apple users suffering from asthma. 7,600 people enrolled through the app in a six-month program, which consisted of surveys around how they treated their asthma. Where things got really interesting was when researchers started looking at ancillary data from the devices, such as the geo-location of each user, and pairing it with local data such as the pollen count and heat index to identify any correlations.

Then in 2016, Apple introduced a third SDK called CareKit. This new kit served as an extension to HealthKit that allowed developers to build medically focused apps that track and manage medical care. The framework provides distinct modules for developers to build on, covering common features a patient would use to “care” for their health. Examples include reminders around medication cadences, or objective measurements taken from the device, such as blood pressure readouts. Additionally, CareKit provides easy templates for sharing data (e.g. with a primary care physician), which is what’s really important to note.

These SDKs served as tools to create roads and houses to transfer and store data. In the span of a few years, Apple has turned its Health app into a very robust data repository, while incrementally making it easier to deposit, consolidate, access, build upon, and share health-specific data.

Apple’s Wearable Business – The Data Collectors

Apple watch and airpods

Along with AirPods in 2016, Apple introduced a brand new, proprietary wireless chip for its wearables called the W1 chip. For anyone that has used AirPods, the W1 chip is responsible for the automatic, super-fast pairing to your phone. The current, first-generation AirPods use the W1 chip, while the Apple Watch Series 3 now uses an upgraded W2 chip. Apple claims that the W2 chip is 50% more power efficient and makes Wi-Fi up to 85% faster.

W1 Chip
W1 Chip via The Verge

Due to the size constraints of something as small as AirPods, chip improvements are crucial to the devices becoming more capable, as they allow engineers to allocate more space and power for other things, such as biometric sensors. In an article by Steve Taranovich of Planet Analog, Dr. Steven LeBoeuf, the president of biometric sensor manufacturer Valencell, said, “the ear is the best place on the human body to measure all that is important because of its unique vascular structure to detect heart rate (HR) and respiration rate. Also, the tympanic membrane radiates body heat so that we are able to get accurate body temperature here.”

Renderings of AirPods with biometric sensors included

Apple seems to know this too, as they filed three patents (1, 2 and 3) in 2015 around adding biometric sensors to AirPods. If Apple can fit biometric sensors onto AirPods, then it’s feasible to think hearing aids can support biometric sensors as well. There are indicators that this is already becoming a reality, as Starkey announced an inertial sensor that will be embedded in its next line of hearing aids to detect falls. While the main method of logging biometric data currently resides with wrist-worn wearables, it’s very possible that our hearables will soon serve that role, as the ear is the optimal spot on the body to do so. A brand new use case for our ever-maturing ear computers.

AC Wellness & Nurse Siri

The timing for these AC Wellness clinics makes sense. Apple has had four years to build out the data-level aspect of their offering via the SDKs. They’ve made it easy to both access and share data between apps, while simultaneously making their own Health app more robust. At the same time, they now sell the most popular wearable and hearable, effectively owning the biometric data collection market. The Apple Watch is already beginning to yield the types of results we can expect when this all gets combined:

Pulmonary Embolism Tweet

To add more fuel to the fire, here’s how the AC Wellness about page reads:

“AC Wellness Network believes that having trusting, accessible relationships with our patients, enabled by technology, promotes high-quality care and a unique patient experience.”

“Enabled by technology” sure seems to indicate that these clinics will draw heavily from all the groundwork that’s been laid. It’s possible that patients would log their data via the Apple Watch (and down the line maybe AirPods/MFi hearing aids) and then transfer said data to their doctor. The preventative health opportunities around this type of combination are staggering. Monitoring glucose levels for diabetes. EKG monitoring. Medication management for patients with depression. These are just scratching the surface of how these tools can be leveraged in conjunction. When you start looking at Apple’s wearable devices as biometric data recorders and you consider the software kits that Apple is enabling developers with, Apple’s potential venture into healthcare begins making sense.

The last piece of the puzzle, to me, is Siri. What patients really need now, with all of these other pieces in place, is for someone (or something) to understand the data they’re looking at. The pulmonary embolism example above assumes that all users will be able to catch that irregularity. The more effective way would be to enlist an AI (Siri) to parse through your data, alert you to what you need to be alerted to, and coordinate with the appropriate doctor’s office to schedule time with a doctor. You’d then show up to the doctor, who can review the biometric data Siri sent over. If Apple were to give Siri her due and dedicate significant resources, she could be the catalyst to making this all work. That, to me, would be truly disruptive.
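As a rough illustration of what “parsing through your data” could mean at its very simplest, here is a Python sketch that flags heart-rate readings sitting far from a person’s own average. To be clear about the assumptions: this is a toy z-score check, not a clinical method and not anything Apple has described; the function name, the threshold and the readings are all made up.

```python
from statistics import mean, stdev
from typing import List


def flag_unusual_readings(resting_bpm: List[float], threshold: float = 2.0) -> List[int]:
    """Return indexes of readings more than `threshold` sample standard
    deviations away from the mean of the supplied readings."""
    if len(resting_bpm) < 2:
        return []  # not enough data to estimate spread
    avg = mean(resting_bpm)
    spread = stdev(resting_bpm)
    if spread == 0:
        return []  # every reading is identical, nothing stands out
    return [
        i for i, bpm in enumerate(resting_bpm)
        if abs(bpm - avg) / spread > threshold
    ]


# Made-up resting heart-rate readings; the last one is the outlier.
readings = [62, 64, 61, 63, 65, 62, 64, 63, 61, 110]
flagged = flag_unusual_readings(readings)
print(flagged)  # indexes of readings worth a closer look
```

A real assistant would need far more context (activity, medication, history) before alerting anyone, but the division of labor is the same: the device records, the assistant watches, and the human only hears about it when something looks off.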


-Thanks for Reading-


hearables, News Updates, Podcasts, Smart assistants, VoiceFirst

This Week in Voice – First Podcast Experience



This Thursday, I was fortunate to be invited by Bradley Metrock, the host of the podcast, “This Week in Voice,” to sit down with him and discuss the top stories of the week that pertain to Voice technology. I was joined by fellow panelist, Sarah Storm, who is the Head of the cloud studio, SpokenLayer, and the three of us went back and forth around what’s new in the VoiceFirst world.

The great thing about this podcast is that Bradley brings on a wide variety of people with different backgrounds, so that each week you get a different perspective on the stories of the week. This week, we talked about the following five stories:

  1. New York Times: Why We May Soon Be Living in Alexa’s World
    This story serves as a revelation of sorts, as it’s the realization that Alexa and the other smart assistants are not merely new gadgets, but represent a shift in how we communicate with computers as a whole.
  2. VoiceBot.ai: Spotify Working on New Smart Speaker? 
    The fact that Spotify posted two separate job openings for senior positions around a new hardware division turned a lot of heads. This is particularly interesting given the impending IPO, as Spotify might be looking to make some pretty dramatic moves prior to going public. Would Spotify be better off vertically integrating itself via partnerships/acquisitions, or is it possible for them to create a hardware division from scratch?
  3. Forbes: Meet the Voice Marketer 
    Voice represents an entirely new opportunity for brands to market themselves, but the question is how best do you use this new medium? With more personal data than ever at many of these brands’ disposal, it will be a challenge to balance the “creepy” with the truly proactive and engaging.
  4. Satire (I think): Local Man to Marry Alexa

  5. The Voice of Healthcare Summit
    Held at the Martin Conference Center at Harvard Medical School in Boston this August, this summit promises to be one of the best opportunities to gather with fellow Voice enthusiasts and healthcare professionals, to collaborate and learn about applying Voice to healthcare. This will be an awesome event and I encourage anyone to go who thinks this might be up their alley!

This was a great experience getting to sit in on this podcast and chat with Bradley and Sarah. I hope you enjoy this episode and cheers to more in the future!

Listen via:

Apple Podcasts; Google Play Music; Overcast; SoundCloud; Stitcher Radio; TuneIn

-Thanks for Reading-


hearables, Smart assistants, VoiceFirst

The Great #VoiceFirst Debate


Monday night, Twitter proved yet again that despite all its shortcomings, it is still the king of where some of the best discussions and debates go down for all to see. Normally, I wouldn’t base a blog post around a Twitter debate, but this specific thread was a culmination of a lot of content and discussion over the past few weeks around smart assistants, the smart home and the overall effort to understand how Voice will evolve and mature. Given who was debating and the way it dovetails so nicely with the Alexa Conference, I thought it worthy of a blog post.

Before I jump into the thread, I want to provide some context around some of the precursors to this discussion. This really all stems from the past few CES shows, but mainly the most recent one. To start, here’s a really good a16z podcast by two of the prominent people in the thread, Benedict Evans and Steven Sinofsky, talking about the smart home coming out of this year’s CES and the broader implications of Voice as a platform:

As they both summarize, one of the main takeaways from CES this year was that seemingly every product is in some way tied to Voice (I wrote about this as the “Alexification of Everything”). The question isn’t really whether we’re going to keep converting our rudimentary, “dumb” devices into internet-connected “smart” devices, but rather, “what does that look like from a user standpoint?” There are a LOT of questions that begin to emerge when you start poking into the idea of Voice as a true computing platform. For example, does this all flow through one central interface (assistant) or multiple?

Benedict Evans followed up on the podcast by writing this piece, further refining the ideas from the podcast above, and shared the article in the tweet below. He does a really good job of distilling a lot of the high-level questions, using history as a reference, to contest the validity of Voice as a platform. He makes a lot of compelling points, which is what led to this fascinating thread of discussion.

Benedict Evans Smart Home and Veg

To help understand who’s who in this thread, the people I want to point out are as follows: Benedict Evans (a16z), Steven Sinofsky (board partner @ a16z), Brian Roemmele (Voice expert for 30+ years), and Dag Kittlaus (co-founder of Siri and Viv). Needless to say, it’s pretty damn cool to be chilling at home on a Monday night in St. Louis and casually observe some of the smartest minds in this space debate the future of this technology out in the open. What a time to be alive. (I love you, Twitter, you dysfunctional, beautiful beast.)

So it starts off with this exchange between Dag and Benedict:

Dag Tweet 1, Dag Tweet 2

BE Tweet 1

As Dag points out, once the 3rd party ecosystem really starts to open up, we’ll start seeing a Cambrian explosion of what our smart assistants can do via network effects. Benedict, however, is alluding to the same concern that many others have brought up – people can only remember so many skills, or “invocations.” It’s not sustainable to assume that we can create 1 million skills and that users will be able to remember every single one. This guy’s response to Brian encapsulates the concern perfectly:

Brian and Alex

So what’s the way forward? Again, this all goes back to the big takeaway at the Alexa Conference, something that Brian was hammering home. It’s all about the smart assistant having deeply personalized, contextual awareness of the user:

Benedict Dag and Brian

“The correct answer or solution is the one that is correct to you.” This is the whole key to understanding what appears to be the only way we move forward in a meaningful way with #VoiceFirst. It’s all about this idea of the smart assistant using the contextual information that you provide to better serve you. We don’t need a ton of general data; each person just needs their smart assistant to be familiar with their own personal “small data.” Here Dag expands on this in his exchange with Steven:

Steven and Dag

So when we’re talking about “deeply personalized, contextual awareness,” what we’re really asking is whether the smart assistant can intelligently access and aggregate all of your disparate data and understand the context in which you’re referring to said data. For example, incorporating your geo-location gives context to your “where,” so that when you say, “book me on the first flight back home tomorrow,” your smart assistant will understand where you are currently by using your geo-location data, and where “home” is for you based on a whole different set of geo-location data that you’ve identified to your assistant as home. Factor in more elements like all of the data you save to your airline profiles, and the assistant will make sure you’re booked with all your preferences and your TSA PreCheck number included. Therefore, you’re not sitting there telling the assistant to do each aspect of the total task; you’re having it accomplish the total task in one fell swoop. That is a massive reduction in friction when you add up all the time you spend doing these types of tasks manually each day.

I don’t think we’re really talking about general AI that’s on par with HAL 9000. That’s something way more advanced and probably much further out. To enable this type of personalized, contextual awareness, the smart assistant would really just need to be able to quickly pull together all of the data you have stored across your disparate apps. Therefore, APIs become essential. In the example described above, your assistant would need to be able to connect via API to all of your apps (e.g. the Southwest app where your profile is stored and Google’s app where you have indicated your “home” location) or the 3rd-party skill ecosystem whenever a request is made. It would use what’s already at its disposal via API integrations with apps, in conjunction with retrieving information or functions built into skills. Therefore, the skill ecosystem is paramount to the success of the smart assistant, as skills serve as entirely new functions that the assistant can perform.
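
To make the flight example a bit more concrete, here is a minimal sketch in Python of how an assistant might fill in the unspoken details of “book me on the first flight back home tomorrow” from data it already has. Everything here is an assumption for illustration: the `LOCATION_PROFILE` and `AIRLINE_PROFILE` dictionaries, the city values, and the `resolve_flight_request` function are made up and do not correspond to any real airline, maps, or assistant API.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical stand-ins for data an assistant might read through app APIs.
# None of these names correspond to a real airline or maps API.
LOCATION_PROFILE = {"current_city": "Chattanooga", "home_city": "St. Louis"}
AIRLINE_PROFILE = {"seat": "aisle", "known_traveler_number": "TT-EXAMPLE"}


@dataclass
class FlightRequest:
    origin: str
    destination: str
    travel_date: date
    seat: str
    known_traveler_number: str


def resolve_flight_request(utterance: str, today: date) -> FlightRequest:
    """Fill in the details the user left unsaid from stored personal data."""
    text = utterance.lower()
    # "home" is resolved from the saved home location, not asked for again.
    if "home" not in text:
        raise ValueError("this sketch only handles trips back home")
    # "tomorrow" is resolved relative to the current date.
    travel_date = today + timedelta(days=1) if "tomorrow" in text else today
    return FlightRequest(
        origin=LOCATION_PROFILE["current_city"],
        destination=LOCATION_PROFILE["home_city"],
        travel_date=travel_date,
        seat=AIRLINE_PROFILE["seat"],
        known_traveler_number=AIRLINE_PROFILE["known_traveler_number"],
    )


request = resolve_flight_request(
    "book me on the first flight back home tomorrow", date(2018, 2, 1)
)
print(request)
```

The point of the sketch is that the user says one sentence and the origin, destination, date and traveler details all come from stored data rather than follow-up questions.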

It’s really, really early with this technology so it’s important to temper expectations a bit. We’re not at this point of “deeply personalized, contextual awareness” quite yet, but we’re getting closer. As a random observer of this #VoiceFirst movement, it’s pretty awesome to have your mind blown by Brian Roemmele at the Alexa Conference talking about this path forward, and then even more awesome to have the guy who co-founded Siri & Viv completely validate everything Brian said a few weeks later on Twitter. I think that, as Benedict and Steven pointed out, the current path we’re on is not sustainable, but based on the learnings from Brian and Dag, it’s exciting to know that there is an alternate path ahead to keep moving our smart assistants forward and bring to life a vision that is much richer and more intuitive for the user.

Eventually, many of us will prefer to have our smart assistants handy all the time, and what better spot than in our little ear computers?

-Thanks for reading-


Conferences, hearables, Smart assistants, VoiceFirst

The Alexa Conference Blew my Mind


Last Thursday, I was fortunate to have the opportunity to travel to Chattanooga, TN to attend the second annual Alexa Conference and join a group of some of the smartest people working on Voice technology. The cool thing about the Alexa Conference is that it’s not sponsored by Amazon (or Google or any other major tech company); it’s fully independent, sponsored by third parties, and therefore it truly feels objective and unbiased. The attendees and speakers ranged from third party “skill” agencies, skill developers (domestic and international), certified Alexa champions, skill analytic and diagnostic providers, a representative from the FTC, insurance and healthcare reps, to futurists, Internet of Things specialists, digital transformation experts, behavioral economists, doctors, PhD scientists, former NASA employees, and a random dude from the Audiology industry who writes a blog called FuturEar.

For the past few years, I have been following the #VoiceFirst movement, which includes progress in the Voice User Interface (VoiceUI), the devices that house our smart assistants (smart speakers, smartphones and wearable technology), devices that work in conjunction with and respond to smart assistants (Internet of Things), and our smart assistants as a whole. I think I may have learned more in the 48 hours that I attended this conference than I have in the thousands of hours leading up to it. Ok, that’s probably some hyperbole there, but there was a ton of insight and these were my favorite takeaways from the show:

Context is King

One of the big questions that I had heading into Chattanooga was, “how do we take this all to the next level?” I now have the answer and it all derives from context. Deep, personalized contextual awareness. What does that mean? Well, for starters, let’s establish that smart assistants feed and grow stronger on “personal data.” The only way that these assistants ever get any more useful or “smarter” is by learning more about us.

Bradley Metrock interviewing Brian Roemmele

A really good way to think about this is through the lens of ordering your favorite pizza. My favorite pizza (shoutout Imo’s) is probably pretty different than your favorite pizza. The web, as we know it, is built on a pay-per-click model, so when I search on Google for pizza options around me, the results are going to show advertised options at the top. These are not in any way personalized to me and therein lies the big difference. When I’ve ordered pizza 20 times through my smart assistant, 15 of which have been Imo’s, and then I’m in Chattanooga for work (where Imo’s does not exist) and I want to order a pizza, my smart assistant will provide me results similar to Imo’s in Chattanooga. The smart assistant knows my preferences and therefore will actively distill the options for me to cater to my personal preferences.

Taking it one step further, think about all the other personal information that you probably share or are having shared with you that can broaden the assistant’s contextual awareness. If your friends have been to Chattanooga and they raved about a pizza spot on Instagram months or years ago, your smart assistant could retrieve that and factor that into your results. So now it’s not just based on your own pizza preferences, but also factoring in other variables such as your friends’ experiences and preferences.
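
As a rough illustration of the pizza example, here is a small Python sketch that ranks local options by how closely they match what I’ve ordered before, with a bonus for places my friends have raved about. All of the data is invented: the restaurant names other than Imo’s, the style tags, and the 0.25 friend bonus are assumptions, and a real assistant would presumably use far richer signals than this.

```python
from collections import Counter

# Hypothetical order history: 15 of 20 past orders are Imo's, as in the example.
order_history = ["Imo's"] * 15 + ["Other Pizza Co."] * 5

# Made-up style tags for past favorites and for local candidates.
STYLE_TAGS = {
    "Imo's": {"thin crust", "provel"},
    "Other Pizza Co.": {"deep dish"},
}
local_options = {
    "Thin Crust Tavern": {"thin crust"},
    "Deep Dish Depot": {"deep dish"},
    "Sponsored Slice": {"new york"},
}
friend_recommendations = {"Thin Crust Tavern"}


def rank_local_options(history, options, friends_liked):
    """Score each local option by how often the user ordered similar pizza."""
    counts = Counter(history)
    total = sum(counts.values())
    scores = {}
    for name, tags in options.items():
        # Weight each shared style tag by the share of past orders it came from.
        score = sum(
            counts[past] / total * len(tags & STYLE_TAGS[past]) for past in counts
        )
        # A friend's recommendation adds a small, arbitrary bonus.
        if name in friends_liked:
            score += 0.25
        scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank_local_options(order_history, local_options, friend_recommendations)
print(ranking)
```

Notice that the option with no connection to my history or my friends lands at the bottom, which is the opposite of a pay-per-click result list.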

Dominic Meissner and Tim Kahle of 169 Labs

This begins to bring privacy and security front and center. One of the really interesting presentations was from the German guys at 169 Labs. While the attitude in the US around privacy is pretty lax and apathetic, it’s important to understand that our attitude here in the States is quite different than how many Europeans feel. They take their privacy way more seriously and it’s a top-of-mind issue that permeates any tech discussion. Privacy will continue to be a topic of discussion as our smart assistants evolve and we become increasingly aware of just how much data we are sharing. I believe the pros outweigh the cons when it comes to sharing your personal data with your smart assistant(s), but the key is going to be feeling safe that it is all encrypted and protected from being hacked.

The beginnings of Conversational Interfaces

One of the more frustrating aspects of smart speakers and smart assistants is the lack of continuity. Currently, our smart assistants more or less execute single commands or answer single questions. There isn’t really any dialogue; it’s typically, “Alexa shuffle my playlist” or “Alexa set a timer for 15 minutes” or “Alexa what’s the weather?” or “Alexa how many ounces are in a gallon?” Asking a question or issuing a command and having the device turn off afterward is not the goal for our smart assistants. Brian Roemmele compared this level of sophistication to the command line in the PC era. It’s super primitive and we’re in the first inning of a double-header with this technology.

Instead, what we need is, again, contextual awareness in order to have a dialogue. Katie McMahon of SoundHound did an awesome job demoing SoundHound’s own smart assistant, Hound, with some real contextual awareness:

So she starts off by saying, “Show me Asian restaurants, excluding Japanese and Chinese, that are open right now.” It’s an accomplishment in itself that Hound answered so quickly and accurately. Then she goes on to further refine the search, “Ok Hound, show those with outdoor seating.” The key word there is “those,” as the assistant is now aware of the context because it recognizes that “those” is a demonstrative pronoun representing the Asian restaurants from the previous query. This is HUGE! In a dialogue with another person, you’re constantly using pronouns and language that reference context from earlier in the conversation or from prior conversations. It’s an essential part of how we communicate, and we’re already seeing smart assistants like Hound demonstrate that they are more than capable of this type of complex contextual awareness. Without this ability, I doubt smart assistants will ever be taken that seriously.

Next, she goes one step further. Katie says, “I need an Uber to the first one.” So not only does the assistant recognize that “the first one” is in reference to the first result from the previous search, but it’s capable of using multiple “domains” or skills in conjunction. This is a significant step forward on something that we’re pretty limited with today. How many people would have been able to accomplish all of that in one app on their phone? Most likely, you’d use multiple apps: look up the place in Yelp, grab the address of where you want to go and pop it into Uber. Furthermore, if your assistant is factoring in more data for a more personalized result list, by retrieving your friends’ Instagram and/or Swarm data, then the comparison is to you going into each app and hunting for the relevant posts. This is clearly an improvement in time and efficiency.
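
For anyone curious what “remembering the previous results” might look like under the hood, here is a toy Python sketch of the three-turn exchange above. To be clear, this is my own simplified assumption and not how Hound actually works: the restaurant data, the `DialogueState` class, and its method names are all hypothetical, and the natural-language parsing step is skipped entirely.

```python
from dataclasses import dataclass, field

# Made-up restaurant data; a real assistant would query a search service.
RESTAURANTS = [
    {"name": "Thai Garden", "cuisine": "Thai", "outdoor": True},
    {"name": "Pho Corner", "cuisine": "Vietnamese", "outdoor": False},
    {"name": "Seoul Kitchen", "cuisine": "Korean", "outdoor": True},
]


@dataclass
class DialogueState:
    """Remembers the last result list so later turns can refer back to it."""
    last_results: list = field(default_factory=list)

    def search(self, predicate):
        # A fresh query replaces whatever the conversation was about before.
        self.last_results = [r for r in RESTAURANTS if predicate(r)]
        return self.last_results

    def refine(self, predicate):
        # "those" means: filter the previous results, not the whole catalog.
        self.last_results = [r for r in self.last_results if predicate(r)]
        return self.last_results

    def pick(self, position):
        # "the first one" means position 1 in the previous results.
        return self.last_results[position - 1]


state = DialogueState()
# Turn 1: "Show me Asian restaurants, excluding Japanese and Chinese..."
state.search(lambda r: r["cuisine"] not in {"Japanese", "Chinese"})
# Turn 2: "Show those with outdoor seating."
state.refine(lambda r: r["outdoor"])
# Turn 3: "I need an Uber to the first one."
destination = state.pick(1)
print(f"Requesting a ride to {destination['name']}")
```

The key design choice is that each turn reads and updates shared state instead of starting from scratch, which is exactly what single-command assistants lack.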

It’s honestly amazing what’s going on over at Hound and how well its assistant retains information throughout the dialogue. It was the first time I really saw this level of continuity and contextual awareness in a smart assistant and it made me very optimistic about the potential of smart assistants when they’re able to have this type of dialogue. When I wrote about the key to the adoption of a new user interface being the reduction of friction, this is ultimately what I was referring to. Even a primitive conversational interface would still dramatically reduce the time that we currently spend app toggling, tapping and searching on our phones for many things. We’re moving far, far beyond just using assistants to set timers.

(Here’s a link to another video of Katie asking Hound for hypothetical monthly mortgage payments based on home values, interest rates and down payment percentages. I was blown away by these demos and have been using Hound since I got back. It’s encouraging to see that all of this innovation is not limited to just Amazon, Google, Facebook and Apple.)

Proactive & Personalized

Another huge advantage of feeding your smart assistant personal data is that it can begin to proactively engage you on what it’s finding. This is where the Internet of Things (IoT) begins to get really interesting. If you have a smart fridge and your smart assistant is accessing that data, it then knows stuff like your fruit being spoiled. Couple that with the possibility that you’ve given your assistant access to all of your calendar data, so it knows that you have a dinner party that night and you’ve, in some way or another, logged that you need the fruit for your recipe for said dinner party. So, what we’re moving toward is a scenario where the smart assistant in your car or hearable pings you and says something along the lines of, “Hey Dave, it looks like your blackberries have gone bad and you need them for your dessert recipe for Julie’s dinner party tonight. It looks like they’re currently on sale at Mariano’s down the street, would you like to go?” and then navigates you to the grocery store.

This was a big aha! moment for me. So much of #VoiceFirst is happening in disparate areas that it’s hard to bring all of it together into one specific use case like the one I just mentioned above. When they’re siloed off on their own, you hear “smart fridge” and you think, “Really? What’s the point of that?” But when you start looking at all of these IoT devices as data entry points for your smart assistant, which your assistant can then actively retrieve, assess and turn into actionable insight (all very quickly), it all becomes a whole lot more compelling. This is “small data” as opposed to big data. It’s personal to each of us and therefore invaluable to each of us. This opens the door to serendipity and assistants proactively providing you with suggestions and reminders from data that you likely aren’t even aware of.
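
Here is a minimal Python sketch of the blackberry scenario, just to show how little logic is needed once the fridge and the calendar are both readable by the assistant. The data feeds, field names, and the `proactive_alerts` function are all hypothetical; I’m assuming the fridge reports a simple spoiled flag and the calendar event lists the ingredients it needs.

```python
from datetime import date

# Hypothetical data feeds; the field names are invented for illustration.
fridge_items = [
    {"name": "blackberries", "spoiled": True},
    {"name": "milk", "spoiled": False},
]
calendar_events = [
    {
        "title": "Julie's dinner party",
        "date": date(2018, 1, 25),
        "ingredients_needed": ["blackberries", "sugar"],
    },
]


def proactive_alerts(fridge, events, today):
    """Return a reminder for each spoiled item needed for an event today."""
    spoiled = {item["name"] for item in fridge if item["spoiled"]}
    alerts = []
    for event in events:
        if event["date"] != today:
            continue
        for ingredient in event["ingredients_needed"]:
            if ingredient in spoiled:
                alerts.append(
                    f"Your {ingredient} have gone bad and you need them "
                    f"for {event['title']} tonight."
                )
    return alerts


alerts = proactive_alerts(fridge_items, calendar_events, date(2018, 1, 25))
for alert in alerts:
    print(alert)
```

Neither data source is interesting by itself; the reminder only exists because the two are joined, which is the whole “small data” argument in a few lines.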

Some other Takeaways

  • Brian Roemmele harped a lot on the idea of “enabling creatives to enter into the fold.” He used the analogy of Steve Jobs empowering the graphic designers with the iPhone, as the iPhone created a massive abundance of opportunity for that profession. The same will be done with voice for many more creative types, including comedians, poets, psychologists, storytellers, artists, historians, writers, etc. Therefore we need a set of tools that are easy enough for anyone to use and create with.
  • VoiceXP demonstrated a number of unique skills specifically for the Echo Show. I also appreciated that Bob Stolzberg really emphasized the fact that the Echo Show is version one of Amazon’s multi-modal strategy. We’re quite literally scratching the surface here with what’s possible when you add in screens and mixed-modality into the #VoiceFirst equation. There are some really exciting opportunities around this.
Mark Tucker and Bob Stolzberg of VoiceXP
  • Keynote speaker Ahmed Bouzid presented a plethora of fascinating facts and charts, but what stood out to me were two of the demographics that can benefit the most from a #VoiceFirst world: seniors and the physically incapacitated. This is at the heart of why I’m so passionate about spreading awareness to the #audpeeps, audiology and hearing aid industry about #VoiceFirst. Smart assistant integration is coming to hearing aids, and this new use case for hearing aids, hearables and all our ear-computers stands to really benefit those who struggle in a mobile world. Mobile computing is not conducive to these two demographics and I’m ecstatic about the possibility that these demographics will soon be empowered in a way they never have been before. It’s an awesome value-add that you can advocate for free and it will dramatically improve the patient experience over time.

This was an amazing two days and I’ll definitely be back next year to continue to gather everything I can about what’s happening in the #VoiceFirst world and how it’ll ultimately impact those of you who work with the little computers that go in the ear.

-Thanks for reading-