As we celebrate the 10-year anniversary of the App Store this week, it seems only natural to wonder what the next 10 years will look like. What modalities, devices, interfaces and platforms will rise to the top of our collective preferences? There’s clearly an abundance of buzzwords thrown around these days that hint at where things may go, but the area that I want to focus on is the Voice interface. This includes smart assistants and all the devices they’re housed in.
Gartner’s L2 recently published the chart below, which might seem to pour some cold water on the momentum that has been touted around the whole Voice movement:
Before I go into why this chart probably doesn’t matter in the grand scheme of things, there were some solid responses as to why these trend lines are so disparate. Here’s what Katie McMahon, the VP of SoundHound, had to say:
One of the primary reasons the app economy took off was its two-sided network effects, which were predicated on developer buy-in driven by a huge monetary incentive. Of course there was an explosion of new applications and things you could do with your smartphone, as there was a ton of money to be made developing those apps. This was a modern-day gold rush. The same financial incentive around developing voice skills doesn’t yet exist.
So, based on Chris and AnswerLabs’ research, a third of users don’t really know that an “app-like economy” exists for their smart speakers. That’s rather startling, given that Voicebot reported at the end of June that there are now 50 million smart speaker users in the US. Is it really possible that tens of millions of people don’t fully understand the capabilities and the companion ecosystem that come with the smart speaker they own? It would appear so, as the majority of users are using their smart speakers for native functionality that doesn’t require a downloaded skill, as illustrated by this awesome chart from Voicebot’s Smart Speaker Consumer Adoption Report 2018:
As you can see from the chart above, only 46.5% of respondents from this survey have used a skill/action.
Jobs to be Done
To understand how we move forward and what’s necessary to do so, it’s important to look at how we use our phones today. As I wrote about in a previous post, each computer interface evolution has been a progression of reducing friction, or time spent doing a mechanical task. With today’s dominant consumer interface, mobile, we interact through apps. Apps represent jobs that need doing, whether that be getting us from A to B (maps), filling time when we’re bored (games/social media/video), exercising or relaxing the mind (sudoku/chess/books/music/podcasts), etc. Every single app on your phone is a tool for executing the job you’re trying to accomplish.
So, if we’re looking to reduce friction as we enter a new era of computing interaction, we should note that most of the friction with mobile is concentrated in the mechanical process of pulling out your phone, then digging through and toggling between your apps to get the job done. That mechanical process is the friction that needs to be removed.
Workflow + Siri Shortcuts
I was initially underwhelmed by Apple’s WWDC this year because I felt that Siri had once again been relegated to the backseat of Apple’s agenda, which would be increasingly negligent given how aggressively Amazon, Google and the others have been moving into this area. What I didn’t fully understand was how crucial Apple’s Workflow acquisition was back in 2017 and how it might apply to Siri.
Siri Shortcuts ultimately represent a way for users to program “shortcuts” between apps, stringing a series of commands together into a “workflow” that runs from a single voice command. The real beauty of this is that each shortcut can be made public (hello, developers) and Siri will proactively suggest shortcuts for you based on what it learns about your preferences and contextual behavior. Power users empowering mainstream users with their shortcuts, as suggested by Siri. Remember, context is king with our smart assistants.
Brian Roemmele expanded on this acquisition and the announced integration of Workflow with Siri on Rene Ritchie’s Vector podcast this week. Brian said something in this podcast that really jumped out at me (~38 min mark):
“Imagine every single app on the app store. Now deconstruct all those apps into Jobs to be Done, or intents, or taxonomies. And then imagine, with something like crayons, you can start connecting these things artistically any way you want… Imagine you can do that without mechanically doing it.”
This cuts right to the core of what I think the foreseeable future looks like. Siri Shortcuts powered by Workflow take the role of those crayons. If we’ve extracted all the utility and jobs that each app represents and put them together into one big pile, we can start to combine elements of different apps to gain efficiency. This to me really screams “removing mechanical friction.” When I can speak one command and have my smart assistant knock out the work I’m currently doing by digging, tapping and toggling through my apps, that’s a significant increase in efficiency. Two examples:
- “Start my morning routine” – starts my morning playlist, compares Lyft and Uber and displays the cheaper (or quicker, depending on what I prefer) commute, orders my coffee from Starbucks, and queues up three stories I want to read on my way to work.
- “When’s a good time to go to DC” – pulls together things like airfare, Airbnb listings, events that might be going on at the time like concerts or sports games surfaced from Ticketmaster/SeatGeek/Songkick, weather trends, etc.
The options are limited only by one’s imagination, and this interface really does begin to resemble a conversational dialogue as more and more of the jobs that need to be done are programmed by the smart assistant itself over time.
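To make the “crayons” idea a bit more concrete, here is a rough sketch of what chaining jobs behind one spoken phrase could look like. To be clear, this is not how Siri Shortcuts or Workflow are actually built; every name in it (`Shortcut`, `Assistant`, `start_playlist`, `pick_ride`, `order_coffee`) is hypothetical, and the “apps” are stand-in functions that only return a line of text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model: a "job" is one unit of utility pulled out of an app.
# It reads what it needs from a shared context and reports what it did.
Job = Callable[[Dict[str, object]], str]


@dataclass
class Shortcut:
    """A spoken phrase mapped to an ordered chain of jobs (illustrative only)."""
    phrase: str
    jobs: List[Job] = field(default_factory=list)

    def run(self, context: Dict[str, object]) -> List[str]:
        # Run every job in order, collecting one result line per job.
        return [job(context) for job in self.jobs]


# Stand-ins for jobs extracted from a music app, ride apps and a coffee app.
def start_playlist(ctx: Dict[str, object]) -> str:
    return f"Playing {ctx['playlist']}"


def pick_ride(ctx: Dict[str, object]) -> str:
    fares = ctx["fares"]  # e.g. {"Lyft": 11.5, "Uber": 12.25}
    service = min(fares, key=fares.get)  # cheapest option wins
    return f"Cheapest ride: {service} (${fares[service]:.2f})"


def order_coffee(ctx: Dict[str, object]) -> str:
    return f"Ordered {ctx['coffee']}"


class Assistant:
    """Looks up a shortcut by its spoken phrase and runs it."""

    def __init__(self) -> None:
        self._shortcuts: Dict[str, Shortcut] = {}

    def register(self, shortcut: Shortcut) -> None:
        self._shortcuts[shortcut.phrase.lower()] = shortcut

    def handle(self, utterance: str, context: Dict[str, object]) -> List[str]:
        shortcut = self._shortcuts.get(utterance.strip().lower())
        if shortcut is None:
            return ["Sorry, I don't know that shortcut."]
        return shortcut.run(context)


assistant = Assistant()
assistant.register(
    Shortcut("Start my morning routine", [start_playlist, pick_ride, order_coffee])
)
```

Calling `assistant.handle("Start my morning routine", context)` with a context that holds a playlist name, a dictionary of fares and a coffee order runs all three jobs from one phrase. The point of the sketch is the shape: the jobs are the reusable pieces, and the shortcut is just the line drawn between them.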
All Together Now
Apple isn’t the only one deploying this strategy; Google’s developer conference featured a strikingly similar approach to unbundling apps, in the form of Slices and App Actions. It would appear that the theme heading into the next 10 years is to find ways to create efficiencies by leveraging our smart assistants to do the grunt work for us. Amazon’s skill ecosystem is currently plagued by discovery issues, as highlighted above, but the recent deployment of CanFulfillIntentRequest for developers will hopefully make skills and functionality easier for mainstream users to discover. The hope is that all the new voice skills and the jobs that they do can be surfaced much more proactively. That’s why I don’t fixate on the number of skills created to this point, because the way in which we effectively access those skills hasn’t really matured yet.
What’s totally uncertain is whether the companies that sit behind the assistants will play nice with each other. In an ideal world, our assistants would specialize in their own domains and work together. It would be nice to be able to use Siri on my phone and have it work with Alexa when I need something from the Amazon empire or want to control an Alexa-based IoT device. It would be great if Siri and Google Assistant communicated in the background so that all my Gmail and calendar context was available for Siri to access.
It’s possible that we’ll continue to have “silos” of skills and apps, and therefore silos of contextual data, if the platforms aren’t playing nice together. Regardless, within each platform the great unbundling seems to be underway. As we move towards a truly conversational interface where we’re conversing with our assistants to accomplish our jobs to be done, we then should think about where we’re accessing the assistant.
I’m of the mind that as we depend on our smart assistants more and more, we’ll want access to them at all times. Therefore, I believe that we’ll engage with smart assistants across multiple devices, but with continuity, all throughout the day. I may be conversing with my assistants in my home via smart speakers or IoT devices, in my car on the way to work, and through my assistant-enabled hearables or hearing aids while I’m on the go.
While the past 10 years were all about consolidating and porting the web to our mobile devices via apps, the next 10 might be about unlocking new efficiencies and further reducing friction by unbundling the apps and allowing our smart assistants to operate in the background, doing our grunt work and accomplishing the jobs we need done. It’s not as if smartphones and tablets are going to go away. On the contrary, it’s how we use them and derive utility from them that will fundamentally change.
-Thanks for Reading-