
070 – Collin Borns – Speechly: Enabling Voice User Interfaces Everywhere

This week on The Future Ear Radio podcast, I’m joined by my old friend Collin Borns, Head of Business Development at Speechly. Collin and I met a few years ago at one of the voice tech conferences, and he’s one of my go-to people to bounce ideas off of as to where the voice tech space is headed. Now that he’s at Speechly and acclimated to his role as Head of Biz Dev there, I wanted to bring him on the podcast to talk about Speechly and how it’s a great representation of the type of opportunities that are very viable in today’s state of voice technology.

Our conversation spans a number of topics, including:

  • How Google, Amazon and Apple have shaped our thinking of voice tech
  • The democratization of voice-based applications for websites and mobile applications (when will we see the “Google” moment?)
  • The mass enablement of voice user interfaces (VUI) across the internet and how Speechly’s automatic speech recognition engine works in tandem with its natural language processing engine (in real time)
  • Why layering VUI onto eCommerce webstores is such a natural fit
  • Bank of America’s Erica adoption as a proxy for greater demand for voice-enabled functionality in apps

Always great to talk all things voice tech with Collin!

-Thanks for Reading-
Dave

EPISODE TRANSCRIPT

Dave Kemp:

Hi, I’m your host, Dave Kemp, and this is Future Ear Radio. Each episode we’re breaking down one new thing, one cool new finding that’s happening in the world of hearables, the world of voice technology. How are these worlds starting to intersect? How are these worlds starting to collide? What cool things are going to come from this intersection of technology? Without further ado, let’s get on with the show. Okay, so we are joined here today by Collin Borns. Collin, tell us a little bit about who you are and what you do.

Collin Borns:

Yeah, thanks for having me, Dave. So I am the head of business development in the US for Speechly. And I kind of got into voice technology or my interest in voice tech started back in college really from the perspective of a guy who was always looking for the next thing, always interested in starting my own little businesses and through college became familiar with voice and saw this as my way to be a part of this next revolution of interesting technology and really became obsessed with how can I go and learn about this as much as possible, which ultimately led me to VaynerMedia working on the finance side of things, but gave me the ability to see different perspectives from an agency that was known for building different voice projects, and ultimately, gave me the opportunity to meet a lot of really interesting people in the space.

Collin Borns:

That ultimately led to doing some investing at a small venture fund called VoicePunch that’s completely focused on the audio and voice tech space. So I did diligence on and looked at hundreds and hundreds of companies over my roughly year and a half with Marc and Cinta over at VoicePunch. And now I’m, like I said, at Speechly, where we help developers add voice interfaces to existing websites and mobile apps. So really excited to chat about that.

Dave Kemp:

Yeah, it’s great to be chatting again. Some of you that are listening might remember Collin from some of the different content projects that I was doing during the pandemic, we actually launched a newsletter, when we were doing a bunch of videos and all kinds of content together under the umbrella of Sonic Incytes. So I’ve known Collin for a few years now, got to know him through the various voice conferences that we both attended.

Dave Kemp:

And I think he’s definitely one of the people I always look to, to kind of bounce ideas off of about what’s going on in this space, and kind of the evolution of it. He’s just really, really knowledgeable, I think about the overall trajectory of it. And so I wanted to have you on today, because I’ve been… Now that you’re at Speechly, I really want to better understand this company from what I’ve gathered in the offhand conversations that we’ve had, it sounds really interesting, and sort of in a new direction that, I’m really curious to explore today throughout this conversation.

Dave Kemp:

So good to have you back here in the hot seat with me talking all through this, so let’s start with this whole idea you had mentioned the role that Big Tech played, I think initially here in the, let’s call it these first formative years of the voice tech space. Can you kind of outline what you were describing to me as to the role that they played? Why those… Some of the benefits of that, but also maybe some of the ways that that’s warped our thinking a little bit about this whole space?

Collin Borns:

Yeah, for sure. So I think, as you alluded to, big technology companies, specifically Amazon, and before that obviously Apple with Siri, I think that they reinvigorated or starting with Siri, it really brought a lot of attention to wow, we can talk to our technology. But there was this bar that was set really high with that announcement, and the user expectations that went with that just quite frankly, were at such a level that I think it was missed at a point where Siri became kind of a meme of… Not the greatest meme in the world of like, man, what is this thing really useful for?

Collin Borns:

And then with Alexa and Amazon coming out, it really reinvigorated this conversation around voice technology, and quite frankly, I think, just gave permission to other communities, other businesses to invest money into this space, invest time and resources into exploring, what does this voice technology do? However, it was from the perspective of again, these smart speakers, and I think that led to just as a constraint of the devices themselves, this approach to voice technology now that is completely from this paradigm of conversational experiences.

Collin Borns:

So like you said, I don’t think that we are at the point that we’re at having this conversation today about where voice technology can go without Amazon coming back in 2015, I think it was with the initial Alexa product. I don’t think we’re at this point without that. However, just because that happened doesn’t mean that we shouldn’t revisit all the different work that has been done and really try to understand where voice technology is best suited to be applied to different businesses.

Collin Borns:

And so I think, now we’re at this point where you have to look outside of just voice assistants and smart speakers to I think get a better idea of where there can be just wider opportunities for other businesses to look at this voice technology.

Dave Kemp:

That’s really well said, and I think that you’re right, because without having the critical mass that Amazon sort of pushed the ball down the hill, and then Google then hops on board at one time, Cortana with Microsoft, and then Samsung, with Viv, and then Bixby. And so I agree with you that I think that this whole space sort of did rally behind a company like Amazon that was uniquely suited, because I think that you needed to kind of almost have that consumer success that smart speakers had, that kind of surprised people.

Dave Kemp:

And it started to build a narrative of this is a new modality, but this is at the core, I think of where a lot of people, a lot of confusion break or a lot of the breakdown comes from is, it’s really hard to even sort of quickly surmise, what exactly does voice technology entail because everybody sort of has a different description of it. It’s conversational UI or conversational AI, it’s the voice user interface. It’s all these different things. It’s all these different devices.

Dave Kemp:

And I think that what’s really interesting, and I want to now go into Speechly a little bit is this notion of really making voice largely accessible across the internet. I think that’s maybe where this is going is it’s almost a layer to some degree of being able to be put on websites and mobile applications. So I know that’s kind of in your all’s wheelhouse. So can you just speak to maybe what really interested you initially was Speechly and then like that just segue into the broader picture of what you guys are doing?

Collin Borns:

Yeah. And maybe to better frame it, I don’t want to jump away from the voice assistants and smart speakers yet, because I think it provides good context to understand, what just voice technology I guess at a high level is good at. Because, again, we can’t ignore that these are in a third of the homes in the US and growing, and it gives us an insight into how are people using these devices? Again, like what is voice tech good at. And so, like I said, a little bit of context, there’s been over 100,000 of these voice apps made on smart speakers like the Alexa platform, however, these top five use cases are still first party experiences. It’s been this way for the last three years.

Collin Borns:

And if you look at these experiences, they’re not conversational, it’s the listening to music, it’s asking a question, it’s checking the weather, setting a timer, setting an alarm. So it’s voice command and control. None of these really need a back to back sort of voice experience. And so like I said, I was previously investing with VoicePunch and got the opportunity just to see so many different approaches and angles to voice technology. However, a lot of those were from this perspective of conversational experiences.

Collin Borns:

And what really drew me to Speechly specifically was just quite frankly, this differentiated approach to voice technology and approaching from first principles of like, what is voice technology good at? Voice tech is a really good interface to get things done efficiently. And that’s where we know it’s at right now. So I think that there is a path to conversational experiences, and certainly there are areas where hands off conversational experiences make a lot of sense today.

Collin Borns:

However, I think they’re very limited to a point that we should focus all our attention on cracking the nut for the top killer use case there doesn’t seem like the best use. I’d propose a better use of time is to say, “Okay, we know what voice is good at today. Can we do this in our own sort of domains, websites, mobile applications?” And the answer is yes. And this is why I ultimately wanted to join Speechly is just being able to take this differentiated approach to the market. And quite frankly, being able to build these voice experiences that are not constrained by some of the platforms that popularize them.

Dave Kemp:

Yeah, that’s again, well said. Because I think that to your point, and it should be caveated that, I’ve had people on this podcast before of companies that are actually building really ambitious things on these platforms, you know the guy like Ian and the group over at Bamboo learning and Matchbox with all the games and stuff like that.

Dave Kemp:

So it’s not to say that there’s not like this potential of building on these platforms. I think there is. But I agree with you that I always think that this period that we’re in is almost akin to like, the pre-internet PC, it’s like we’re the internet before Google. It’s just think back to the way that we interacted with computers, like before a lot of the big breakthroughs, which enabled lots and lots of other things like yeah there were definitely applications and different things that were built throughout all these epochs throughout computing history.

Dave Kemp:

But to your point, I think that where a lot of the potential is reexamining what is the technology in its current state good at. And this is, I think a really interesting thing to think through. So, in your opinion at Speechly, what are some of these things? I think you mentioned, those five use cases, those are great. But build on this a little bit. Where’s your mind at when it comes to this?

Collin Borns:

Yeah, again, I think, to really simplify it, I truly believe that this command and control sort of paradigm with voice can be applied to any sort of mobile or web application, I think in the same way, we’re starting to see the play button show up on pretty much any sort of publishing site, or any sort of a blog or media site, I think you’re going to start to see these buttons to communicate with these websites. And again, I think to frame where I see opportunity with Speechly’s approach to voices. It’s also again, I think this notion around voice apps being conversational experiences are really holding back our industry.

Collin Borns:

And so I think maybe it’d be helpful to kind of look at some of those… There are two big problems that I think are holding back some of the existing voice apps today. And it starts with the turn based experience. And, again, this is not turn based from the perspective of input in, input out and just like the fact that you’re having a conversation, but I’m talking at the technology level of a voice assistant.

Collin Borns:

They’re unable to understand the user until after they’ve finished talking. So this is a result of, again, how they handle a user’s speech in some sequential manner. So it goes from transforming speech to text, then that text is given some sort of meaning, then that meaning’s given some sort of action. And that action is then given some sort of feedback to the user through text to speech.

Collin Borns:

And so you have inherently this experience that is turn based, and leads to a lot of friction for the user, because they have no clue they’re being understood until they finish their whole sentence. This is why we see these like really short utterances that are successful on voice systems, because you don’t want to say a 10-second long thing and have no clue if it’s listening to you or understanding you. Right?

Dave Kemp:

Right.
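Editor's illustration: the sequential, turn-based flow Collin describes (speech to text, then meaning, then action, then feedback) can be sketched roughly as below. This is a toy under stated assumptions, not how any real assistant is implemented: every function name is hypothetical, and each "audio chunk" is already a word, so no actual speech recognition takes place.

```python
from typing import Dict, List

# Toy stand-ins: each "audio chunk" is already a word, so no real
# speech recognition happens here. All names are hypothetical.

def speech_to_text(audio_chunks: List[str]) -> str:
    """Step 1: wait for the whole utterance, then emit one transcript."""
    return " ".join(audio_chunks)

def text_to_meaning(text: str) -> Dict[str, str]:
    """Step 2: assign a meaning (intent) to the finished transcript."""
    if "timer" in text:
        return {"intent": "set_timer"}
    if "weather" in text:
        return {"intent": "get_weather"}
    return {"intent": "unknown"}

def meaning_to_action(meaning: Dict[str, str]) -> str:
    """Step 3: map the meaning to an action and describe the result."""
    actions = {
        "set_timer": "Timer started.",
        "get_weather": "Here is the weather.",
    }
    return actions.get(meaning["intent"], "Sorry, what was that?")

def handle_turn(audio_chunks: List[str]) -> List[str]:
    """Run the steps strictly in order and record when feedback appears."""
    feedback_log: List[str] = []
    transcript = speech_to_text(audio_chunks)  # needs the full utterance
    meaning = text_to_meaning(transcript)
    reply = meaning_to_action(meaning)
    feedback_log.append(reply)  # step 4: feedback (text to speech in practice)
    return feedback_log

print(handle_turn(["set", "a", "timer", "for", "ten", "minutes"]))
```

However long the utterance is, the log only ever receives a single entry, and only after the last word has arrived. That is the friction being described: the user gets no signal of being understood until the whole turn is over.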

Collin Borns:

So there’s a constraint there. And then there’s also this notion, like I said, around voice apps needing to be conversational experiences. And so I just don’t think that if, again, using those top five use cases, as a reference for smart speakers, you can clearly see that users find value from voice and just being a really efficient way to get a task done. And so taking that mindset, not this idea that I’m going to build a conversational, it’s like flow back and forth. But taking that mindset of voice is good at getting these tasks done efficiently, and then applying that framework to your business.

Collin Borns:

And going in from that mindset as opposed to this mindset of what sort of conversation can I build, I think is a lot more or will lead to a lot better results for people that are exploring voice technology. And like you said, the platforms themselves, you mentioned two different sort of entertainment use cases and I think that’s one of the things that will play out on those platforms actually is entertainment, it makes sense.

Collin Borns:

But again, some of these more utility-based activities that could be done for consumers or professional users alike, people in a warehouse, for example, these things can be applied and lead to actual utility, as opposed to just kind of imagining where this conversational technology might go.

Dave Kemp:

No, I agree with you. Because I think that, again, it comes back to like the current state. And I think that in the same way, again, that over the course of time, more breakthroughs enabled different types of functionality. So a lot of these very aspirational ideas very well may exist in the future. But I agree I think it’s important to really focus on the here and now. Because I think there’s a lot of interesting use cases that are, like you said. I think the name of the game with a lot of these things is speed.

Dave Kemp:

And so I’ve had a lot of conversations about why that’s at the root of why I think media is such an interesting use case. Because again, it’s like, if you have to really kind of search and dig to get what you’re looking for, that ability to just issue an input and be able to quickly gather whatever sort of content from the archive, that makes sense to me, that seems like that’s significantly better than the incumbent method.

Dave Kemp:

But there’s a lot of little things like that, that exists out there. So maybe let’s start with websites in general, kind of like, what are some of the things that you think are not to use the cliche, but like low hanging fruit? What are some of those really obvious things, in your opinion, where it’s like, that’s a really easy thing that you could layer voice on to?

Collin Borns:

Yeah, so I think command and control, like I said, in like an eComm setting. And we have a good demo of this on our website that we have that can show this. But you mentioned this interesting point, you keep asking questions and I like to take a step back. But you mentioned this focus on like the speed of this solution. And so that’s actually kind of where Speechly is unique from some of the other offerings on the market. Like I mentioned, there’s a sequential manner of how a voice assistant handles your input, and it goes from a speech recognition to natural language understanding to text to speech back.

Collin Borns:

What Speechly is able to do is do that speech recognition, and that natural language understanding simultaneously in real time, so there is no lag, there is no latency, it gives the developer the ability. For example, if you are searching on an eComm site, let’s say. If you were going to buy a pair of shoes, or you were going to search for a pair of shoes on the Alexa platform, maybe like let me phrase it a different way, would you ever try and go and make a purchase of athletic apparel or something on an Alexa platform or do you think that would lead to a lot of friction?

Dave Kemp:

I think that today unless I’m looking at it, first of all, I think without a screen no way. I think that I’ve said this before, multimodal is ultimately going to maybe be part of the catalyst to voice commerce, because I think you need to see, I just don’t think that buyers at large will buy things without really being able to look and almost really be confident in what they’re buying. So that’s going to be one of the biggest limitations. But I really do like this idea of eCommerce element here because I think that eCommerce is such a… It’s obviously a mainstay now, we’ve all very much become conditioned.

Dave Kemp:

And that’s where I’ve always thought like Apple or Amazon’s kind of well suited. I kind of think that this is just been a in some ways, I think Alexa is kind of a facade to get you to use Amazon Pay to buy things through Amazon. But that said, I think that this applies broadly across eCommerce, like I could see lots of Shopify sites having this kind of capability. So that’s what I would say here is, I actually think that it’s pretty compelling so long as it’s multimodal.

Collin Borns:

Yeah, exactly. I then also poke at this bringing in this element of conversation, the multimodal thing I agree with however you use these platforms, maybe I’ll just focus on Speechly and so if you were going to talk to a website, you don’t need some sort of response, like a person talking to you back like, “Here’s your shoes,” but you also want to make sure that the thing’s understanding you. So essentially what we’re able to do is in the same way that you would type in an inquiry or search for a product, you can just speak to a store in natural language, but it’s going to react to your speech in real time.

Collin Borns:

It’s hard to articulate, seeing is believing. But if you could imagine talking to a website, and the second that you say the size, it’s showing the sizes, the second that you say the color, it’s showing those colors, the second that you say the brand, it’s bringing those brands up. And then if you mess up something, it’s not just going to all of a sudden say, “Oh, sorry, what was that?”

Dave Kemp:

It’s really little.

Collin Borns:

And then you’re going to have to start from the beginning. So there’s this ability to speak like an actual human [inaudible 00:20:44] conversation, I know that this is going to be distributed audio only. But you’re giving me head nods, you’re giving me different visual confirmations that you know what I’m saying that you’re understanding that we’re flowing right? That sort of paradigm, or that sort of user interaction is something that with a real-time voice solution with something that can not just put text on the screen, but actually understand what the users are saying that’s able to just unlock this completely new sort of user interface interaction, that’s voice driven, but not voice only. The best experiences for voice are not voice only; they’re these experiences that can hand off between touch, type, and voice for what’s best suited at that point.

Dave Kemp:

Again, I think this is really interesting, because it’s communicating back to you, but it’s just not speaking to you. It’s responding in real time, like you said, and again, I think this is part of the sweet spot is a lot of what you’re doing when you’re shopping is you’re refining your search, “I’m a size nine and a half shoe, I want shoes that are white, I want Nike’s.” So you’re just going down the line of all the different parameters, but being able to, in real time, adjust that by conversating with the computer or your phone, that was the vision I was sold on initially.
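Editor's illustration: the shoe-search example above can be sketched as a loop that updates the search filters after every incoming word instead of waiting for the end of the sentence. This is a minimal sketch and not Speechly's actual API or engine. The word lists and the `stream_filters` function are hypothetical, and the words are assumed to arrive already transcribed, one at a time.

```python
from typing import Dict, Iterable, Iterator

# Hypothetical vocabularies for a toy shoe store; a real system would use
# a trained language-understanding model rather than fixed word lists.
COLORS = {"white", "black", "red"}
BRANDS = {"nike", "adidas", "puma"}
SIZES = {"eight": "8", "nine": "9", "ten": "10"}

def stream_filters(words: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield a snapshot of the current search filters after every word."""
    filters: Dict[str, str] = {}
    expecting_size = False  # True right after the word "size"
    for raw in words:
        word = raw.lower()
        if word in COLORS:
            filters["color"] = word      # a later color replaces an earlier one
        elif word in BRANDS:
            filters["brand"] = word
        elif expecting_size and word in SIZES:
            filters["size"] = SIZES[word]
        expecting_size = (word == "size")
        yield dict(filters)              # the UI could re-render on each snapshot

utterance = ["I", "want", "white", "Nike", "shoes", "size", "nine"]
for heard, snapshot in zip(utterance, stream_filters(utterance)):
    print(f"{heard!r:>8} -> {snapshot}")
```

Because a snapshot is emitted per word, a page could show the color filter the moment "white" is heard. A correction such as "actually black" simply overwrites that one filter rather than restarting the whole request.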

Dave Kemp:

And so I agree with you where this seems to be in and I’m not pointing fingers at any one person or any company or anything like that. I just think it’s like, the natural evolution of a new technology is you end up getting lost in the weeds at certain times, because you get excited and then the minutia, like just kind of captures you. And so I agree where it’s, I think that we made things too complex when I think a lot of the “killer use cases” was just this ability to like conversate with your technology, not necessarily mean that you have to like have an actual conversation.

Collin Borns:

No, I think you’re hitting the nail on the head. Yeah, I agree with you.

Dave Kemp:

Okay, so eCommerce, that’s obviously a big one. Let’s go another direction with this, we can go into mobile apps or something like that too. But I’m just curious, continuing this thread of discussion. Where else can you, whether it’s Speechly or kind of like what you’re all driving toward? What are other examples where you see this fitting in well?

Collin Borns:

Actually, I just had a thought with that last piece before that I want to touch on. Again, I’m saying like command and control is where I believe we’re at today. I don’t think that or I do think that in the years ahead, who knows, I can’t put on my Nostradamus hat and predict the future. But I do think we see a future where there’s more conversational experiences, but you have to plant the seed, you have to walk users to that point, just go to any of these voice industry events, and ask people what they use their voice systems for, and we’re talking about people that are at the core of addicted to this stuff and I’m telling you, it’s all command and control.

Collin Borns:

So if the early evangelists, the diehard users of this are still using it for the simple command and control, why wouldn’t we just apply this to other aspects of our business to plant the seed, and then have our users take us where this goes, I think there is a place for these conversations, especially if you plant and train a user on how they’re supposed to interact with like different tasks over time. If you’re so comfortable with something that could become conversational, but just to make this jump and skip this step I just don’t see that.

Collin Borns:

And you mentioned like, where else do we see opportunities? I think we need to look at again the voice industry, I want to take a step back, why would Amazon build voice assistants through a smart speaker and put all this attention on far field voice. They don’t have a good phone so that doesn’t mean that the innovation around the core voice technologies stopped in websites or nearfield voice, that stuff’s been going on the same train that has been going the whole time.

Collin Borns:

So there’s one aspect there. But I also think about what use case would be like the top priority at an Amazon. The innovator’s dilemma, they’re not going to go and build out something necessarily that they don’t see an eventual ROI in, it’s of logistics, and now AWS is its own beast, but it’s an eComm business, right? And so it makes a lot of sense, like you said, I think Alexa is or you said, I think Alexa is ultimately going to be this vessel, for more eCommerce purchases to be made. And it’s like of course, that would make the most sense, that’s where our long term ROI could be justified.

Collin Borns:

So I’ve been thinking about a lot of these use cases that, again, command and control outside like completely on the other side of the spectrum of where Amazon would focus, because there isn’t the ROI case there today. So I think of like voice enabled warehousing, or logistics, some of these different aspects where you have different inventories being controlled. Just any of these sort of like professional use cases where there’s a lot of data input and information being tracked.

Collin Borns:

I think that, that is an area where we’re going to see a lot of adoption pick up, because, again, at least from my perspective, I think Amazon, it makes sense for them to focus on eComm. And like I said, we have our own interest in eCommerce as well. But I’m paying a lot of attention to these sort of verticals or other industries, where it wouldn’t make sense for an Amazon to go and spend their time. Again, I think I’m coming back more to the actual command and control sort of experience, but I think it can be applied across horizontally across industries.

Dave Kemp:

Yeah there’s a few things going through my head right now because I agree with you, I think that there’s always sort of like an ulterior motive with, Amazon is going to prioritize some things, and deprioritize other things, whether it be the actual use cases, or if I say, “Hey, Alexa, buy me paper towels,” I’m going to probably imagine that it’s either going to be a brand partnership with Bounty, P&G, or it’s going to be their Amazon’s Choice. So there is a little bit of an ulterior motive there. Same thing with Google, they’re going to obviously really try to push their own proprietary services. And I think that Google Assistant is very much sort of the natural successor of the UI for all of the Android devices, and especially with Google’s own hardware.

Dave Kemp:

And so I think that this is a really interesting time, because where we’re at now is it’s the heavy hitters that have the deep pockets had this really big head start. But what’s exciting is that you are really kind of seeing this democratization of the custom assistants, the white label versions of these things, a lot of very domain specific enterprise applications, we’ve both talked to Bruce Rasa before over at AgVoice and it’s like voice on the farm. These things make a ton of sense, when these things really do get built and deployed, they’re going to be massively successful, because they’re solving really challenging problems that exist today.

Dave Kemp:

And I think what’s going through my head right now as we’re talking is that it’s again, kind of the same pattern of like, this democratization theme. And I’ve had a few different conversations on the podcast before, that are sort of along the same vein, which is what’s going to be the “Google moment” for the voice industry. And what I mean by that is if you think about the internet pre-Google, it didn’t make a ton of sense to have a website, and then suddenly, overnight, it made a ton of sense, because you were searchable.

Dave Kemp:

And so suddenly, you had this mass prerogative of every business out there to have at least like you would have the same information that would be in the Yellow Pages. And I think it’s… When does the point in time come for when any business it makes sense to have a voice application and everything that you’re saying this command and control, that might be the moment where if this becomes easily enabled onto any website, like I’m thinking about my business at Oaktree Products. Just being able to search, we have a catalog of 4,000 products. And there’s a lot of like variations of things, so you have one parent product, and then you have like 10 sizes in different little obscure variations.

Dave Kemp:

So again, if it were really easy to search our website through a voice plugin, if you will, those are the kinds of things that I think will change the whole landscape on this because for a long time, I feel like as two people that have been in this voice industry for a while and kind of in this network, we’ve heard a lot about you need to have a voice strategy, you need to really be thinking hard about what voice entails.

Dave Kemp:

And I think that makes sense for enterprise size companies that have budgets that can allocate to this. But when you think about like SMBs, there’s not a lot of really concrete ways that you can set yourself up well, like there’s some best practices to make sure that you appear on voice searches and stuff like that. But these are those catalyst moments I think, where if once that type of technology proliferates to where you have something like Speechly that’s an API plugin that you can just feed into all these different websites. And suddenly you turn on that functionality for any and everybody for whatever kind of website or business they’re in. That’s that game changing moment, I think that is maybe going to be akin to what I was just referring to is like the Google moment.

Collin Borns:

I’d say that, maybe when businesses and again, this is around the fact that the Amazons and the Googles control the mindshare of what voice is and voice is a conversational experience on this new digital channel. That’s how the majority of businesses will approach voice today. What can I do with this new digital channel? And then now we’ve started to see this breakout into, like you said, these independent voice assistants, which I think is awesome. And I think, again that future that I pointed to, I think that there’s a big yin and yang between this command and control in some of these other areas where you get to this point, and then there might be a conversation.

Collin Borns:

But to have it be a conversation a whole way through? I don’t know. And so I think that there needs to be and I don’t even know if it’d be a wake up. It might just be that there needs to be time, it might just take some time for businesses to be like, “This is a digital channel but what else can we do?” This core technology that powers this is now… The access to this has been democratized. I can do with it what I want. Let’s start playing around. And so another thing that I’ve been thinking about a lot is what is Big Tech’s intent with their voice assistants, or like these AI projects. At the end of the day, it’s like, pretty clearly stated at a Google or an Amazon like they want to solve generalized artificial intelligence, it’s not like that’s a secret.

Collin Borns:

So how can you put third party business goals at the front of your priority, if you’re trying to tackle a task like that? So for people that want to participate in building out that future and being a part of a digital channel, like that’s fine, I guess I can possibly see that being worth time. But I just think for businesses that are able to see that, wow there’s a lot of value add and utility that can come from this core technology, if we just approach it differently. I think that’s the shift that if it happens, will lead to, like you said the tide sort of being open or the Google moment.

Collin Borns:

But yeah, we’ll see if that happens, I think there’s some interesting, like I said attention that has been brought, but there’s a lot of mindshare that is controlled from the sort of conversational angle things.

Dave Kemp:

No, I agree with you. I think that, again, time will tell. But I do think that we’re at this interesting moment now where it’s been around long enough to where I think there’s been a lot of the infrastructure opportunities have percolated long enough to where I think they are starting to really kind of blossom. And obviously with your company, I think Speechly is a really good example of this, where again, it’s this idea where, if it’s as simple as working with you all to plug in an API, and you have this API that’s just killer because you have like you said, you have the ASR and the NLU simultaneously being processed at once. What does that translate to?

Dave Kemp:

Well, maybe it translates to just like a much better experience in terms of search, in terms of just like those refined experiences that a lot of people are using on websites to begin with. It feels like just a really powerful way to enhance your existing patient experience, rather than trying to just create some new thing from scratch, which I think is like really daunting for a lot of people out there.

Collin Borns:

Yeah, and that’s an interesting point. There’s not a lot of budget that might be associated to a brand new voice app. However, I don’t think there needs to be, you can integrate, you’ve put all this effort into bringing your users to your own owned assets, whether that’s a mobile app, whether that’s website. Why would you want to push them somewhere else? Once you’ve put in all those dollars and effort to get them there is a good question. And so yeah, I think just again, it comes back to this thinking of what can I do internally that is just again, I think it’s a lot around the education, having conversations like this for people just to challenge their own assumptions and thinking to be like, “What can I do?”

Collin Borns:

And I think that, again, I think this idea of full access to kind of drive it where you want is interesting. And planting the seed with your own users, getting a different level of understanding of what your users are doing, we’re not a data company, we don’t care about data, the user’s data, we just care about making the best interface with voice.

Collin Borns:

So it puts the business’s actual interests at the front and I think when you… I just like this idea of, it’s not hard to understand command and control. We have AirPods that are proliferating, we have the voice assistants that are proliferating, everybody knows Siri, we know voice command and control, simple input. Plant that seed with your own users, the opportunity to see where that goes, just seems… If you could tell someone like you’re literally going to have your user tell you exactly what they want to do by putting this little button on your screen.

Collin Borns:

What can that unlock, that’s super interesting. But I would rather own that sort of user journey or that user experience journey, plant the seed early on with that command and control and see where it takes you, we’re not going to be able to predict where this stuff is going to go. If you look at GPS, market outlook, or the… I’m blanking out on the term, but the predicted value that it was going to bring in when it was first announced. This is going to be like a $200 billion market and now what is it, billions of dollars a day that’s generated from GPS. It’s so hard to once this stuff gets out of the box to see where it’s gonna go.

Collin Borns:

So I like this idea of again, I know I’m saying things, but it’s just driving home this point of what is voice good at today? Get that into the hands of your users in your own domains as soon as possible, and then iterate, have an actual customer driven iteration process and see where it takes you. I don’t know how anybody who hasn’t… If you have interest in voice technology, like that should get you excited. There’s a lot of people that like to pontificate and talk about this space, but there’s also a lot of people that are actually building and have hit the walls with what some of the existing or legacy technologies can enable. So getting that full access should I think be exciting.

Dave Kemp:

Yeah, I agree. And here we are, I still remember the first one of these voice conferences that I went to back in, I think it was January of 2018, back when Project Voice was the Alexa conference, all of the focus then was around far-field features and building for smart speakers because at the time, that’s what existed. I don’t even know if there might have been the first version of the Echo Show had just been released, but multimodal wasn’t really much of a theme quite yet.

Dave Kemp:

And the point is, is that I think that it’s going to be a progression, I think that we can kind of like simultaneously hold the thought in our mind that voice is going to ultimately be sort of its own independent ecosystem and I don’t know what that will look like, I don’t know if it’s going to be the duopoly of Amazon and Google and those ecosystems. But I think that the way that it is now is that yes, it’s sort of this percolating in its infancy ecosystem, but it’s also a feature. And I think that the feature is ready and I think that it’s ready to be layered on to the web, it’s ready to be layered on to these mobile applications. So as we come full circle here, is that how you’re thinking about this, too?

Collin Borns:

Yeah. I think how I like to look at maybe like the voice space, is you have these general assistants that were started with Siri, brought in all this attention to voice, early 2010-ish. And then it kind of dies down, and then you have Alexa come in and it’s reinvigorated this whole conversation. So there’s all this attention on voice. So what can I do with it, and more and more people are starting to get interested.

Collin Borns:

And I think this has led to people starting to realize different constraints with the existing sort of voice assistant platforms, or some of the endpoints like with smart speakers and so I think that’s led to this natural progression of I want this voice assistant thing. But I want it in my own domain. So now you’re seeing these independent voice assistants, still voice assistants, though. So you’re seeing your [inaudible 00:41:21] of the world, and mobile and web, which I think is a very positive step in the right direction.

Collin Borns:

But like you said, about this voice feature being ready, I would argue there’s this third bucket of using just voice as an interface in the existing domains, where we know that the voice control feature is very, very beneficial for creating an efficient user experience. So I think, just approaching voice from that sort of three-pronged approach and being able to look at your own self in your own company and better assess where is the best opportunity for me, rather than going from the angle of general voice assistant, independent voice assistant, and then voice as an interface as like maybe your sort of steps of assessing where my best opportunity for voice is.

Collin Borns:

I’d say the best opportunity is actually to flip that script, so start with voice command and control, voice as an interface in your existing domains, then maybe you can explore these independent voice assistants or more general voice platforms, like your Google Assistants or your Alexa [inaudible 00:42:31] and even think about the opportunity that, that gives you. I think I might have mentioned this earlier but being able to control and have access to this, at the end of the day voice as an interface is a completely new user behavior with our technology.

Collin Borns:

So there’s some learning that’s going to come with that to understand how your users are actually engaging with your brand or your experience. And I feel like that education for a company or a brand is invaluable… It’s just so important to understand that when you’re building a brand new behavior, and I think doing it, that sort of step of looking at voice as an interface first in your own domains, is just a really good first step to be able to really understand your customer.

Dave Kemp:

I absolutely love that. I think that you’re pretty spot on there. Because I like that progression too of start it as kind of like the voice user interface and think there as… And you said something earlier in the conversation about it being kind of like a button, like you create a button and then it’s like, who knows what’s going to happen with that. And I actually saw not long ago, there was an article on Voicebot. And it was talking about Bank of America’s voice assistant. And what’s interesting about that is it saw a huge uptake in the pandemic, it’s been around for like three years.

Dave Kemp:

And yes, a large part of that is because a lot of people are communicating with their bank online now as opposed to in person. But again, it kind of speaks to the same thing, which is they put that functionality out there, I think it’s Erica. And so you make this thing available. And in time, they probably over those three years, they kept iterating and kept iterating and now it’s pretty legit. And it seems to be this thing where it’s like, the users probably drove a lot of the innovation, they probably saw from all of the usage data, that these are the different things that are really resonating, and then they’re able to like kind of position those things.

Dave Kemp:

And I think that’s a microcosm of everything that we’ve been talking about today, which is, so long as you provide that sort of starting point to let your customers guide you, that I think enables you to get a better sense of what aspects of my business makes sense in this particular capacity. Rather than it try to be this like, I’m going to take every single thing and throw it at it and see what sticks. I think that… I love this point that you made where it’s the inverse of start very, very basic and make it something where you can engage with the brand or the company or whatever it is online through your voice. It doesn’t necessarily have to be conversational, as much as it’s just able to receive that input. And it has some type of output whether or not that’s voice, I think is maybe a little moot right now.

Collin Borns:

It’s like walk before you run. Everybody that builds conversation, any sort of conversational designer will tell you just how hard it is to build a quality conversational experience. And, again, I know I brought up earlier how the top five use cases are all these on these voice assistants, smart speakers are all these first party experiences and that’s a proxy for like, this is just really hard to do. But I think part of that has to do with the fact that the average everyday consumer is just not prepared or ready for these fully immersive conversational experiences.

Collin Borns:

There’s a lot of baggage that comes with a conversational experience, I think for the user that needs to be perfect, it needs to understand. And by eliminating that sort of element or that sort of factor from your experience, and being able to again, just use this voice as an interface feature, and see where that takes you, I feel like it’s a lot more compelling. And that can lead to a conversational experience. There can be points where it makes sense to have some sort of verbal response.

Collin Borns:

But again, I just don’t think using that as your starting point is necessarily the best strategy when looking at this world of voice technology, where we’re still very much in like, I don’t even know if we can say we’re in the first inning, I think we’re putting down the chalk to start the game.

Dave Kemp:

I agree. No, this has been such a good conversation. I always love hearing the way that you’re thinking about things. And that was the perfect sort of roundabout full circle way I think that we’ve come to this is like, we’ve been in this long enough to where I think just as you said, Google and Amazon, because of their sort of ulterior motives set the… They sort of established the way in which I think we think about these things.

Dave Kemp:

And I think that over the course of the last few years, people have been, especially the people that have been really operating in this space. I think you’re starting to have a lot of people to start to challenge that and say, “Is this really the right way for us to be introducing this to the masses?” And to your point with smart speakers, I think that the vast majority of people use the same five things. And that’s great, because you’re still building that habit and that’s not to say that the conversational ecosystem of experiences will not ever flourish and blossom.

Dave Kemp:

I think it totally will. But I think that we’re going to need some breakthroughs along the way. And I think again, it comes back to, so what do you do right now, and this is where it comes back to start small. And the voice user interface, in my opinion is the area that just about anybody can deploy in some capacity, just like you said, it might not even be consumer facing, might be an internal operation that you have. There’s lots and lots of different types of workers out there that could benefit from maybe an efficiency in just the…

Dave Kemp:

I saw there was a report about Henry Schein, which is the big, it’s like the Oaktree Products of the dental world. And they’re a lot bigger, they’re a publicly traded company, but they introduced like this dental procedure that you can sort of transcribe, you’re calling out which tooth has the cavity, and typically, you would have somebody in the room with you that’s writing that down.

Dave Kemp:

There’s just lots and lots of little… That’s really [inaudible 00:48:49] it’s just transcription. There’s lots and lots of little things like that, that are starting to present themselves and I think that these are going to be kind of the introductory things that ultimately will probably lead to the conversational explosion of experiences that will come in time but I’m not sure if it’s quite there for like primetime for everybody quite yet.

Collin Borns:

I think we just need to look where we can plant the seed. Plant the seed and let the tree grow.

Dave Kemp:

Love it. Awesome, Collin. Well, before we get going, can you share with everybody where they can connect with you, follow you, check out Speechly, all that?

Collin Borns:

Yeah, first and foremost, obviously the Speechly websites are just speechly.com or you can reach out to me directly at Collin, that’s with two L’s C-O-L-L-I-N @speechly.com. And then always happy to engage on the Twitter @Collinborns B-O-R-N-S. That’s where to reach out.

Dave Kemp:

On the Twitter.

Collin Borns:

The Twitter.

Dave Kemp:

Awesome man, great catching up with you, man. I missed making content with you. Looking forward to the not too distant future where we can get together and have some beers and hang out. So thanks everybody who tuned in here to the end and we will chat with you next time. Cheers.

Dave Kemp:

Thanks for tuning in today. I hope you enjoyed this episode of Future Ear Radio. For more content like this just head over to futureear.co where you can read all the articles that I’ve been writing these past few years on the worlds of voice technology and hearables and how the two are beginning to intersect. Thanks for tuning in, and I’ll chat with you next time.
