Voice Assistants as a Trend of 2018

Voice search is rapidly gaining popularity. Alpine.AI estimates that “there are over one billion searches per month (January 2018)”, and according to Comscore “50% of all searches will be voice searches by 2020”. At the same time, “by 2019, the voice recognition market will be a $601 million industry”, according to a report from Technavio via Skyword.

The most common “companion” is Siri, followed by “OK, Google”. Cortana takes third place, and Alexa from Amazon comes after it. Other voice assistants account for a very small share of users, and 32% of respondents avoid talking to their smartphone at all.

Hi, Siri, why is voice search so popular?

Several factors drive the growing popularity of voice search:

    • The increase in the number of mobile users, who are more inclined to use hands-free services (in October 2016, traffic from mobile devices and tablets exceeded desktop traffic for the first time, and it has grown every year since).
    • The growing capabilities of voice assistants in new smartphones (for example, Siri integration with non-Apple applications, or the ability to “talk” with the chatbot in Google Allo), which encourage people to use them.
    • The evolution of Google RankBrain, which processes 100% of search queries and interprets the user’s intent as accurately as possible, even when the query is phrased in spoken language.

Let us dwell on the last point. How does the system find a common language with the user in order to return results for a voice request? First, it converts the speech into text, and then it returns the response as it would for a normal typed query. That is, if you say “call a taxi” and type the same words into the search box, you will get the same result. The catch is that we use different words when we speak a query than when we type it.
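The two-step flow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `INDEX`, the `search` and `voice_search` functions, and the `fake_transcribe` stand-in for a speech recognizer are hypothetical and do not describe how any real assistant is built.

```python
from typing import Callable, Dict, List

# Toy index standing in for a real search engine (hypothetical data).
INDEX: Dict[str, List[str]] = {
    "call a taxi": ["Taxi services near you", "How to book a taxi online"],
}

def search(query: str) -> List[str]:
    """Look up a text query; case and surrounding spaces are ignored."""
    return INDEX.get(query.strip().lower(), [])

def voice_search(audio: bytes, transcribe: Callable[[bytes], str]) -> List[str]:
    """Step 1: convert speech to text. Step 2: run the ordinary text search."""
    return search(transcribe(audio))

def fake_transcribe(audio: bytes) -> str:
    # Stand-in for a speech recognizer: pretend the audio bytes are the spoken words.
    return audio.decode("utf-8")

# The spoken and the typed request end up in the same search call.
print(voice_search(b"Call a taxi", fake_transcribe) == search("call a taxi"))
```

Because the voice path only adds a transcription step in front of the usual search, any difference in results comes from the wording of the query, not from the channel.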

Keywords, as the basis of search engine optimization, began to lose their force in 2011, when Google’s Panda algorithm started purging search results of pages that were stuffed with keywords and offered no real value. With the development of voice search, the role of keywords becomes even more ambiguous, because spoken language comes first.

This is where “Sherlock and Watson from Google” come to the rescue: semantic search and RankBrain, which successfully select relevant search results even for the most complex conversational queries.

Ok, Google, which spheres will be captured by voice search?

To understand which sites voice search will affect, look at the structure of a voice query. For example, in a traditional search for a gym, the user types “gym + city (district, street)” into the search box. But when you speak the request to a voice assistant, it already knows your location, so the query sounds like this: “show the gyms next to me.” We can conclude that voice search will affect every site that is tied to a geographic location.
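The difference between the two query shapes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists, the `to_typed_query` function and the city name are hypothetical, and a real assistant would rely on a trained language model and the device’s location services rather than regular expressions.

```python
import re

# Hypothetical filler words and "near me" phrases, chosen for illustration only.
FILLER = re.compile(r"\b(show|me|the|find|please)\b", re.IGNORECASE)
NEAR_ME = re.compile(r"\b(next to me|near me|nearby)\b", re.IGNORECASE)

def to_typed_query(spoken: str, city: str) -> str:
    """Turn a spoken request into the 'keyword + city' form of a typed query."""
    has_geo = bool(NEAR_ME.search(spoken))      # did the user refer to their location?
    text = NEAR_ME.sub(" ", spoken)             # drop the location phrase itself
    text = FILLER.sub(" ", text)                # drop conversational filler
    keywords = " ".join(text.split()).strip(" .?!").lower()
    # The city comes from the device, not from the user's words.
    return f"{keywords} {city}" if has_geo else keywords

print(to_typed_query("Show the gyms next to me", "Boston"))
```

The location never appears in the spoken words, so a site can only match such a query if the search engine knows where the business is.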

Voice search is also often used to get a quick answer to a question, for example, “who is the author of the novel Pride and Prejudice?”, or for requests that need a more detailed answer, like “the history of the creation of the novel Pride and Prejudice.” If your site answers this kind of query, you should always keep voice search in mind. This also applies to news portals: users often search for current information from mobile devices by voice.

Alexa, what voice assistants are there?

The current lineup of voice assistants looks like this:

  1. Siri is the most popular assistant, but also one of the least functional (for example, you cannot even turn off LTE with it). Firstly, it went without improvements for a long time (at most, it was translated into other languages). Secondly, Apple is too slow to open Siri up to third-party applications. Thirdly, Siri is mostly a mobile assistant: Mac applications are still not integrated, and HomePod smart speakers are convenient only for English-speaking users. Still, it is hard to argue with the fact that Apple’s voice assistant remains the most practical. It can be really useful if the user formulates queries carefully.
  2. Alexa is the assistant with the most obvious potential. Amazon has created a dedicated application store for it, so the functionality of the Echo speaker can be expanded constantly. For example, you can download a Twitter reader (tweets are read aloud), control any element of a smart home or start a car engine. There are already several thousand programs for Alexa, but there is no Russian language support, because Amazon does not work with Russian-speaking countries at all.
  3. Google Voice Search is an add-on for regular search that listens to requests and prepares answers in the form of cards. Its main advantage is that there is always a response: if no direct answer is found, Voice Search simply shows the results on the search engine page. Compared with its competitors, this assistant has noticeably better speech recognition.
  4. Cortana is Microsoft’s assistant, with quite good voice recognition. Its main disadvantage is that it is hardly used on smartphones: there are almost no Windows smartphones, and the iOS and Android applications, for obvious reasons, see little demand.
  5. Alice from Yandex will tell you in a pleasant female voice how to get where you need to go and give you a weather forecast; you can even have a heart-to-heart talk with her. She works with Yandex applications such as Music, Weather and Maps. In the future, Alice will gain access to other services and will be able to, for example, recommend a film or call a taxi. She can launch third-party applications (for example, “VKontakte” or Instagram). You can communicate with Alice by voice or by text. Some of her replies come with prompts and a request to rate the answer. Yandex notes that the neural network allows Alice to recognize and process incomplete phrases and questions, take context into account and speak with different intonations. During development, special attention was paid to understanding “real human speech, and not only ideally spoken requests.” The developers also worked hard on Alice’s “humanity”, so she can tell good jokes or give witty answers to questions. The system handles slang well compared with Siri, which repeatedly either complains that it does not understand or resorts to a web search.
  6. Google Assistant is an assistant that appeared along with the Pixel smartphones. In essence, it does not differ from the standard Google Now voice search, but it can hold a dialog with the owner in the style of Siri. There is a small plus: Assistant is built into the Allo messenger as a text chat. However, hardly anyone uses this application, so the benefit is relative.
  7. Google Home Assistant is the most advanced assistant integrated into a home speaker. This service stands above all the others because it is based on deep learning with neural networks. Accordingly, Home Assistant can understand sentences correctly and respond properly to non-standard requests. On the other hand, instead of performing the desired action (for example, “Turn on music”), it sometimes suddenly switches into philosopher mode and starts talking to the user about lofty topics. There is a Twitch stream with two Google speakers that simply chat with each other about anything and everything.
  8. Viv is a new-generation assistant from the Siri developers who left Apple. Samsung acquired the whole team in good time, and now these people are working on the best voice assistant on the market. We will probably see the first version in the Samsung Galaxy S8.
  9. Bixby is a smart assistant that runs on Samsung devices. Based on artificial intelligence, it helps you find information and manage device functions. The assistant first became available to owners of the Galaxy S8 and Galaxy S8+ smartphones. Later, the company made it available to owners of Family Hub 2.0 refrigerators. Bixby learns your daily activities, remembers important things for you and works with applications to perform tasks. The assistant can “see” through the phone’s camera. It offers relevant information depending on your habits and frequency of use: current news, reminders, frequently used applications and much more. Bixby also lets you quickly set reminders based on the current time and location.

Alice, what other developments are there in this area?

Facebook M is controlled partly by artificial intelligence and partly by humans, and it is still in development. M will be a text-based assistant in the Facebook Messenger environment. It is not yet a finished product and will not be for a long time. It is available only to a small number of users in San Francisco. The level of humanity is extremely high, since people take part in forming the answers to questions. According to Wired, the company hopes that M will eventually learn from these operators and be able to work more independently. At the moment, M is little more than an idea. But, given Facebook’s interest in chat bots in general, it will not be surprising if M ultimately becomes a superintelligence.

Amy is one of the few virtual assistants with only one function. It works only through e-mail, where it can schedule meetings at your request. It knows your schedule and preferences and negotiates with the other participants for you. According to Bloomberg, confirming the schedule data that the Amy assistant generates from e-mails requires too much human intervention, which is a drawback. It pays attention to the style and tone typical of a human. Such highly specialized smart assistants will be really convenient if they can operate completely autonomously. Then, perhaps, people who do not mind getting an assistant to organize meetings will be able to hire other people for these purposes.

SoundHound’s Hound is a voice assistant for iOS and Android. An additional service, Houndify, will allow third-party developers to add voice control to their own devices and services. Its distinguishing feature is a solid understanding of complex queries like “Show me coffee houses within a radius of five kilometers, but not Starbucks”. It is integrated into third-party services such as Yelp, Uber and Expedia. However, the ability to integrate with third-party applications is limited, and it is impossible to open a service directly on iOS or Android. Requests not recognized by the assistant are redirected to [...]. Hound is not designed for long conversations, but it knows how to answer follow-up questions. It seems that the mobile Hound applications actually exist only to show off the capabilities of the Houndify service (which adds a voice assistant function to any application) that SoundHound plans to sell to other companies. If everything works out, we will not even know that we are using it.

Ozlo is an AI whose main function at the moment is searching for cafes, bars and restaurants. It is available to a limited number of users. Ozlo finds and combines data from several sources, including Yelp and Foursquare, and then presents everything in the form of convenient cards. It tries to hold a conversation, asking and answering follow-up questions such as “which places are open now?” or “what do they have on the menu?”. For now, its capabilities remain limited until the creators of Ozlo add new features. The AI’s training depends heavily on users. In terms of humanity, Ozlo avoids unnecessary compliments and only briefly greets you by name. Ozlo would be no different from many other chat bots if it had no prospect of becoming something more. The ability to combine data from several sources in a single set of results is unique, but it is not yet clear whether the developers will be able to realize all the potential they claim. As long as Ozlo’s business plan is limited to the application alone, problems may arise with collecting the data needed for training.

SpeakToIt is one of the many copies of Siri. Searching the application store for “Siri” returns a lot of similar programs, for example, Voice Commands, Voice Secretary and Assistant. This voice assistant is not much different from Siri, but it can learn user commands to activate a list of functions. It is neither as useful nor as convenient as the assistant built into the smartphone. In addition, it sounds rather unnatural, yet it presents itself as a human assistant whose gender and appearance can be changed. Some of these Siri clones look like a relic of the past, when not all iPhone models could run Apple’s proprietary assistant and a substitute was needed. In any case, it seems that their creators are aware that this approach will not succeed. For example, SpeakToIt went on to create a set of tools with which other developers could make their own chat bots.

Hey, Cortana, why are voice assistants so silly?

As David Yang, the founder and CEO of ABBYY, has said, artificial intelligence today is more stupid than a bee. Even harmless activities like voice search, route planning or dictating text messages can end in stress.

For example, in January 2018, Cecile Mula from Nashville decided to “discuss” a guy she liked with her iPhone. Siri responded rather peculiarly to this attempt at a simple chat and sent the young man a message on Cecile’s behalf: “Will you ever text me?” She shared her story on Twitter, captioning the post “my funeral will be held at 8pm this Thursday.” As it turned out later, the young man did not appreciate such a straightforward approach and blocked Cecile on social networks.

The TV channel CW6 aired a segment about a vulnerability of the Amazon Echo “smart” speaker, which runs the Alexa voice assistant. The segment covered the fact that the Amazon Echo cannot distinguish between voices and follows instructions from anyone, including small children. To illustrate the vulnerability, one of the hosts said, “I love the little girl saying, ‘Alexa, order me a dollhouse.’” The Amazon Echo speakers of viewers who had them switched on began ordering dollhouses on Amazon in bulk.

But do not give up on this technology. As practice shows, the IQ of artificial intelligence roughly doubles every two years. This progression suggests that, sooner or later, AI will reach the level of a person, but that is still a long way off. Over the next year or two, voice assistants will be the hallmark of premium consumer electronics, and then they will move into the mid-price and budget segments.