What is behind voice control

All voice-control devices are driven by a software agent called a “virtual assistant” or “voice assistant.” When we speak certain words a voice assistant can handle different kinds of tasks for us.

A good example is Siri. Although we often made fun of Siri in its early years by asking it all kinds of amusing questions, it is now capable of relatively sophisticated interaction. For example, with HomeKit we can tell Siri to turn on the light in our living room or switch the TV channel.

How do they do it? The secret is the huge data centers at the other end of the internet connection: they analyze our words and give us a quick response, even if that response is not always correct!

Reasons to use a voice assistant

From the Stone Age, humans have been obsessed with one thing: how to make our lives easier. Even though we have so many effort-saving tools, we still strive for perfection – wanting tools that will save more of our precious time and money, or satisfy our wants and needs.

Voice assistants seem to represent a smart, fast, and easy way to perform everyday activities, to help people achieve a better quality of personal life. So, the real question is what kind of convenience can voice assistants provide for us? Using simple speech, we don’t need to use buttons, dials and switches for complex operations. We simply “ask” the voice-control device to do its job, while we have a “free hand” to do other things.

Nowadays, voice assistants have become more and more personal. They can recognize our voice and record our behavior. Technologies like AI are making these devices better at fitting our personal lifestyle. They complete routine work or provide information – fast!

The tasks most commonly requested are shown in the data below.

 Data from a survey by PwC. Only 10% of surveyed respondents were not familiar with voice-enabled products and devices. Of the 90% who were, the majority (72%) have used a voice assistant.

Unresolved problems for voice assistants

There’s no denying that voice is the future, but the technology still needs time to address several major concerns expressed by users. 

Awareness and Complexity

The capabilities of voice assistants are growing fast, but people have to be aware of what they can do with voice-enabled devices in the first place. There is also a lack of information about the limitations of these devices.

Trust

For more serious situations involving money (shopping, refund on an airline ticket, etc.), consumers prefer the traditional ways they already know and trust—at least for now.

The Cost

If people already have devices that work just fine, why should they spend hundreds of dollars on something that may make no big difference?

Privacy

For obvious reasons, people don’t want to use voice assistants in public, particularly for personal matters. Talking to one’s phone still looks a bit weird, so most people feel more comfortable using these innovations at home.

Most popular voice assistants

An online poll in May 2017 found that the most widely used voice assistants in the US were Apple’s Siri (34%), Google Assistant (19%), Amazon’s Alexa (6%), and Microsoft’s Cortana (4%).

 

 Apple Siri

Siri was the first modern digital voice assistant installed on a smartphone. It was introduced as a feature of the iPhone 4S on October 4, 2011. And with HomeKit – released with iOS 8 in September 2014 – it can be used on all HomeKit-enabled devices.

Susan Bennett recorded the original American voice of Siri in July 2005, unaware that it would eventually be used for the voice assistant. For iOS 11, Apple auditioned “hundreds of candidates” to find a new female voice, then recorded hours of speech, including different personalities and expressions, and built a new text-to-speech voice based on deep-learning technology.

Shortcuts is a new capability in iOS 12, with an associated Shortcuts app that lets iPhone and Apple Watch users run multistep routines through Siri. Shortcuts replaces the Workflow app that Apple acquired in 2017. It is designed to let you create custom Siri commands that launch apps or combine several actions, in a similar way to IFTTT (If This Then That).

Siri’s features

Apple offers a wide range of voice commands to interact with Siri, including, but not limited to:

  • Phone and text actions, such as “Call Sarah”, “Read my new messages”, “Set the timer for ten minutes”, and “Send email to mom”
  • Check basic information, including “What’s the weather like today?” and “How many dollars are in a euro?”
  • Schedule events and reminders, including “Schedule a meeting” and “Remind me to”
  • Handle device settings, such as “Take a picture”, “Turn on Wi-Fi”, and “Increase the brightness”
  • Search the Internet, including “Define…”, “Find pictures of…”, and “Search Twitter for…”
  • Navigation, including “Take me home”, “What’s traffic like on the way home?”, and “Find driving directions to…”
  • Entertainment, such as “What basketball games are on today?”, “What are some movies playing near me?”, and “What’s the synopsis of…?”
  • Engage with iOS-integrated apps, including “Pause Apple Music” and “Like this song”

 

Google Assistant


Google Assistant, Google’s counterpart to Siri, debuted in May 2016 as part of Google’s messaging app Allo and its voice-activated speaker Google Home. After a period of exclusivity on the Pixel and Pixel XL smartphones, it spread to other Android devices in February 2017, including third-party smartphones and Android Wear (now Wear OS). It was released as a standalone app for iOS in May 2017.

Google Assistant now supports a large variety of devices, including cars and smart home appliances. In January 2018 at the Consumer Electronics Show, the first Assistant-powered “smart displays” were released, with models shown by Lenovo, Sony, JBL and LG. These devices support Google Duo video calls, YouTube videos, Google Maps directions, a Google Calendar agenda, and viewing of smart camera footage, in addition to services that work with Google Home devices.

On October 9, 2018, Google unveiled Google Home Hub, which features a 7-inch touchscreen display that can be used to provide visual feedback for queries.

Google Assistant’s features

On Google Home, the Assistant supports hands-free calling, letting users call any landline or mobile phone in the United States and Canada for free. Google Voice users can set up Google Home with their Voice numbers to make personal and business calls. There is no 9-1-1 emergency services support, however. “Proactive Assistance” lets the device give users spoken updates without being asked, such as traffic updates before a scheduled event. “Visual Responses” let users send answers from Google Home to their mobile device or Chromecast-enabled television. The device now also supports Bluetooth audio streaming from compatible devices (including phones, tablets and computers). It can also schedule calendar appointments, with upcoming support for reminders.

 

Amazon Alexa


Amazon announced Alexa alongside the Echo in November 2014. Alexa was inspired by the computer voice and conversational system on board the Starship Enterprise in the science fiction TV series and movies, beginning with Star Trek: The Original Series and Star Trek: The Next Generation. The name Alexa was chosen because the X is a hard consonant sound and can therefore be recognized with higher precision. The name is also said to be reminiscent of the Library of Alexandria, which is the same reason Amazon’s Alexa Internet uses it.

In June 2015, Amazon announced Alexa Fund, a program that would invest in companies making voice-control skills and technologies. The US$100 million fund has invested in companies including Ecobee, Orange Chef, Scout Alarm, Garageio, Toymail, MARA, and Mojio.

In 2016, the Alexa Prize was announced to advance the technology.

In May 2018, Amazon announced that Alexa would be included in all 35,000 new Lennar Corporation homes built that year. Alexa is now compatible with 20,000 devices and is used by more than 3,500 brands.

Alexa skills

Alexa offers weather reports provided by AccuWeather and news provided by TuneIn from a variety of sources including local radio stations, NPR, and ESPN. Additionally, Alexa-supported devices stream music from the owner’s Amazon Music accounts and have built-in support for Pandora and Spotify accounts.

Alexa can play music from streaming services such as Apple Music and Google Play Music from a phone or tablet. Alexa can manage voice-controlled alarms, timers, shopping and to-do lists, and can access Wikipedia articles.

Alexa devices will respond to questions about items in the user’s Google Calendar. Alexa’s question answering ability is partly powered by the Wolfram Language. When questions are asked, Alexa converts sound waves into text which allows it to gather information from various sources. Behind the scenes, the data gathered is then parsed by Wolfram’s technology to generate suitable and accurate answers.

As of November 2016, the Alexa Appstore had over 5,000 functions (“skills”) available for users to download, up from 1,000 functions in June 2016. Amazon has also partnered with a fellow technology company: Microsoft’s AI Cortana became available on Alexa-enabled devices in August 2018. Amazon also rolled out a new “Brief Mode” in which Alexa responds with a beeping sound rather than saying “Okay” to confirm receipt of a command.

Cortana


The name Cortana comes from a synthetic intelligence character in Microsoft’s game Halo. The US-specific voice belongs to Jen Taylor, who also voices the character in Halo.

In January 2015, Microsoft announced the availability of Cortana for Windows 10 desktops and mobile devices as part of merging Windows Phone into the operating system at large. The Android and iOS versions were officially released in December 2015, but only after an Android APK file containing Cortana had been leaked ahead of release. Microsoft has integrated Cortana into numerous products such as Microsoft Edge and Skype. Cortana can also be used to control the Xbox One as part of a universally designed Windows 10 update for the console.

In December 2016, Microsoft announced the preview of Calendar.help, a service that enabled people to delegate the scheduling of meetings to Cortana. Users interacted with Cortana by including her in email conversations. Cortana would then check people’s availability in Outlook Calendar or Google Calendar and work with others cc’d on the email to schedule the meeting. The service relied on automation and human-based computation.

In May 2017, Microsoft in collaboration with Harman Kardon announced INVOKE, a voice-activated speaker featuring Cortana. The premium speaker has a cylindrical design and offers 360-degree sound, the ability to make and receive calls with Skype, and all the other features currently available with Cortana.

On August 15, 2018, Cortana-Alexa integration went into public preview.

Cortana’s features

Cortana can set reminders, recognize natural voice without the requirement for keyboard input, and answer questions using information from the Bing search engine (e.g., current weather and traffic conditions, sports scores, biographies). Searches using Windows 10 will only be made with the Microsoft Bing search engine and all links will open with Microsoft Edge, except when a screen reader such as Narrator is being used, in which case the links will open in Internet Explorer. Cortana integrates with services such as Foursquare to provide restaurant and local attraction recommendations.

On February 16, 2018, Microsoft announced that connected home skills were being added for ecobee, Honeywell Lyric, Honeywell Total Connect Comfort, LIFX, TP-Link Kasa, and Geeni. Support for IFTTT was also added.

Notebook

Cortana stores personal information such as interests, location data, reminders, and contacts in the “Notebook”. It can draw upon and add to this data to learn a user’s specific patterns and behaviors. Users can view and specify what information is collected to allow some control over privacy. The function is said to boast “a level of control that goes beyond comparable assistants”. Users can delete information from the “Notebook”.

Reminders

Cortana has a built-in system of reminders which, for example, can be associated with a specific contact; it will then remind the user when in communication with that contact, possibly at a specific time or when the phone is in a specific location. Originally these reminders were specific to the device Cortana was installed on, but now Windows 10 synchronizes reminders across devices.

Phone notification syncing

On Windows Mobile and Android, Cortana is capable of capturing device notifications and sending them to a Windows 10 device. This allows a computer user to view notifications from their phone in the Windows 10 Action Center.

 

Performance

 Based on research from Stone Temple, the Google Assistant delivers a high degree of answering capability and accuracy. The test performances of Alexa and Cortana are close to that of Google Assistant, but Siri lags somewhat behind. 

Digital Assistant              # correct answers   % correct answers
Google Assistant Smartphone    3639                74.6
Cortana                        2944                59.5
Google Assistant Home          2868                58.0
Alexa                          2195                44.3
Siri                           1616                32.7

  

Three ways to get started

 

Mobile device

The easiest way to get started is by using an iOS or Android device, such as a smartphone, tablet, laptop, or wearable. Microsoft looks unlikely to continue developing Windows 10 Mobile and will instead focus on offering Cortana on Android and iOS.

Voice Assistant    Available devices
Siri               iOS 9 or later; macOS later than 10.14 Mojave
Google             Android 5.0 and up; iOS 10 or later
Alexa              Fire OS 3.0 or higher; Android 5.0 or higher; iOS 9.0 or later
Cortana            Windows 10; iOS 9.0 or later; Android 4.4 and up

    

Smart Speakers

 A more direct and simpler approach is buying a smart speaker. Smart speakers are essentially special speakers which can offer more than just sound. They may allow control via voice commands or facilitate control of another device once it is connected. They need a Wi-Fi connection (or full internet connection for remote access), and most of them require an external power supply.

 There are two main differences between mobile devices and smart speakers:

  1. You need to keep your smartphone or tablet close to you, but a smart speaker can hear your voice from much farther away.

  2. A smart speaker can act as a central-control device that connects to other voice-enabled devices. You can then operate them by voice command, even when you are far away from home.

Apple speaker

 

Name             Description                                                                        Price
Apple HomePod    Deep bass, high-fidelity audio, an incredible listener, and smart home control    $349.00

 

Google home speakers

Name                Description                                                                                        Price
Google Home         Get answers, play songs, tackle your day, enjoy your entertainment and control your smart home    $129
Google Home Mini    Easy to place anywhere                                                                             $49
Google Home Max     Large size with high-quality sound                                                                 $399
Google Home Hub     Voice control with answers shown on screen                                                         $149

 

Alexa speakers

Table setup not completed.

Cortana speaker

 

Smart home device

The last option is to use a device with a built-in voice assistant. There are many such devices, like the ecobee4 or the Bose Soundbar 500. Both have Alexa built in, so they can be used independently of a mobile device or speaker. You can also use them as supplements to extend the range over which your voice is heard. The potential risk is that a third-party device may have limited functionality compared to a smart speaker, depending on the manufacturer’s design.

 First, let’s talk about how to tell the built-in device from an enabled device.

The difference between Built-in and Enabled devices

Built-in device

Built-in devices are third-party devices that use voice-assistant technology from another brand. The voice assistant’s service is pre-installed, so they are truly “plug and play”. They always have microphone and speaker components so they can “hear and speak”. A built-in device is usually an exclusive product that only works with a specific voice assistant.

Enabled device

Unlike the built-in device, the enabled device can’t “hear” or “speak” for you. You need to connect it to your central-control device – like a smart speaker – to get it to work for you. This kind of device is usually labeled as “Works with XXXX” or “XXXX enabled”.
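
To make the difference concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not how any real assistant is implemented: the class names, the method names and the device names are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EnabledDevice:
    """Hypothetical "Works with XXXX" device: no microphone or speaker of its own."""
    name: str
    works_with: tuple  # assistants this device is labeled as compatible with

    def execute(self, command: str) -> str:
        # An enabled device only acts on commands relayed to it.
        return f"{self.name}: {command}"


@dataclass
class BuiltInDevice:
    """Hypothetical device with a pre-installed assistant, microphone and speaker."""
    name: str
    assistant: str
    paired: list = field(default_factory=list)

    def pair(self, device: EnabledDevice) -> None:
        # Pairing only makes sense if the enabled device supports this assistant.
        if self.assistant not in device.works_with:
            raise ValueError(f"{device.name} does not work with {self.assistant}")
        self.paired.append(device)

    def handle_voice_command(self, target: str, command: str) -> str:
        # The built-in device "hears" the command and relays it to the target.
        for device in self.paired:
            if device.name == target:
                return device.execute(command)
        return f"{self.name}: no paired device named {target!r}"


speaker = BuiltInDevice("speaker", "Alexa")
lamp = EnabledDevice("lamp", ("Alexa", "Google Assistant"))
speaker.pair(lamp)
print(speaker.handle_voice_command("lamp", "turn on"))  # lamp: turn on
```

The point of the sketch is that the lamp never listens for anything itself: it only reacts once a device that can hear you passes the command along.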

Typical built-in devices

The built-in voice assistant devices essentially offer a fully integrated voice assistant and are endorsed and marketed by well-known brands. With an integrated package, they provide the voice assistant’s full range of services. Users can also expect a high-quality experience in connecting with the device.

Popular enabled devices

Enabled devices can’t respond to voice commands unless they are connected to a device with a built-in voice assistant, but they may be compatible with several different voice assistants.

Third-party manufacturers are increasingly deploying voice assistants in their devices. This has two advantages:

  1. Devices can interact better with their users;
  2. Manufacturers can get endorsements or certifications from well-known brands, which helps them market their products.

 Bottom line

 Voice assistants still have a long way to go before they will be perfect for daily use. Currently they do not always provide correct responses, although they are getting smarter day by day. Personally, I would like to give one a try – maybe I can learn some useful tricks to make my work easier.

The performance of a voice assistant is determined not only by its own software but also by the hardware. More specifically, to be really hands-free in your home or office, you need to place enough smart speakers or built-in devices around it, then replace each traditional device with a voice-enabled one. That is certain to cost a small fortune, even for a small house. The best approach is to start with what you most want to improve, choose the voice-enabled device first, and then consider its compatibility with a voice assistant.

Siri is an easy choice if you already have a lot of Apple devices. Although its accuracy lags behind the others, it is good enough for home use.

 Google Assistant may be the best choice today, and Google has a good ecosystem, including a lot of its own and third-party products.

Like Google, Alexa also has a strong ecosystem, and its Echo series provides a much wider choice for different scenarios.

Cortana has only just joined the game, and a lot of third-party devices don’t support it yet. Although I must admit that using Office by voice is tempting, I would not recommend choosing Cortana to create a smart home: there are too few devices in its ecosystem.