What is Voice Recognition?
“Voice recognition is the process of taking the spoken word as an input to a computer program,” writes Jim Baumann of the University of Washington. As Gary Pearson, co-founder of Verbyx, explains, voice recognition relies on two components to achieve its reported accuracy levels: a language model and an acoustic model. Together, these two models provide an internal representation of how people from a specific country or region speak a specific language.
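To make the interplay of the two models concrete, here is a minimal Python sketch. It assumes an acoustic model that scores how well a sound matches each candidate word and a language model that scores how likely each word is after the previous one. All names and probabilities here (`ACOUSTIC`, `BIGRAM`, `pick_word`, `lm_weight`) are invented for illustration and do not come from Verbyx or any real recognizer.

```python
import math

# Hypothetical acoustic scores: P(audio | word) for one ambiguous sound.
# The three homophones sound nearly identical, so the scores are close.
ACOUSTIC = {"two": 0.34, "to": 0.33, "too": 0.33}

# Hypothetical bigram language model: P(word | previous word).
BIGRAM = {
    ("want", "to"): 0.60,
    ("want", "two"): 0.05,
    ("want", "too"): 0.01,
    ("number", "two"): 0.30,
    ("number", "to"): 0.02,
    ("number", "too"): 0.01,
}


def pick_word(previous_word, lm_weight=1.0):
    """Return the candidate with the best combined log-score."""

    def score(word):
        # Unseen word pairs get a tiny floor probability instead of zero.
        lm = BIGRAM.get((previous_word, word), 1e-6)
        return math.log(ACOUSTIC[word]) + lm_weight * math.log(lm)

    return max(ACOUSTIC, key=score)


print(pick_word("want"))                 # language model favors "to"
print(pick_word("number"))               # language model favors "two"
print(pick_word("want", lm_weight=0.0))  # acoustic score alone decides
```

In this toy setup the acoustic scores alone barely separate the candidates, so the surrounding words decide the outcome. A speaker whose accent or phrasing differs from what the models expect shifts both sets of scores.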
Because voice recognition software relies on generalizations and rough approximations, individual variations in accent, tone of voice, pitch, and so on affect its accuracy and reliability. Some users will have almost no issues with certain voice-controlled devices, while others will fall on the opposite end of the accuracy spectrum.
Disadvantages of Voice Control
As Carol Finch writes, “Programs cannot understand the context of language the way that humans can, leading to errors that are often due to misinterpretation.” People are surprisingly good at filling in missing information and subconsciously correcting for speakers’ errors. Homonyms, complex deixis, and even the omission of entire words or phrases seldom prevent us from understanding one another. While modern AI-powered voice-control systems are much better than the technology from 10 years ago, truly natural communication with real-time feedback is still out of reach.
Errors also take time to correct. This can turn a quick Google search into a minute-long ordeal, which may not seem like much until you add up the extra time over weeks and months. “Most of the time it really would be just as easy to press the button for the desired action (or macro of commands) on a button panel or graphical user interface. Saying the voice command, waiting for it to be acknowledged and the command sent is simply slower than pressing a button,” writes Aaron Green.
Another major disadvantage of voice control compared with graphical user interfaces is background noise interference. For voice control systems to work properly, you need to be in a quiet environment, free of ambient noise and other people talking. Such conditions are not always possible to achieve, although headsets with noise-cancelling microphones do help to some extent.
Despite the obvious shortcomings of voice control systems, Vlad Sejnoha, chief technology officer of Nuance Communications, a Burlington-based company that dominates the market for speech recognition with its Dragon software, believes that “within a few years, mobile voice interfaces will be much more pervasive and powerful,” according to MIT Technology Review. “I should just be able to talk to it without touching it,” Sejnoha says.