22 June 2010

Coming soon: Google Voice Search

By Ken Fisher

The master of text-based search is looking to lend a voice to Internet users everywhere, or so it appears based on Google's latest patent. Patent #7,027,987, issued today by the US Patent and Trademark Office, covers a "Voice interface for a search engine," which is described as:

"A system provides search results from a voice search query. The system receives a voice search query from a user, derives one or more recognition hypotheses, each being associated with a weight, from the voice search query, and constructs a weighted boolean query using the recognition hypotheses. The system then provides the weighted boolean query to a search system and provides the results of the search system to a user."

Translation: the system listens to your spoken query, does its magic, and returns the results.

Google has not recently commented on this voice search effort, although the company's Alexander Franz did co-author an article on the topic back in 2002 (PDF). Nevertheless, it is clear that this service would be ideal for users of Google's mobile search. In fact, voice recognition could power Google's mobile search right into competition with local 411 services.

And while those 411 services and other voice-to-text providers are working on their own voice-powered systems, Google looks to leapfrog the competition by attempting to support a wide-ranging voice vocabulary. According to the patent itself, existing solutions often require multiple steps to make voice queries manageable, at times foisting limited vocabulary support onto users. A system may, for instance, require the user to respond to specific voice queries with a limited set of options pre-determined by the system.

"Current speech recognition technology has high word error rates for large vocabulary sizes. There is very little repetition in queries, providing little information that could be used to guide the speech recognizer. In other speech recognition applications, the recognizer can use context, such as a dialogue history, to set up certain expectations and guide the recognition. Voice search queries lack such context. Voice queries can be very short (on the order of only a few words or single word), so there is very little information in the utterance itself upon which to make a voice recognition determination."

Google's system is aimed at making the voice-based search process more like a standard text-based search query, where the search engine itself attempts to provide the most relevant results with as little interaction with the end user as possible. The key to this is the weighted approach. By using an algorithm to weight its candidate reconstructions of users' queries, the system looks to tap into the Google search system in order to increase the accuracy of its voice recognition.
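
For the curious, here is a rough Python sketch of what "constructing a weighted boolean query from recognition hypotheses" might look like. It is an illustration only, not the method in the patent: the function name `build_weighted_query`, the `term^weight` notation, the sample hypotheses, and the rule of summing the weights of every hypothesis that contains a term are all assumptions made for the example.

```python
from collections import defaultdict


def build_weighted_query(hypotheses):
    """Combine (text, weight) recognition hypotheses into one weighted OR query.

    Hypothetical scheme: each distinct term gets the sum of the weights of the
    hypotheses that contain it, and the weighted terms are joined with OR.
    """
    term_weights = defaultdict(float)
    for text, weight in hypotheses:
        # Count a term once per hypothesis, even if it repeats within it.
        for term in set(text.lower().split()):
            term_weights[term] += weight
    # Highest weight first; ties are broken alphabetically for a stable order.
    ranked = sorted(term_weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return " OR ".join(f"{term}^{weight:.2f}" for term, weight in ranked)


# Made-up recognizer output for one spoken query, with made-up confidences.
hypotheses = [
    ("wicked smart", 0.5),
    ("wicked mart", 0.3),
    ("wicket smart", 0.2),
]
query = build_weighted_query(hypotheses)
print(query)
```

The point of the sketch is that the recognizer never has to commit to a single transcription. Terms that show up in several plausible hypotheses end up with more weight, and the search engine's own ranking can then help sort out which reading the user most likely meant.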

Can it handle Massachusetts accents, though? That would be wicked smaht.
