Development
Speech recognition has been hailed as the next big thing for almost a decade now, and the technology has finally reached the point where it can comfortably keep up with conversational speeds - 120 words per minute (wpm) if you're American, 160 if you're British. By comparison, it takes a typing speed of at least 40 wpm to get a secretarial job, and most untrained typists struggle along at 15 to 20 wpm. The latest software can be trained by the user in only a few minutes to an accuracy of 92 percent. (That assumes you have a late-model PC, plus a fancy digital microphone - sources say you are unlikely to get above 90 percent accuracy with one of the cheap microphone headsets that plug directly into a PC's audio card.) During subsequent use, the user and the software adjust to each other, so that accuracy can eventually rise to about 98 percent - if the user has the patience to push through that period. Sources say that as many as a third do not.
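To put those accuracy figures in perspective, here is a rough back-of-the-envelope sketch. It assumes that "accuracy" means the share of words recognized correctly and that errors are spread evenly through the dictation. Neither assumption comes from the vendors, and the function names are purely illustrative.

```python
def errors_per_minute(words_per_minute, accuracy):
    """Expected misrecognized words per minute of dictation.

    Assumes accuracy is a per-word rate between 0 and 1 (an assumption,
    not a vendor definition).
    """
    return words_per_minute * (1.0 - accuracy)


def correct_words_per_minute(words_per_minute, accuracy):
    """Words per minute that come out right under the same assumption."""
    return words_per_minute * accuracy


if __name__ == "__main__":
    # Speaking rates and accuracy levels quoted in the article.
    for label, wpm in (("American", 120), ("British", 160)):
        for accuracy in (0.90, 0.92, 0.98):
            errs = errors_per_minute(wpm, accuracy)
            good = correct_words_per_minute(wpm, accuracy)
            print(f"{label} ({wpm} wpm) at {accuracy:.0%}: "
                  f"{errs:.1f} errors/min, {good:.1f} correct wpm")
```

Under those assumptions, 120 wpm at 92 percent accuracy works out to roughly 10 misrecognized words a minute, against two or three at 98 percent, which may help explain why some users give up before the software has adjusted to them.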
Software that can follow continuous conversational speech dates to 1997, when Dragon Systems introduced its NaturallySpeaking package. Its main competitors were IBM, Kurzweil Applied Intelligence, and the Belgian firm Lernout & Hauspie. Lernout & Hauspie routinely reported healthy sales growth while all its competitors saw only a flat market. Its apparently disproportionate success drove the firm's stock price higher and higher, and it was able to buy Kurzweil and later, in 2000, Dragon Systems. It also bought Dictaphone Inc. at about the same time. But the latter two purchases exposed Lernout & Hauspie to American financial reporting standards, and the company collapsed a few months later in an enormous accounting scandal. Its accumulated speech recognition technology was later sold to ScanSoft Inc. of Peabody, Massachusetts, a company involved in scanner software.
Not a lot of choice
So by the start of 2003 there were only three speech recognition choices for the PC on the mass market: Dragon NaturallySpeaking, now sold by ScanSoft; ViaVoice from IBM; and Microsoft Word XP, which includes some speech recognition facilities. Experts typically scoffed at Word XP, with its limited features. Meanwhile, they feared IBM would have no reason to enhance ViaVoice if its chief competitor was a forgotten stepchild product. So they were thrilled when ScanSoft came out with a new version of Dragon NaturallySpeaking earlier this year. IBM soon followed with a new revision of ViaVoice.
Then came the announcement from IBM - it was selling the republishing rights for ViaVoice to ScanSoft. 'We will be reselling ViaVoice,' said Robert Weideman, chief marketing officer at ScanSoft. 'We will own the packaging, marketing, distribution and support for the product worldwide. IBM will still own the right to develop ViaVoice for other products. It's not accurate to say we own ViaVoice - this is a seven-year agreement.' Dragon NaturallySpeaking will remain ScanSoft's premier speech recognition product, he indicated. But ScanSoft will sell ViaVoice in markets not served by Dragon NaturallySpeaking, especially the Macintosh market, and in certain international markets where ViaVoice has been especially strong, such as Germany and Asia. Also, ViaVoice comes in stripped-down versions starting at about $30, and Weideman indicated that ScanSoft might begin pushing them through the mass merchandise channel.
But IBM stays
Meanwhile, Brian Garr, IBM's manager for voice and translation software, indicated that IBM is in no way leaving the speech recognition market. However, it is focusing on server-based products, such as those used by corporate telephone call centers, and on embedded devices, including ones used in luxury cars. IBM is also exploring new fields, such as combined speech recognition and translation, where you speak in one language and the text comes out in another. The company also has a long-term program to increase the accuracy of machine speech recognition by a factor of 10, making it at least as accurate as a human listener. Weideman at ScanSoft predicted that speaker-independent speech recognition for the desktop will be available within two years. In other words, no training at all will be necessary, and the system will understand your voice from the moment you start talking - just like the systems used by airline call centers, where the machine asks which flight you want the schedule for. In the meantime, however, speech recognition remains a niche market of interest mostly to doctors, lawyers, and certain other professionals who are accustomed to dictating. Weideman estimates that the worldwide market is about $100 million. In the U.S. retail market, total sales of voice recognition software are about one-twentieth those of virus protection software, according to The NPD Group, a market research firm.
Lamont Wood