[maemo-developers] One question about speech2text poor performance
From: Graham Cobb g+770 at cobb.uk.net
Date: Tue Jan 15 21:59:29 EET 2008

On Tuesday 15 January 2008 19:01:11 Mike Klein wrote:
> History lesson: 1Mhz Apple][ had >90% voice recognition
> ability...program was written by Bill Budge I believe.
>
> The resources have to be there.

Back in the days when I knew something (not much) about speech-rec the issue was not CPU power, nor was it much about clever software. The biggest differentiator between products was the amount of speech sample data the vendor had access to. The more speech samples, the better the product.

If the solution needed training, it could be quite cheap -- cheap enough to be usable for disabled users to control computers, for example. But people hate training, and untrained solutions were extremely expensive. At that time (about 10 years ago) what you were paying for when you bought a commercial speech-rec solution was the amount of money the vendor had spent in collecting samples: paying students, housewives, manual labourers, executives, children, etc., etc., everywhere the language was spoken, to collect many, many samples of all the words they needed to recognise. It cost a lot of money and, not surprisingly, the people who collected and owned that data wanted lots of money to provide it.

What was most noticeable was that recognition rates were dependent on the economic value of the language (not on CPU power or anything like that). American English was quite well recognised. French significantly less so. Dutch less still. And solutions were just not available at all for anything outside the top few languages (about 5-6 at that time, all Western European).

Things may have improved in the last few years, but my guess is that until there is a Wikipedia-style project to allow people to contribute free speech samples, there is unlikely to be very good open source speech-rec: not because of the software but because of the speech samples.

Oh, and by the way, if anyone wanted to volunteer to set up a website to collect samples, please be aware that it is a very complex task: consult a speech-rec engineer before even thinking about it. For example, it is necessary to review and process all the samples (still a human ear process as far as I know), and it is critical that the samples are tagged with information about the voice (language, dialect, sex, location, age, etc.) and recording details (how was it acquired: in a recording studio or through a mobile phone? what codec? etc.).

Of course, none of that necessarily explains why Nokia chose not to put (closed source) speech-rec in the IT that is at least as good as that on their phones, although I would guess that licence fees to their suppliers may be part of that.

Graham
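
As a rough illustration of the tagging described above, here is a minimal sketch of what one sample record might look like. It covers only the kinds of information the message lists. The field names, the example values and the `missing_tags`/`usable` helpers are all hypothetical, and a real collection project would need a schema designed by a speech-rec engineer.

```python
from dataclasses import dataclass, asdict

# Hypothetical tag names; the message only lists the *kinds* of
# information needed, not an actual schema.
REQUIRED_TAGS = ("language", "dialect", "sex", "location", "age",
                 "acquisition", "codec")


@dataclass
class SpeechSample:
    transcript: str         # the word or phrase that was spoken
    language: str
    dialect: str
    sex: str
    location: str
    age: int
    acquisition: str        # e.g. "studio" or "mobile phone"
    codec: str
    reviewed: bool = False  # set once a human has listened to the sample


def missing_tags(sample):
    """Return the names of required tags that are empty or unset."""
    data = asdict(sample)
    return [name for name in REQUIRED_TAGS if data[name] in ("", None)]


def usable(sample):
    """A sample counts as usable only if reviewed and fully tagged."""
    return sample.reviewed and not missing_tags(sample)
```

A collection site could reject or queue any upload for which `missing_tags` returns a non-empty list, and release samples only once `usable` is true.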