[maemo-developers] One question about speech2text poor performance

Thu Jan 17 00:22:48 EET 2008

David Huggins-Daines wrote:
> Yes, this is exactly the case.  Recognizing a limited set of names in 
> isolation is not at all computationally intensive compared to 
> recognizing full sentences of connected words.
>
> See: http://en.wikipedia.org/wiki/Dynamic_time_warping
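
To make the cost difference concrete, here is a minimal sketch of
isolated-word matching with dynamic time warping. It is only an
illustration, not the code running on any Nokia device: the function
names (dtw_distance, recognize) and the toy one-dimensional features are
made up for this example, and a real recognizer would compare frames of
acoustic features such as cepstral vectors instead.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    Runs in O(len(a) * len(b)) time. Frames are compared with the
    Euclidean distance (an assumption made for this sketch).
    """
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])))
            # Allow a match, or stretching either sequence by one frame.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the name of the stored template closest to the utterance."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))


# Hypothetical templates: one recorded example per name.
templates = {
    "alice": [(0.0,), (1.0,), (2.0,)],
    "bob": [(5.0,), (5.0,), (4.0,)],
}
# A slower rendition of "alice": same shape, stretched in time.
utterance = [(0.0,), (0.0,), (1.0,), (2.0,), (2.0,)]
best = recognize(utterance, templates)
```

In this sketch the work is one small table per stored name, so it grows
only linearly with the size of the name list. Connected speech has no
such fixed list of whole-utterance templates to compare against, which
is where the extra computation comes from.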

Also, Nokia has invested quite a lot of research into larger-vocabulary 
speech recognition on their phones.  They have only recently managed 
isolated-word SMS dictation with a 22,000-word vocabulary on an S60 2nd 
Edition phone (sorry, abstract only):

http://portal.acm.org/citation.cfm?id=1180995.1181020

This is still a less complex problem than recognizing connected speech.

That said, I am still working on real-time 5,000-word connected dictation 
on the N800/N810.  I've succeeded in offloading some of the computation 
to the DSP, and the next step is to implement model compression 
techniques similar to the ones mentioned in that paper.