Wednesday, January 27, 2016

My Smartphone Hears Voices - And Transcribes Them

It is said that idiots are easily amused. By that criterion I am one. I am easily amused by the degree to which speech recognition software has improved from my college days 47 years ago to the present. It still has a few problems even today, but I am managing to write this post using the Google speech-to-text software built into this phone.

And that pleases me more than you can imagine, or at least more than words can express. I am having fun with a technology for the first time in years!

ANNOTATION from the desktop's keyboard: my dictation was halting, as you might expect of someone who is not by training or nature a dictator. But there were only two errors I could not easily correct by speaking rather than typing: one was the sentence "I am one"; the s/w was absolutely determined to change the "one" to the digit "1" no matter what I did. One time it even visibly transcribed it as "one" and then visibly changed it to "1". (sigh!) The other had to do with inserting explicit newlines. The s/w recognizes some punctuation and a few formatting characters when their names are spoken, and "newline" is one of them, but something about my Texas accent must have thrown off the recognition algorithm: sometimes it inserted a Unicode/ASCII newline character; sometimes it rendered it "Near line" or something even more unrecognizable. Even so, there were no more transcription errors than I learned to expect from the cheeky and uncooperative keypunch operator in my first job right out of college. (sigh again!)


  1. Omg, my daughter discovered this with Google talk and has written a four-page paper with it! Soon we will lose the ability to write longhand and to use the keyboard, and soon we will do everything with solar-powered gloves or something... The future is amazing!

    1. Your daughter should beware of "bare" where she meant "bear," "damn" in place of "dam," etc.; there are dozens of possibilities.

      Setting the flag that prevents profanities/obscenities helps some, but not in all cases. If you want "wright" you can expect to do battle with the s/w over "right" or "write"; you'll probably have to correct it from the keyboard when you do your final editing. I still do not know how the dam thing expects me to pronounce "newline," and I've fought with it, loaded for bare.

    2. I suppose the main thing to realize is that the s/w does not understand your speech; it recognizes the sounds you make and transcribes them. To some extent, it probably applies rules to choose among transcription possibilities, but until desktop processors are powerful enough to render transcribed sounds into language deep structure, we're not going to get, in our tiny portable devices, machine transcription as reliable as human transcription by someone who understands the language. Supercomputers probably already do that, and may even do respectable machine translation between two human languages, but don't expect NSA to tell us peons.

      I'll never forget the stupid joke going around in my college days about the English-German, German-English machine translator: according to the joke, a round-trip of "The spirit is willing but the flesh is weak" yielded "The brandy is good but the meat is rotten."

    3. BTW, I tested "more than I could bear," and Google transcribed it correctly. Also "Barenaked Ladies," and it fielded that one correctly as well.
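
The guess above, that the s/w "applies rules to choose among transcription possibilities," can be illustrated with a toy sketch. Everything in it is an assumption made up for illustration: the homophone table, the pair counts, and the function names `choose_spelling` and `transcribe` are hypothetical, and nothing here describes how Google's recognizer actually works. It only shows how looking at the preceding word could settle "bear" versus "bare."

```python
# Toy illustration only: the homophone sets and context counts below are
# invented for this sketch; they are not taken from any real recognizer.
HOMOPHONES = {
    "bear": ["bear", "bare"],
    "bare": ["bear", "bare"],
    "dam": ["dam", "damn"],
    "damn": ["dam", "damn"],
}

# Hypothetical counts of how often a (previous word, candidate) pair occurs.
PAIR_COUNTS = {
    ("could", "bear"): 9,
    ("could", "bare"): 1,
    ("for", "bear"): 8,
    ("for", "bare"): 1,
    ("the", "dam"): 7,
    ("the", "damn"): 3,
}


def choose_spelling(previous_word, heard_word):
    """Pick the candidate spelling seen most often after previous_word."""
    candidates = HOMOPHONES.get(heard_word, [heard_word])
    # Unknown pairs count as zero; ties fall back to the first candidate.
    return max(candidates, key=lambda c: PAIR_COUNTS.get((previous_word, c), 0))


def transcribe(words):
    """Re-spell each word using only the word immediately before it."""
    result = []
    for word in words:
        previous = result[-1] if result else ""
        result.append(choose_spelling(previous, word))
    return result


print(" ".join(transcribe("more than i could bare".split())))
```

A one-word lookback like this has no idea what the sentence means, which is the point of the comment above: it would "correct" a deliberate pun just as readily as a real mistake.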



