Text-to-speech applications (opinion)

Early one morning last week, I found myself stretched out on a gurney and looking at the ceiling of an operating room, with a nurse coming around every few minutes to put drops in my right eye. It and my right ear have been getting worse and worse at their jobs for a long time now, though the joint decline (likely a matter of genetics, according to a doctor) has been accelerating since the middle of the last decade.

By then, I had written a little on disability scholarship -- out of intellectual curiosity, though, rather than any feeling of personal relevance. But there was no getting around the fact of being all but deaf on one side. You can only joke about needing an ear horn so many times. The eyeglasses were not preventing strain headaches, and, in any case, I kept thinking of that early episode of The Twilight Zone with Burgess Meredith as a bookworm whose postapocalyptic reading plans are so cruelly shattered, along with his spectacles.

That these were the shallower depths of disability is confirmed by being able to refer to them in the past tense, thanks to a hearing aid and cataract surgery. I am now a plausible simulacrum of able-bodiedness (for however much longer). But for several days, reading was either proscribed by the ophthalmologist or too physically disagreeable to take in more than headlines and streaming-video titles. The essayist's job is to ruminate, though preferably after chewing new cud. After a couple of days, looking into text-to-speech applications crossed my mind as preferable to going stir-crazy.

Audiobooks and podcasts galore are available, of course, making good use of human vocal talent, but keeping up with articles online or published in PDF is another matter. For that, you need a robot, or an app that will probably sound like one.

So for a couple of days, I tested numerous text-to-speech programs for both laptop and e-reader in search of one with a vocal quality that would not make HAL in 2001: A Space Odyssey seem like a graduate of the Royal Academy of Dramatic Art by contrast. I kept notes on the available features and how well the apps performed, with some thought of preparing a Consumer Reports-ish article.

But in the overwhelming majority of cases, the evaluation ranged from "somewhat robotic, tolerable" to "extremely robotic, barely possible to follow it." Either programmers have (a) applied their talents to developing what is, for all practical purposes, the same voice or (b) repackaged much the same software with a few added features here and there at various prices.

Sometimes you have the option to export the text as an audio file. An app may be able to read PDFs aloud; if so, it will pronounce all the download information at the bottom of a page before continuing with the text itself. Some apps will let you pause the reading and then resume, while others automatically return to the start of the document. Over all, it tends to be a choice between more or less irritating and more or less expensive. The text-to-speech capacities built into the Safari browser or the reading application Pocket are par for the course, with the slight advantage of being free to users.

But a couple of text-to-speech applications did stand out from the rest and have already become part of my workflow. One is Read Aloud, an extension available for both the Chrome and Firefox browsers. The user can choose from a considerable range of voices and accents, and the speed, pitch and volume can each be adjusted. The option of pausing and resuming is not available, as far as I can tell, but you can use the cursor to select where to start reading. It's unlikely anyone would confuse Read Aloud with a live human, but the timbre and pronunciation are close enough to be comprehensible and, just as important, tolerable for listening to something the length of, say, a New York Review of Books article. Written and video instructions on how to add the Read Aloud extension to Chrome or Firefox are readily available via search engine.

Read Aloud is free, while Voice Dream -- available for iPad and iPhone -- sells through Apple's App Store starting at about $20. To be clear about this, I have no relationship to Apple or to Voice Dream's creators and probably would not have risked buying it if not for the recommendation from a friend who told me he got the app several years ago and still uses it almost daily. That last detail counted for a lot.

Voice Dream can handle material from a variety of sources -- websites, text files, articles saved on Pocket and various other apps, and at least a couple of things I haven't encountered before. It reads PDFs with the same glitch as other apps I've tried: an inability to tell where the text leaves off and bottom-of-the-page stuff (footnotes, author bio, source HTML, etc.) begins. Text-to-speech programs supplemented with artificial intelligence will be able to handle that, sooner or later, along with the peculiar and treacherous nature of English-language orthography's relationship to phonetics. An app that can recognize the difference between "project" as a noun and a verb, for example, will be that much closer to the inner voice of the reader.
