Why Is It So Hard to Make a Computer Talk Like a Human?

The next generation of computerized voices has to be human enough to connect with but not so human that we feel we’re being lied to. That’s no small feat.

Starre Julia Vartan
OneZero

--

Illustration: Will Harvey

When our machines first began speaking to us, it was in the simple language of children. Some of those voices were even designed for kids: my Speak & Spell was a box with a handle and a tiny green screen that tested my skills in a grating tone, but I still heard that voice sometimes in my dreams. Teddy Ruxpin’s words played from cassette tapes popped into his back, but his mouth moved at just the right cadence, which made him feel almost alive. At least to a kid.

For adults, however, the clunky computerized voices of the 1980s, ’90s, and early aughts were far from real. When the train’s voice announced that the next stop was Port Chester, using two words instead of “porchester,” we knew: That was a machine. It could not know that we New Yorkers pronounced this place as one word, not two. It was simple: A voice that sounded human was a person; a voice that sounded like a machine was a machine.

This was fine when all we needed were announcements that were basic, short phrases. But if there is a fire on the train, we all instinctively want to hear…
