The Defense Department, through its Defense Advanced Research Projects Agency (DARPA), started funding academic and commercial research into speech recognition in the early 1970s.

What emerged were several systems to turn speech into text, all of which slowly but steadily improved as they were able to work with more data and at faster speeds.

In a brief interview, Dan Kaufman, director of DARPA’s Information Innovation Office, indicated that the government’s ability to automate transcription is still limited.


Experts in speech recognition say that in the last decade or so, the pace of technological improvement has been explosive. As information storage became cheaper and more efficient, technology companies were able to store massive amounts of voice data on their servers, allowing them to continually update and improve the models. Enormous processors, tuned as “deep neural networks” that detect patterns like human brains do, produce much cleaner transcripts.

And the Snowden documents show that the same kinds of leaps forward seen in commercial speech-to-text products have also been happening in secret at the NSA, fueled by the agency’s singular access to astronomical processing power and its own vast data archives.

In fact, the NSA has been repeatedly releasing new and improved speech recognition systems for more than a decade.

From Dan Froomkin, writing for The Intercept on how the NSA converts spoken words into searchable text.

