Ms. Tech

An AI that writes convincing prose risks mass-producing fake news

Fed with billions of words, this algorithm creates convincing articles and shows how AI could be used to fool people on a mass scale.

Feb 14, 2019

Here’s some breaking fake news …

Russia has declared war on the United States after Donald Trump accidentally fired a missile in the air.

Russia said it had “identified the missile’s trajectory and will take necessary measures to ensure the security of the Russian population and the country’s strategic nuclear forces.” The White House said it was “extremely concerned by the Russian violation” of a treaty banning intermediate-range ballistic missiles.

The US and Russia have had an uneasy relationship since 2014, when Moscow annexed Ukraine’s Crimea region and backed separatists in eastern Ukraine.

That story is, in fact, not only fake, but a troubling example of just how good AI is getting at fooling us. 

That’s because it wasn’t written by a person; it was auto-generated by an algorithm fed the words “Russia has declared war on the United States after Donald Trump accidentally …”

The program made the rest of the story up on its own. And it can make up realistic-seeming news reports on any topic you give it. The program was developed by a team at OpenAI, a research institute based in San Francisco.

The researchers set out to develop a general-purpose language algorithm, trained on a vast amount of text from the web, that would be capable of translating text, answering questions, and performing other useful tasks. But they soon grew concerned about the potential for abuse. “We started testing it, and quickly discovered it’s possible to generate malicious-esque content quite easily,” says Jack Clark, policy director at OpenAI.

Clark says the program hints at how AI might be used to automate the generation of convincing fake news, social-media posts, or other text content. Such a tool could spew out climate-denying news reports or scandalous exposés during an election. Fake news is already a problem, but if it were automated, it might be harder to tune out. Perhaps it could be optimized for particular demographics—or even individuals. 

Clark says it may not be long before AI can reliably produce fake stories, bogus tweets, or duplicitous comments that are even more convincing. “It’s very clear that if this technology matures—and I’d give it one or two years—it could be used for disinformation or propaganda,” he says. “We’re trying to get ahead of this.”

Such technology could have beneficial uses, including summarizing text or improving the conversational skills of chatbots. Clark says he has even used the tool to generate passages in short science fiction stories with surprising success.

OpenAI does fundamental AI research but also plays an active role in highlighting the potential risks of artificial intelligence. The organization was involved with a 2018 report on the risks of AI, including opportunities for misinformation (see “These are the ‘Black Mirror’ Scenarios that are leading some experts to call for secrecy on AI”).

The OpenAI algorithm is not always convincing to the discerning reader. A lot of the time, when given a prompt, it produces superficially coherent gibberish or text that clearly seems to have been cribbed from online news sources.

It is, however, often remarkably good at producing realistic text, and it reflects recent advances in applying machine learning to language.

OpenAI made the text generation tool available for MIT Technology Review to test but, because of concerns about how the technology might be misused, will make only a simplified version publicly available. The institute is publishing a research paper outlining the work. 

Progress in artificial intelligence is gradually helping machines gain a better grasp of language. Recent advances have come from feeding general-purpose machine-learning algorithms very large amounts of text. The OpenAI program takes this to a new level: the system was fed 45 million pages from the web, chosen via the website Reddit. And in contrast to most language algorithms, the OpenAI program does not require labeled or curated text. It simply learns to recognize patterns in the data it’s fed.
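
To make the idea of learning from unlabeled text concrete, here is a toy sketch in Python. It is not OpenAI's method, which is far larger and more sophisticated. It is a simple word-pair model, chosen only because it fits in a few lines. The function names `train_bigram_model` and `continue_prompt` and the three-sentence corpus are invented for this illustration. The program counts which word follows which in raw text, then extends a prompt by repeatedly picking a word that followed the previous one.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for each word in raw unlabeled text, the words seen right after it."""
    words = text.split()
    followers = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word].append(next_word)
    return followers


def continue_prompt(followers, prompt, max_new_words=10, seed=0):
    """Extend a prompt one word at a time by sampling from observed followers."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = prompt.split()
    if not output:
        return ""
    for _ in range(max_new_words):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last word never appeared with a follower in training
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical miniature corpus; a real system reads vastly more text.
corpus = (
    "the model reads text from the web . "
    "the model learns patterns in the text . "
    "the text comes from the web ."
)
model = train_bigram_model(corpus)
print(continue_prompt(model, "the model", max_new_words=8))
```

No one tells this program what a noun or a sentence is; any structure in its output comes from patterns in the text it was given. With so little data the result is mostly recycled fragments, which is a small-scale version of the "cribbed" output described above.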

Richard Socher, an expert on natural-language processing and the chief scientist at Salesforce, says the OpenAI work is a good example of a more general-purpose language learning system. “I think these general learning systems are the future,” he wrote in an e-mail. 

On the other hand, Socher is less concerned about the potential for deception and misinformation. “You don’t need AI to create fake news,” he says. “People can easily do it :)”