Combating Deepfakes – Inside the Next AI Battleground

By Andy Morris, IBM Cognitive Infrastructure

July 2, 2019



Apocalyptic warnings about the risks of AI are hardly new. Stephen Hawking warned that “the development of full AI could spell the end of the human race,” 1 and Elon Musk has declared unregulated AI more dangerous than nuclear weapons. 2 While computers aren’t out-thinking humans just yet, we’re already confronting the first challenge of the AI age: the rise of “deepfakes,” which exploit AI techniques to undermine trust and threaten our ability to know what is real.


The challenge of deepfakes


The term “deepfake” refers to the use of machine learning algorithms to generate fake images or videos that are difficult to distinguish from reality. Two years ago, the term was largely associated with false likenesses of celebrities. This is bad enough, but in our charged political climate, rife with accusations of “fake news” and social media manipulation, the risks have become more dire. The problem of deepfakes has grown beyond simple face-swapping and video manipulation. Fake images and videos can now be created entirely from scratch with only modest investments in tools and computing capacity.


A bigger problem than just videos


AI-powered systems can also be used to generate fake text, online experiences, and even phone conversations. To raise public awareness of the dangers posed by deepfakes, OpenAI, a non-profit dedicated to safe and beneficial AI, announced GPT-2, an AI model able to write text and articles, in February 2019. 3 Because OpenAI viewed the technology as dangerous, it released only an intentionally hobbled, much smaller version of the model. Prompted with a sentence or two, GPT-2 can expound on any topic, writing paragraphs of text that most readers would assume were written by a human.


The impacts of AI-generated writing are profound. From social media posts to chatbots to e-mail correspondence, it is becoming harder to recognize computer-generated content. By some estimates, up to 30% of product reviews on Amazon are now fake, posing serious brand-related risks to suppliers, retailers, and online providers. 4


AI-generated text, coupled with fake images, video, and voice synthesis, dramatically lowers the cost of spreading mischief. Aside from spreading chaos in our politics, we can expect a new wave of cybersecurity threats such as digital forgery and identity theft enabled by hacked digital voices. We can even expect law enforcement challenges: “proof beyond a reasonable doubt” will become a much higher threshold as the public becomes increasingly mistrustful of all digital media.




A cat and mouse game


To counter threats such as AI-generated text, researchers at the MIT-IBM Watson AI Lab and HarvardNLP introduced the Giant Language model Test Room (GLTR), a tool that uses a language model’s own predictions to help detect fake text generated by models such as OpenAI’s GPT-2. 5
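The intuition behind this kind of detector can be sketched in a few lines of Python. This is not GLTR's code: it is a minimal illustration that assumes a toy bigram word counter as a stand-in for a real language model such as GPT-2, and the function names (train_bigram_model, token_ranks, top_k_fraction) are hypothetical. The idea it shows is that text made almost entirely of a model's top-ranked next-word guesses is more likely to have been sampled from a model than written by a person.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus_tokens):
    """Count which word follows which. A toy stand-in for a real language model."""
    successors = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        successors[prev][nxt] += 1
    return successors


def token_ranks(model, tokens):
    """Rank of each word among the model's guesses for what follows the
    previous word (1 = most likely). Unseen continuations get None."""
    ranks = []
    for prev, nxt in zip(tokens, tokens[1:]):
        ordered = [word for word, _ in model[prev].most_common()]
        ranks.append(ordered.index(nxt) + 1 if nxt in ordered else None)
    return ranks


def top_k_fraction(ranks, k=1):
    """Share of words that were among the model's top-k guesses."""
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Hypothetical reference corpus and two test sentences.
corpus = "the cat sat on the mat the cat ate the fish".split()
model = train_bigram_model(corpus)

predictable = "the cat sat on the cat".split()
surprising = "the fish ate the mat".split()

print(token_ranks(model, predictable))  # [1, 1, 1, 1, 1]
print(token_ranks(model, surprising))   # [3, None, 1, 2]
print(top_k_fraction(token_ranks(model, predictable)))  # 1.0
print(top_k_fraction(token_ranks(model, surprising)))   # 0.25
```

A high top-k fraction is only a hint, not proof. Any threshold for flagging text would have to be tuned on real data, and a human writer using very common phrasing could score high as well.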


We’re in the midst of an escalating battle between AI-based image/video/audio/text generators and AI-based detectors. This cat and mouse game is similar to the offense/defense dynamic that plays out with counterfeit currencies, bacteria and antibiotics, and malicious software vs. anti-virus platforms.


Generative Adversarial Networks (GANs) are a relatively new concept in machine learning. A deep learning model, the generator, is trained to produce fake images that resemble those in a training dataset. The generator is coupled with a discriminator, a separate deep learning model trained to predict whether the images it sees are real or were created by the generator. Together these opposing models form a GAN. They learn from one another, such that both become increasingly sophisticated over time, giving deepfake creators an edge.
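The generator/discriminator loop can be illustrated with a deliberately tiny example. The sketch below is an assumption-laden toy, not a deepfake system: instead of images it works with single numbers, the "real" data is drawn from a normal distribution centered on 4, the generator is a linear function of random noise, and the discriminator is a one-variable logistic model. All class and function names are hypothetical, and the gradients are written out by hand so that no deep learning library is needed.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


class Discriminator:
    """Logistic model: estimated probability that a number x is a real sample."""

    def __init__(self):
        self.w, self.c = 0.0, 0.0

    def prob_real(self, x):
        return sigmoid(self.w * x + self.c)

    def step(self, real_batch, fake_batch, lr):
        """One gradient step on binary cross-entropy (real -> 1, fake -> 0)."""
        gw = gc = 0.0
        for x in real_batch:
            err = self.prob_real(x) - 1.0  # gradient of -log D(x) w.r.t. the logit
            gw += err * x
            gc += err
        for x in fake_batch:
            err = self.prob_real(x)        # gradient of -log(1 - D(x)) w.r.t. the logit
            gw += err * x
            gc += err
        n = len(real_batch) + len(fake_batch)
        self.w -= lr * gw / n
        self.c -= lr * gc / n


class Generator:
    """Maps random noise z to a fake sample a*z + b."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0

    def sample(self, z):
        return self.a * z + self.b

    def step(self, disc, noise_batch, lr):
        """One gradient step on -log D(G(z)): try to make fakes look real."""
        ga = gb = 0.0
        for z in noise_batch:
            d_out = disc.prob_real(self.sample(z))
            grad_x = (d_out - 1.0) * disc.w  # d(-log D(x)) / dx
            ga += grad_x * z
            gb += grad_x
        n = len(noise_batch)
        self.a -= lr * ga / n
        self.b -= lr * gb / n


def train(steps=50, batch=32, lr=0.05, real_mean=4.0, seed=0):
    """Alternate discriminator and generator updates, as in GAN training."""
    rng = random.Random(seed)
    disc, gen = Discriminator(), Generator()
    for _ in range(steps):
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [gen.sample(z) for z in noise]
        disc.step(real, fake, lr)   # discriminator learns to tell real from fake
        gen.step(disc, noise, lr)   # generator learns to fool the discriminator
    return disc, gen


disc, gen = train()
print("generator offset after training:", gen.b)
```

In this toy, the generator's offset starts at 0 and is pushed toward the real data as long as the discriminator can tell the two apart. Simple alternating updates like these are not guaranteed to settle down and can oscillate, which is one reason practical GAN training takes considerable care.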


[Not all GANs are evil – read the IBM Research Blog GANS for Good]


As firms “up their game” detecting fake video and text, attackers need to fool not only humans but AI-based deepfake detection algorithms as well. Businesses guarding against fraud are deploying ensembles of detection algorithms, but if the detectors are known in advance, adversaries can train their models to defeat detection.
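As a rough illustration of what an ensemble means here, the sketch below averages the scores of several independent detectors and flags content when the average crosses a threshold. Everything in it is assumed for illustration: the two "detectors" are trivial placeholder heuristics for review text (word repetition and hype words), not real deepfake detectors, and the names and the 0.5 threshold are hypothetical.

```python
from statistics import mean
from typing import Callable, List

# A detector takes some text and returns a score in [0, 1]; higher = more suspicious.
Detector = Callable[[str], float]


def repetition_detector(text: str) -> float:
    """Placeholder heuristic: share of words that are repeats."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)


def hype_detector(text: str) -> float:
    """Placeholder heuristic: density of hype words, scaled and capped at 1."""
    hype = {"best", "amazing", "perfect", "incredible"}
    words = [w.strip(".,!?") for w in text.lower().split()]
    if not words:
        return 0.0
    return min(1.0, 3 * sum(w in hype for w in words) / len(words))


def ensemble_score(text: str, detectors: List[Detector]) -> float:
    """Average the scores of all detectors in the ensemble."""
    return mean([detect(text) for detect in detectors])


def is_suspicious(text: str, detectors: List[Detector], threshold: float = 0.5) -> bool:
    return ensemble_score(text, detectors) >= threshold


detectors = [repetition_detector, hype_detector]
print(ensemble_score("best best best best", detectors))             # 0.875
print(is_suspicious("best best best best", detectors))              # True
print(is_suspicious("The package arrived on Tuesday", detectors))   # False
```

Passing the detector list in as a parameter keeps the ensemble's makeup easy to change, which matters given the point above: a fixed, publicly known set of detectors is exactly what an adversary would train against.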


IBM and others are understandably “tight-lipped” about state-of-the-art detectors. Customers can employ solutions, including IBM Watson Studio, to develop high-quality models to guard against a wide variety of deepfakes and cyber threats. IBM Watson Machine Learning Accelerator helps organizations deploy, manage, and maintain these models in production.


Human limitations throw fuel on the fire


Distinguishing between legitimate and fake content is more than just a technology problem. There are deeper issues at play, including human bias and media literacy. As humans, we have “cognitive glitches” that can be exploited by clever adversaries. A good example is confirmation bias, where we tend to seek out and trust information that confirms our prior beliefs. Human neural networks are harder to ‘retrain’ with contrary ideas than their silicon counterparts. Even the best fake detector can’t necessarily overrule an audience predisposed to believe misinformation.


The quality and veracity of training data is another issue in combating AI-driven misinformation. In several well-publicized examples, facial recognition systems used by police have proven controversial by disproportionately misidentifying faces of minorities, a consequence of training datasets in which white males are over-represented. 6 IBM is active here as well, providing the Diversity in Faces (DiF) dataset to help organizations develop facial recognition that is fairer and free of gender and racial bias.


Pure research into understanding context, and into how we as humans think, will help us build more capable algorithms over time. Efforts such as IBM’s Project Debater, a follow-on to earlier efforts such as IBM Deep Blue and IBM Watson (the reigning Jeopardy champ), will provide new insights into how humans think and process information, and new tools to help combat deepfakes and fraud.




1.     Stephen Hawking on Artificial Intelligence – BBC, 2014:  

2.     Elon Musk, SXSW, March 2018 –

3.     Source-code available for GPT-2 illustrates how easy it is to generate fake text –

4.     CBS News, Feb 2019 – Buyer Beware – the scourge of fake reviews hitting Amazon, Walmart and other major retailers – 

5.     Catching a Unicorn with GLTR: a tool to detect computer generated text –

6.     The problem with AI?  Study says it’s too white and male, calls for more women, minorities –
