The never-ending race to outsmart AI-generated media
It is highly probable that, while browsing the Internet, every one of us has at some point stumbled upon a deepfake video. Deepfakes usually depict well-known people doing highly improbable things – like the Queen of England dancing on her table, or Ron Swanson from Parks and Recreation starring as every single character in Full House. These two examples of AI-generated, at times highly realistic-looking videos are easy to spot, and they were never meant to be taken seriously in the first place. But the technology to produce such footage is already in wide use, and anyone with enough interest and time on their hands can try to create one. This is where the topic gets serious and potentially dangerous. Until recently, it was fairly easy to spot an AI-crafted video by being on the lookout for one of the following dead giveaways:
- lighting does not match the setting;
- audio is out of sync;
- blurry patches, mainly around the neck and hairline;
- patches of skin not matching the rest of the subject’s skin color.
As AI models advance, however, these little glitches will no longer help us tell the real deal from a fake. But first, let’s find out how those videos are actually created.
How are deepfakes made?
Not long ago, we discussed the role of generative adversarial networks (GANs) in the creation of fake imagery. In the case of deepfake videos, an artificial neural network (ANN) called an autoencoder first analyses videos and photos of the subject from different angles and isolates the essential features it discovers. Starting from these features, the ANN is able to generate new images of the subject. But since we want to swap the subject’s face with another, a second decoder network is used – one trained on samples of the person whose face we want to insert. This second network then reconstructs that person’s face while mimicking the expressions and speech patterns extracted from the original footage. Afterwards, a GAN seeks out flaws and polishes the results into near perfection.
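The shared-encoder, per-subject-decoder idea can be sketched in a few lines of NumPy. This is a toy illustration only – the weights are random and untrained, and every dimension is made up – but it shows the data flow of a face swap: encode subject A’s face with the shared encoder, then decode it with subject B’s decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a flattened 8x8 grayscale face patch,
# compressed to a 16-dimensional latent code.
INPUT_DIM, LATENT_DIM = 64, 16

# One shared encoder learns identity-independent features
# (pose, expression, lighting)...
W_enc = rng.standard_normal((LATENT_DIM, INPUT_DIM)) * 0.1
# ...while each subject gets their own decoder.
W_dec_a = rng.standard_normal((INPUT_DIM, LATENT_DIM)) * 0.1
W_dec_b = rng.standard_normal((INPUT_DIM, LATENT_DIM)) * 0.1

def encode(face):
    return np.tanh(W_enc @ face)

def decode(latent, W_dec):
    return W_dec @ latent

# The swap itself: encode subject A's frame with the shared encoder,
# then reconstruct it through subject B's decoder, yielding B's face
# with A's pose and expression.
face_a = rng.standard_normal(INPUT_DIM)
swapped = decode(encode(face_a), W_dec_b)
print(swapped.shape)  # (64,)
```

In a real pipeline the two decoders are trained jointly against the shared encoder on thousands of frames of each subject, which is why gathering source footage is the first step of deepfake creation.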
And here lies the problem of deepfake detection – since deepfakes are created using adversarial training, the algorithm creating the fakes gets better every time it is confronted with a new detection system. It is a race that cannot be won, because adversarial networks are designed to keep improving each other.
Misuse of deepfakes and emerging problems
As with every invention, the generation of artificial images or speech can be a double-edged sword. Machine learning is getting steadily better at everything it does, and although telling the real deal from an AI’s work can currently be quite easy, GANs keep improving, and it is only a question of time until there is no way to tell the two apart just by looking or listening. We are talking about audio and video recordings that look perfectly genuine but are not.
There have already been reported cases of fraud in which computer-generated media played a major role. One example involves a company employee who was scammed into wiring a considerable amount of money. He received a call in which what seemed to be his superior instructed him to do so, followed by an email confirming the transaction. Little did he know that the voice he was hearing was not that of his boss, but a very good imitation generated by scammers.
Another example of AI misuse and a growing problem is the creation of authentic looking, but fake pornography, where the victim’s face is used to generate fake nude images. This includes revenge porn as well as fake celebrity porn. The damage it may cause to the victims is obvious.
Moreover, there is the possibility of weaponizing deepfakes on social media to misinform and manipulate viewers. Imagine a viral video of a politician saying things he or she never said, tricking viewers into thinking the footage is real.
Deepfakes also pose a potential threat to identity verification technology, possibly allowing scammers to bypass biometric facial recognition systems.
This is why deepfake detection software has become of great interest.
The Problem With Deepfake Detection Models
AI researchers are doing their best to develop algorithms to spot deepfake videos. But this is a technically demanding and difficult challenge. Some of the interesting forgery detection models include:
- analysis of eye blinking: the generative models responsible for creating the videos need to be fed source data – images of the subject they have to imitate. Early training sets contained few images of people with their eyes closed, leading the models to generate footage in which the subjects’ blinking patterns were unnatural.
- remote heart rate estimation: this detection framework tries to estimate the subject’s heart rate by looking for subtle periodic changes in skin color, confirming the presence of blood flowing under the skin.
- tracking small facial movements unique to each individual: this model relies on isolating the distinctive facial expressions that are unique to each person, and then checks whether these expressions are present in the assessed video of the subject.
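As a toy illustration of the first idea, a detector could flag clips whose blink rate falls outside a plausible human range. The function and thresholds below are hypothetical and purely illustrative – real systems detect individual blinks with eye-landmark tracking – but they capture the statistical reasoning:

```python
def blink_rate_suspicious(blink_times, duration_s,
                          min_rate=0.1, max_rate=0.8):
    """Flag a clip whose blinks-per-second falls outside a plausible
    human range. Adults typically blink roughly 15-20 times per minute
    (about 0.25-0.33 blinks/s); the bounds here are illustrative."""
    rate = len(blink_times) / duration_s
    return rate < min_rate or rate > max_rate

# A 60-second clip with a single detected blink: far below human rates.
print(blink_rate_suspicious([12.0], 60.0))                       # True
# A 60-second clip with 16 evenly spaced blinks: plausible.
print(blink_rate_suspicious([i * 3.7 for i in range(16)], 60.0)) # False
```

As the article notes below, a check this simple is easy for a generator to defeat: once blink statistics are published, the next generation of models can be trained to reproduce them.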
So far, it seems we are on our way to winning the war on deepfakes. But wait, there is a catch. As we said before, the deep networks responsible for generating this fake imagery can themselves be trained to avoid being detected. This leads to a cat-and-mouse situation in which, every time a new detection model is presented, a better-trained deepfake generator follows shortly after. An actual example of this is the model that detected fakes by assessing the subject’s eye-blinking patterns: shortly after the paper describing this detection model was published, the deepfake models corrected this flaw.
The Deepfake Detection Challenge
Until recently, there was a lack of large datasets and benchmarks for training detection models. And we say until recently because, thanks to the Deepfake Detection Challenge (DFDC) organized by Facebook together with other industry leaders and academics, a huge dataset of over 100,000 videos was shared publicly. Thanks to this dataset, DFDC participants could train and test their detection models. More than 2,000 participants submitted over 35,000 models for the competition. The results were announced last year, and the winning model achieved a precision of 65%. This means that of the videos the model flagged as deepfakes, 35% were actually genuine (‘false positive’ errors). Let us be honest, these numbers are not too impressive…
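To make the metric concrete, this is how precision (and its companion metric, recall) is computed. The counts below are hypothetical round numbers chosen only to reproduce a 65% figure, not the actual DFDC confusion matrix:

```python
def precision(tp: int, fp: int) -> float:
    """Of everything flagged as fake, what fraction really was fake?"""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Of all the actual fakes, what fraction did the model catch?"""
    return tp / (tp + fn)

# Hypothetical counts: 650 fakes correctly flagged (true positives)
# against 350 genuine videos wrongly flagged (false positives).
print(precision(650, 350))  # 0.65
```

A model with 65% precision therefore cannot be trusted on its own: roughly one in three of its accusations would hit a genuine video.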
DARPA’s SemaFor Program
DARPA, the US agency famous for developing innovative technologies, decided to jump on the deepfake detection train by launching a program called SemaFor (Semantic Forensics). Its objective is to design a system that can automatically detect all types of manipulated media by combining three different types of algorithms: text analysis, audio analysis and video content analysis. The algorithms will be trained on 250,000 news articles and 250,000 social media posts, including 5,000 fake items.
Microsoft’s Video Authenticator
In September 2020, the tech giant Microsoft released a tool designed to help distinguish fake videos by providing a numeric probability – a confidence score – that the media was manipulated by an AI. The tool will not be released to the public directly, because deepfake creators could potentially use its code to teach their models to evade detection.
Beyond deepfake detection
Every time a new media-manipulation detection method is published, it is only a question of time before it is surpassed by a better, smarter fake-creating algorithm. This is why, in order to lower the risks associated with the spread of forged multimedia, a more holistic approach needs to be taken. The solution seems to lie in a combination of:
- media authentication – using watermarks, digital fingerprints or signatures in the media’s metadata, as well as blockchain technologies;
- media provenance – providing information on the media’s origin and enabling reverse media search.
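The authentication idea can be sketched with Python’s standard library: the publisher attaches a keyed signature of the media bytes, and any later modification invalidates it. The key and byte strings here are placeholders; real schemes use public-key signatures embedded in the media’s metadata rather than a shared secret.

```python
import hashlib
import hmac

# Hypothetical publisher key; a real system would use an asymmetric
# key pair so that anyone can verify without being able to sign.
SECRET_KEY = b"publisher-signing-key"

def sign_media(media_bytes: bytes) -> str:
    """Produce an HMAC-SHA256 signature over the raw media bytes."""
    return hmac.new(SECRET_KEY, media_bytes, hashlib.sha256).hexdigest()

def verify_media(media_bytes: bytes, signature: str) -> bool:
    """Check that the media has not been altered since it was signed."""
    return hmac.compare_digest(sign_media(media_bytes), signature)

original = b"\x00\x01raw-video-frames"
sig = sign_media(original)

print(verify_media(original, sig))          # True: untouched
print(verify_media(original + b"x", sig))   # False: tampered with
```

The appeal of this approach is that it sidesteps the cat-and-mouse game entirely: instead of asking “does this look fake?”, it asks “can this file prove where it came from?”.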
The ability to detect fake multimedia is one of the top challenges we currently face in the world of technology. Ironically, every time a new detection model is published, it leads to an improvement in the fake-generating models, so we can expect to see far more believable and realistic deepfakes in the future. To fight the misuse of such media, additional measures such as media authentication and media provenance need to be adopted.