Deepfake Reality: Promise vs Peril

Published on

April 1, 2024

Authors

Dr. Panagiota Antonakaki

Senior Data Scientist, Deeper Insights

The digital era has ushered in groundbreaking advancements in artificial intelligence, propelling us into an age where reality and digital fabrication blend with startling clarity. Among these innovations, Deepfakes stand out as both a marvel and a menace. These AI-generated illusions have the power to amuse and astonish, offering glimpses into worlds where the impossible becomes possible. Yet, beneath their surface lies a darker potential: the capacity to undermine trust, distort truth, and challenge the very fabric of our shared reality. By exploring the dual nature of Deepfakes, we can navigate the fine line between their benefits and the ethical dilemmas they pose. Our journey through the technical intricacies, societal impacts, and the balancing act of harnessing such technology responsibly will shed light on this modern marvel's potential to reshape our world.

The Technology Behind Deepfakes

What is a Deepfake?

Let’s start at the beginning. Deepfakes are a type of synthetic media created using deep learning techniques, particularly generative adversarial networks (GANs) [1], to manipulate or generate visual and audio content. These sophisticated algorithms allow for the creation of hyper-realistic fake video or audio that can convincingly depict individuals saying or doing things they never said or did.

How are Deepfakes built?

GANs consist of two neural networks: a generator and a discriminator. The generator is tasked with creating synthetic data, such as images or videos, while the discriminator's role is to differentiate between real and fake data. The two networks are trained against each other: the generator tries to produce increasingly realistic outputs that can deceive the discriminator, and the discriminator in turn becomes better at spotting fakes. In the context of Deepfakes, GANs are utilised to generate highly convincing synthetic images or videos of specific individuals. Initially, the GAN is trained on a dataset containing images or videos of the target individual.

Through this training process, the generator learns to produce new images or videos that closely resemble the appearance and mannerisms of the target person. Once the GAN is trained, it can be used to create deepfake content by altering existing images or videos so that the target individual appears to say or do what is shown. This process involves mapping the facial expressions, gestures, and vocal patterns of the target person onto the source content, resulting in a seamless and realistic simulation. Recently, a novel framework has been proposed for generating highly realistic and expressive talking head videos from a single source image [2].
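To make the generator-versus-discriminator loop concrete, here is a minimal toy sketch in plain Python. It is not how production Deepfake models are built: the "data" are single numbers near 4 rather than images, the generator is assumed to be a line G(z) = a*z + b, the discriminator is assumed to be a logistic score D(x) = sigmoid(w*x + c), and every function name and hyperparameter is hypothetical, chosen only for illustration.

```python
import math
import random


def sigmoid(s):
    """Logistic function, with the input clamped to avoid overflow."""
    s = max(-30.0, min(30.0, s))
    return 1.0 / (1.0 + math.exp(-s))


def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and fakes score near 0.
    Both scores must lie strictly between 0 and 1."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when the discriminator scores fakes near 1 (non-saturating form)."""
    return -math.log(d_fake)


def gan_step(gen, disc, x_real, z, lr=0.01):
    """One alternating update on a single real sample and a single noise value.
    gen = (a, b) defines G(z) = a*z + b (toy assumption);
    disc = (w, c) defines D(x) = sigmoid(w*x + c) (toy assumption)."""
    a, b = gen
    w, c = disc
    x_fake = a * z + b

    # Discriminator update: gradient descent on discriminator_loss.
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    w -= lr * (-(1.0 - d_real) * x_real + d_fake * x_fake)
    c -= lr * (-(1.0 - d_real) + d_fake)

    # Generator update: gradient descent on generator_loss, passed back through D.
    d_fake = sigmoid(w * x_fake + c)
    grad_x_fake = -(1.0 - d_fake) * w
    a -= lr * grad_x_fake * z
    b -= lr * grad_x_fake
    return (a, b), (w, c)


def train_toy_gan(steps=500, seed=0):
    """Run the alternating updates on synthetic one-dimensional data."""
    rng = random.Random(seed)
    gen, disc = (1.0, 0.0), (0.1, 0.0)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # "real" data: numbers near 4
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        gen, disc = gan_step(gen, disc, x_real, z)
    return gen, disc


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print("generator:", a, b, "discriminator:", w, c)
```

Each step first nudges the discriminator to score the real sample higher than the fake one, then nudges the generator so that its output scores higher. Even in this toy setting the two players can chase each other rather than settle, which hints at why real GAN training needs careful tuning.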

The Dark Side of Deepfakes

The widespread use of deepfakes presents significant threats to society, governments, and political life. They can be used to spread disinformation, manipulate public opinion, and even affect global stock markets or incite terrorist actions.

Threats to Society and Governments

In 2017, a researcher at the University of Washington circulated a fabricated video featuring Barack Obama, sparking concerns among tech experts about a potential crisis if generative AI remained unregulated [3]. Access to generative AI has since expanded significantly, with sophisticated image generators now available to anyone with an email account.

Malicious examples have included mimicking a manager’s instructions to employees, sending fake distress messages to a person's family, and distributing fabricated, embarrassing images of individuals (e.g. Taylor Swift, Tom Hanks and many more).

The extensive use of Deepfakes might result in the replacement of non-celebrity actors, potentially causing job displacement or altering the industry's approach to talent recruitment, with the emergence of Deepfake actors. One example is the face-swapped Tom Cruise videos that have gone viral online.

Naturally, versatile technology like this also carries the risk of misuse, such as fraudulent impersonation for financial gain or illicit access to voice-gated systems.

Why are Deepfakes not ruling the world? It’s not easy

While convincing celebrity Deepfakes might suggest that creating life-like Deepfakes is straightforward, research from RAND [4] indicates otherwise. Viral Deepfake videos of Hollywood quality often demand extensive resources, such as months of training, costly graphics processing units, and skilled actors to mimic mannerisms accurately. Given the effort and expense involved, those disseminating misinformation may find it simpler to rely on traditional media formats. Misinformation can take various cheaper forms, including memes, narratives and out-of-context videos.

Deepfakes often struggle to replicate the subtle details and nuances unique to individual faces, leading to inconsistencies and artefacts in the generated content. Humans are also known to draw on the context of a video or image when judging whether it is fake. However, with ongoing advancements in machine learning algorithms and computational power, these limitations are expected to diminish over time.

Deepfakes for Good

While Deepfakes raise ethical concerns, they also offer advantages in entertainment, gaming, and education. Filmmakers can realistically depict actors who are unavailable or deceased, enhancing storytelling. In gaming, immersive experiences are possible by incorporating deepfake technology.

Additionally, deepfakes can facilitate historical education and improve mental health interventions through virtual reality simulations.

Revolutionising Gaming

Many games are built on the fundamental idea of players stepping into the shoes of a character, like Luke Skywalker. However, a heightened level of immersion would involve more than just controlling Luke with a gamepad; it would entail the avatar mirroring your facial expressions and mouth movements. Deepfakes also have the potential to alleviate some of the stress associated with certain aspects of content creation. By streamlining processes such as character animation and casting, Deepfake technology can reduce the time and resources required for certain tasks, potentially reducing workload pressures and deadlines.

Additionally, the ability to generate realistic animations and visual effects more efficiently may allow developers and filmmakers to focus more on creative aspects of their work, rather than labour-intensive technical tasks.

Hollywood Reimagined

In the HBO documentary “Welcome to Chechnya” [5], which portrays the struggles of LGBTQ+ activists living in secrecy under the threat of execution, deepfake technology was employed to disguise the faces of the people filmed and protect their identities. Additionally, Deepfakes have been utilised to generate customised voices for the millions of individuals dependent on synthetic speech for communication.

Another example is David Beckham appearing to speak nine languages in a campaign for Malaria No More, where deepfake voice technology was used for good. In the film industry, the technology can bring deceased actors back to the screen (the brief appearance of Peter Cushing in the 2016 Star Wars film Rogue One, and the announced casting of James Dean in the film Finding Jack), or even age or de-age living actors depending on the requirements of the movie.

Transforming Education

The use of Deepfake technology in the Dali Museum provides an innovative approach to history education and museum experiences [6]. By using this technology to create realistic simulations of historical figures like Salvador Dali, museums can offer immersive and engaging educational experiences for visitors. This technology allows for a more dynamic and interactive exploration of history, offering visitors a unique opportunity to engage with historical figures in a way that traditional exhibits cannot replicate.

In addition to enhancing museum experiences, the use of Deepfake technology in education has the potential to revolutionise teaching and learning, offering new opportunities for immersive and interactive educational experiences both inside and outside the classroom. For instance, educators can utilise Deepfake technology to bring historical figures, literary characters, or scientific concepts to life in the classroom, making abstract or complex topics more tangible and relatable for students. This interactive approach to education can foster greater student engagement and enable personalised learning experiences tailored to the needs and interests of individual students, promoting a more inclusive and effective educational environment.

Deepfakes and Mental Health

Deepfake technology has shown promising applications in the realm of mental health and pain management, as illustrated by an emotional example from a 2020 South Korean documentary. In this documentary, a grieving mother was able to find solace and closure by virtually reuniting with her deceased daughter through a Deepfake simulation. The mother could hold a heartfelt conversation with her daughter, which provided a powerful source of comfort and healing, and she ultimately expressed nothing but gratitude for the unorthodox project.

This innovative use of deepfakes in conjunction with Virtual Reality (VR) illustrates their potential to offer therapeutic benefits for individuals coping with grief, trauma, or chronic pain. Beyond this specific example, deepfake simulations hold promise for providing personalised and immersive interventions in mental health care, offering unique opportunities for therapeutic expression, emotional support, and pain management. As research in this area continues to evolve, deepfake technology may become increasingly integrated into mental health interventions, contributing to improved well-being and quality of life for individuals facing various challenges.

Detecting and Dismantling Deepfakes

Detecting and mitigating the spread of deepfakes has become a pressing concern for stakeholders such as tech companies, media organisations and policymakers. However, the task of detecting Deepfakes poses several challenges.

Navigating the Deepfake Arms Race

One of the primary challenges in Deepfake detection lies in the rapid evolution and sophistication of Deepfake technology itself. Moreover, the accessibility of deepfake tools and software means that virtually anyone with basic technical knowledge can create convincing deepfake videos, further complicating detection efforts. Furthermore, Deepfake creators often employ sophisticated techniques to evade detection, such as manipulating metadata or using adversarial attacks to fool detection algorithms. These evasion tactics require detection systems to continually adapt and evolve to stay ahead of emerging threats.

Detecting the Unreal through Life's Signs

Despite these challenges, significant progress has been made in deepfake detection through the development of advanced detection algorithms and machine learning techniques. These detection systems use a variety of approaches, which fall broadly (though not exclusively) into two groups: image/audio analysis and biological analysis. The first group analyses facial and audio cues for inconsistencies and uses deep learning to detect anomalies in video frames. The second group analyses biological signals such as eye blinking [7], lip movement [8] and heartbeat [9].
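As a rough illustration of the biological-signal idea, the sketch below counts blinks in a clip and flags clips whose blink rate looks implausibly low. It is a simple rule-based stand-in, not the method of [7]: it assumes that some upstream face tracker (not shown) has already produced a per-frame "eye openness" score, and the function names and thresholds are hypothetical placeholders rather than validated values.

```python
def count_blinks(openness, closed_below=0.2, min_closed_frames=2):
    """Count blinks in a sequence of per-frame eye-openness scores.
    A blink is a run of at least `min_closed_frames` consecutive frames
    whose score is below `closed_below` (both thresholds are assumptions)."""
    blinks = 0
    run = 0
    for value in openness:
        if value < closed_below:
            run += 1
        else:
            if run >= min_closed_frames:
                blinks += 1
            run = 0
    if run >= min_closed_frames:  # clip ended while the eyes were closed
        blinks += 1
    return blinks


def blinks_per_minute(openness, fps):
    """Blink rate of the clip, given its frame rate."""
    if not openness or fps <= 0:
        raise ValueError("need a non-empty clip and a positive frame rate")
    minutes = len(openness) / fps / 60.0
    return count_blinks(openness) / minutes


def looks_suspicious(openness, fps, min_rate=5.0):
    """Flag clips whose blink rate is below an illustrative minimum."""
    return blinks_per_minute(openness, fps) < min_rate


if __name__ == "__main__":
    # One-second clip at 30 fps with one real blink and one single-frame dip.
    clip = [0.3] * 10 + [0.1] * 3 + [0.3] * 10 + [0.1] + [0.3] * 6
    print(count_blinks(clip), looks_suspicious(clip, fps=30))
```

A low blink rate on its own proves nothing, since short clips and individual differences easily produce false alarms. A check like this would be one weak signal among many.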

Frontiers in Deepfake Detection

To name a few available detection systems, Sentinel AI [10] utilises machine learning algorithms to examine biological signals, facial recognition data, and audio data for the identification of deepfake manipulation. By leveraging deep learning architecture and convolutional neural networks, it effectively detects deepfake images and videos. Sensity [11] employs fine-tuned algorithms to pinpoint distinct artefacts and high-frequency signals inherent in AI-generated images, distinguishing them from natural photographs.

Additionally, the Phoneme-Viseme Mismatch Detector developed by Stanford University [8] takes advantage of inconsistencies between mouth shapes (visemes) and spoken phonemes, which are often present in deepfakes, to improve detection accuracy.
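The phoneme-viseme idea can be sketched with a simple rule: sounds such as M, B and P require the lips to close, so a frame in which one of them is spoken while the mouth is visibly open is a mismatch. The code below is an illustrative simplification, not the detector from [8]. It assumes that speech alignment and mouth tracking (neither shown) have already labelled each frame with a phoneme and a mouth-openness score, and the names and threshold are hypothetical.

```python
# Phonemes assumed, for this sketch, to require fully closed lips.
CLOSED_MOUTH_PHONEMES = {"M", "B", "P"}


def mismatch_rate(frames, closed_below=0.1):
    """Fraction of closed-mouth phonemes spoken with an open mouth.
    `frames` is an iterable of (phoneme, mouth_openness) pairs;
    `closed_below` is an illustrative threshold, not a validated value."""
    checked = 0
    mismatches = 0
    for phoneme, openness in frames:
        if phoneme.upper() in CLOSED_MOUTH_PHONEMES:
            checked += 1
            if openness >= closed_below:
                mismatches += 1
    return mismatches / checked if checked else 0.0


if __name__ == "__main__":
    frames = [("M", 0.02), ("AA", 0.6), ("B", 0.4), ("P", 0.05), ("IY", 0.3)]
    print(mismatch_rate(frames))  # one of the three M/B/P frames is open
```

A high mismatch rate suggests that the audio and the mouth movements were not produced together. A rate of zero says little, because a careful forgery may get these particular sounds right.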

Empowering Minds Against Deepfakes

While deepfake detection technology continues to improve, it is essential to recognise that detection alone may not be sufficient to address the broader challenges posed by deepfakes. Education, media literacy, and responsible online behaviour are equally critical in combating the spread of misinformation and disinformation. As shown in a study of political speeches [12], people are very good at putting information in context, and they rely more on how something is said and on the visual cues they see than on the words alone. This means that people are less likely to believe a video in which a well-known figure makes claims that are out of character, and that they intuitively pick up on visual cues that help them detect that a video is fake.

Final Thoughts

Like any emerging technology, Deepfakes carry the dual potential to benefit or harm society. While they are frequently portrayed as a significant threat, their ability to contribute positively should not be overlooked. This technology has been unleashed upon the world, and we are gradually beginning to witness its effects. As Deepfake technology continues to evolve and improve, it demands from us an unprecedented level of vigilance. Embracing a culture of awareness and accountability is essential to managing the challenges Deepfakes present. Our journey forward is not solely about reducing risks. It's about guiding technological advancement in a manner that enhances our community, honours our rights, and safeguards the core of our collective reality.