Credit: AI Demand
Remember that selfie you posted last week? Right now, nothing stops someone from taking it and manipulating it with powerful generative AI systems. Even more unsettling, the technology has become so advanced that proving the resulting image is fake may be impossible.
Researchers at the Massachusetts Institute of Technology (MIT) have built a new tool, PhotoGuard, that could help solve this problem.
PhotoGuard works like a protective shield, making imperceptible changes to a photo that block attempts to manipulate it. If an image “immunized” by PhotoGuard is edited with an app built on a generative AI model such as Stable Diffusion, the result looks unrealistic or distorted.
“Anyone can appropriate our images, alter them at will, portray us in derogatory scenarios, and even resort to blackmail,” says Hadi Salman, a doctoral researcher at MIT who worked on the study. The research was recently presented at the International Conference on Machine Learning.
Salman describes PhotoGuard as an attempt to tackle the malicious image manipulation these models make possible. The tool could, for example, help keep women’s selfies from being turned into non-consensual deepfake pornography.
The need for ways to detect and prevent AI-driven manipulation has never been greater, because generative AI tools have made such tampering faster and easier than ever. In a voluntary pledge coordinated with the White House, leading AI companies including OpenAI, Google, and Meta have committed to developing methods to combat fraud and deception. PhotoGuard complements watermarking: it aims to stop people from using AI tools to tamper with images in the first place, whereas watermarking uses similarly invisible signals to flag AI-generated content after it has been created.
The MIT team used Stable Diffusion, an open-source image generation model, to demonstrate two techniques for blocking manipulation.
The first, called an encoder attack, has PhotoGuard add imperceptible signals to an image so that the AI model misinterprets it. For example, the signals can cause the model to register a photo of Trevor Noah as nothing but a grey expanse, so any attempt to use Stable Diffusion to edit Noah into other scenes produces unconvincing results.
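For readers who want a concrete sense of what an encoder attack involves, here is a minimal sketch in PyTorch using Hugging Face’s diffusers library. It illustrates the general idea rather than PhotoGuard’s actual code: a tiny, bounded perturbation is optimized so that Stable Diffusion’s image encoder maps the photo to the latent representation of a blank grey image. The model name, step count, and perturbation budget are illustrative assumptions.

```python
# Minimal sketch of an encoder-style attack (not PhotoGuard's actual code).
# Assumes PyTorch and Hugging Face `diffusers`; model name and hyperparameters
# are illustrative. A small L-infinity-bounded perturbation steers the VAE
# encoder toward the latent of a flat grey image.
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
).eval()
vae.requires_grad_(False)  # gradients flow to the image, not the model weights

def immunize_encoder(image, eps=0.03, steps=200, lr=0.01):
    """image: float tensor in [-1, 1] with shape (1, 3, 512, 512)."""
    with torch.no_grad():
        # Target latent: what the encoder produces for a blank grey image.
        target = vae.encode(torch.zeros_like(image)).latent_dist.mean

    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        latent = vae.encode((image + delta).clamp(-1, 1)).latent_dist.mean
        loss = torch.nn.functional.mse_loss(latent, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)  # keep the change imperceptible
    return (image + delta.detach()).clamp(-1, 1)
```

In principle, editing the returned image with a Stable Diffusion-based tool would then tend to produce washed-out, grey results rather than a convincing edit.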
The second technique, a diffusion attack, is more effective. It disrupts how the AI model generates images by encoding hidden signals that change the way the model processes the picture. By adding these signals to an image of Trevor Noah, the team forced the diffusion model to ignore its prompt and produce the target image the researchers had chosen, so any AI-edited picture of Noah came out as nothing but shades of grey.
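The diffusion attack is harder to sketch because it optimizes through the model’s entire editing process. The simplified example below, again a hedged illustration using diffusers rather than PhotoGuard’s code, shows the general pattern under several assumptions: a stripped-down, differentiable img2img loop with only a few denoising steps, an empty prompt, and an objective that pushes the edited output toward a flat grey target. The real attack differs in details and is far more computationally demanding.

```python
# Heavily simplified sketch of a diffusion-style attack (not PhotoGuard's actual
# code). Assumes PyTorch and Hugging Face `diffusers`; model name, step counts,
# and budget are illustrative, and backpropagating through even a few UNet steps
# is very memory-hungry.
import torch
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
vae, unet, scheduler = pipe.vae, pipe.unet, pipe.scheduler
for module in (vae, unet, pipe.text_encoder):
    module.requires_grad_(False)  # only the image perturbation is optimized

# Empty-prompt embedding keeps the sketch short; a real attack would consider
# actual editing prompts.
tokens = pipe.tokenizer("", padding="max_length",
                        max_length=pipe.tokenizer.model_max_length,
                        return_tensors="pt")
prompt_embeds = pipe.text_encoder(tokens.input_ids)[0]

def differentiable_edit(image, num_steps=4):
    """Stripped-down img2img loop that keeps gradients (no torch.no_grad)."""
    scheduler.set_timesteps(num_steps)
    latents = vae.encode(image).latent_dist.mean * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    latents = scheduler.add_noise(latents, noise, scheduler.timesteps[:1])
    for t in scheduler.timesteps:
        model_in = scheduler.scale_model_input(latents, t)
        noise_pred = unet(model_in, t, encoder_hidden_states=prompt_embeds).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return vae.decode(latents / vae.config.scaling_factor).sample

def immunize_diffusion(image, eps=0.03, steps=50, lr=0.01):
    """image: float tensor in [-1, 1] with shape (1, 3, 512, 512)."""
    grey_target = torch.zeros_like(image)  # push any edit toward flat grey
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        edited = differentiable_edit((image + delta).clamp(-1, 1))
        loss = torch.nn.functional.mse_loss(edited, grey_target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)  # keep the perturbation imperceptible
    return (image + delta.detach()).clamp(-1, 1)
```

Because this version optimizes the final edited output rather than just the encoder’s representation, it targets the whole generation pipeline, which is what makes the diffusion attack the stronger of the two approaches.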
Ben Zhao, a computer science professor at the University of Chicago who developed a similar method, called Glaze, to protect artists’ work from being used by AI, praises the research as a good combination of a tangible need and what can be done right now.
Emily Wenger, a research scientist at Meta who worked on Glaze and has developed techniques to thwart facial recognition, says tools like PhotoGuard change the economics for attackers, making it harder to use AI maliciously.
According to Wenger, “the higher the threshold, the fewer the individuals able or willing to surpass it.”
The hard part, Zhao says, is working out how well the technique transfers to other existing models. The researchers have released an online demo that lets people immunize their own photos, but so far it works reliably only on Stable Diffusion.
PhotoGuard may make it harder to tamper with new images, but it is not a complete defense against deepfakes: users’ older photos may still be available for misuse, and there are other ways to produce deepfakes, says Valeriia Cherepanova, a doctoral student at the University of Maryland who has developed methods for protecting social media users from facial recognition.
In theory, people could apply this protective shield to their images before uploading them, says Aleksander Madry, a professor at MIT who worked on the research. A more effective approach, however, would be for technology companies to add it automatically to the photos users upload to their platforms.
It is an arms race, though. Even as tech companies commit to better protections, they keep building ever more capable AI models that could override any new safeguards.
Ideally, Salman says, the companies that develop AI models would also provide a way for people to immunize their images so that the protection keeps working against every updated version of those models.
Protecting images from AI manipulation at the source is a far more viable strategy than relying on unreliable methods to detect AI tampering after the fact, says Henry Ajder, an expert on generative AI and deepfakes.
Any social media platform or AI company, Ajder stresses, should be thinking about protecting users from non-consensual pornography and from having their faces copied to create defamatory images.

