
AI-enabled deception now permeates our online lives. There are the high-profile cases you may easily spot, like when White House officials recently shared a manipulated image of a protester in Minnesota and then mocked those asking about it. Other times, it slips quietly into social media feeds and racks up views, like the videos that Russian influence campaigns are currently spreading to discourage Ukrainians from enlisting.
Into this mess, Microsoft has put forward a blueprint, shared with MIT Technology Review, for how to prove what’s real online.
An AI safety research team at the company recently evaluated how methods for documenting digital manipulation are faring against today’s most worrying AI developments, like interactive deepfakes and widely accessible hyperrealistic models. It then recommended technical standards that can be adopted by AI companies and social media platforms.
To understand the gold standard that Microsoft is pushing, imagine you have a Rembrandt painting and you are trying to document its authenticity. You might describe its provenance with a detailed manifest of where the painting came from and all the times it changed hands. You might apply a watermark that would be invisible to humans but readable by a machine. And you could digitally scan the painting and generate a mathematical signature, like a fingerprint, based on the brush strokes. If you showed the piece at a museum, a skeptical visitor could then examine these proofs to verify that it’s an original.
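To make that analogy concrete, here is a minimal sketch, in Python, of how those three kinds of proof might be represented and checked for a digital file. It uses only the standard library, and every name in it (the signing key, the manifest class, the verify function) is hypothetical rather than part of any real standard such as C2PA.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field

# Hypothetical key held by the original publisher; real systems use
# public-key certificates rather than a shared secret.
SIGNING_KEY = b"publisher-secret-key"

@dataclass
class ProvenanceManifest:
    """Simplified stand-in for a provenance manifest: where the file came
    from and every edit it has been through."""
    creator: str
    history: list = field(default_factory=list)

    def signature(self) -> str:
        payload = json.dumps(
            {"creator": self.creator, "history": self.history}, sort_keys=True
        ).encode()
        return hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()

def fingerprint(data: bytes) -> str:
    """Stand-in for a content fingerprint. Real systems use perceptual
    hashes that tolerate resizing and compression; a cryptographic hash
    keeps the sketch short."""
    return hashlib.sha256(data).hexdigest()

def verify(data, manifest, claimed_signature, registered_fingerprint, watermark_found):
    """Check each proof independently, the way a platform might before labeling."""
    return {
        "manifest_intact": hmac.compare_digest(manifest.signature(), claimed_signature),
        "fingerprint_match": fingerprint(data) == registered_fingerprint,
        "watermark_present": watermark_found,
    }

# Toy walk-through: register an original, then check it.
original = b"...original pixel data..."
manifest = ProvenanceManifest(creator="newsroom", history=["captured", "cropped"])
checks = verify(original, manifest, manifest.signature(), fingerprint(original),
                watermark_found=True)
print(checks)  # all three checks pass for the untouched file
```

The point of the sketch is that each signal fails differently: stripping metadata breaks the manifest check, re-encoding defeats a naive fingerprint, and a watermark may survive both, which is why the methods are meant to be layered rather than used alone.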
All of these methods are already being used to varying degrees in the effort to vet content online. Microsoft evaluated 60 different combinations of them, modeling how each setup would hold up under different failure scenarios—from metadata being stripped to content being slightly altered or deliberately manipulated. The team then mapped which combinations produce sound results that platforms can confidently show to people online, and which ones are so unreliable that they may cause more confusion than clarification.
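As a loose illustration of what that kind of mapping involves (not the study’s actual methodology), one can enumerate combinations of signals and ask which ones still give an unambiguous answer after each failure mode knocks a signal out. The signal names and failure modes below are illustrative.

```python
from itertools import combinations

# Hypothetical signals and the failure mode that defeats each one.
SIGNALS = {
    "provenance_manifest": {"metadata_stripped"},
    "invisible_watermark": {"heavy_re_encoding"},
    "content_fingerprint": {"pixels_altered"},
}
FAILURE_MODES = ["metadata_stripped", "heavy_re_encoding", "pixels_altered"]

def surviving(signals, failure):
    """Signals that still work after a given failure mode."""
    return [s for s in signals if failure not in SIGNALS[s]]

# For every combination of signals, check whether at least one survives each failure.
for size in range(1, len(SIGNALS) + 1):
    for combo in combinations(SIGNALS, size):
        robust = all(surviving(combo, f) for f in FAILURE_MODES)
        verdict = "can label confidently" if robust else "risks a wrong or missing label"
        print(f"{' + '.join(combo):60s} -> {verdict}")
```

In this toy model, only the full combination of all three signals holds up against every failure mode; the real evaluation weighs far more configurations and failure scenarios than this.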
The company’s chief scientific officer, Eric Horvitz, says the work was prompted by legislation—like California’s AI Transparency Act, which will take effect in August—and the speed at which AI has developed to combine video and voice with striking fidelity.
“You might call this self-regulation,” Horvitz told MIT Technology Review. But it’s clear he sees pursuing the work as boosting Microsoft’s image: “We’re also trying to be a selected, desired provider to people who want to know what’s going on in the world.”
Nevertheless, Horvitz declined to commit to Microsoft adopting its own recommendations across its platforms. The company sits at the center of a giant AI content ecosystem: It runs Copilot, which can generate images and text; it operates Azure, the cloud service through which customers can access OpenAI and other major AI models; it owns LinkedIn, one of the world’s largest professional platforms; and it holds a significant stake in OpenAI. But when asked about in-house implementation, Horvitz said in a statement, “Product groups and leaders across the company were involved in this study to inform product road maps and infrastructure, and our engineering teams are taking action on the report’s findings.”
It’s important to note that there are inherent limits to these tools; just as they would not tell you what your Rembrandt means, they are not built to determine whether content is accurate. They only reveal whether it has been manipulated. It’s a point that Horvitz says he has to make to lawmakers and others who are skeptical of Big Tech as an arbiter of fact.
“It’s not about making any decisions about what’s true and not true,” he said. “It’s about coming up with labels that just tell folks where stuff came from.”
Hany Farid, a professor at UC Berkeley who specializes in digital forensics but wasn’t involved in the Microsoft research, says that if the industry adopted the company’s blueprint, it would be meaningfully more difficult to deceive the public with manipulated content. Sophisticated individuals or governments can work to bypass such tools, he says, but the new standard could eliminate a significant portion of misleading material.
“I don’t think it solves the problem, but I think it takes a nice big chunk out of it,” he says.
Still, there are reasons to see Microsoft’s approach as an example of somewhat naïve techno-optimism. There is growing evidence that people are swayed by AI-generated content even when they know that it is false. And in a recent study of pro-Russian AI-generated videos about the war in Ukraine, comments pointing out that the videos were made with AI received far less engagement than comments treating them as genuine.
“Are there people who, no matter what you tell them, are going to believe what they believe?” Farid asks. “Yes.” But, he adds, “there are a vast majority of Americans and citizens around the world who I do think want to know the truth.”
That desire has not exactly led to urgent action from tech companies. Google started adding a watermark to content generated by its AI tools in 2023, which Farid says has been helpful in his investigations. Some platforms use C2PA, a provenance standard Microsoft helped launch in 2021. But the full suite of changes that Microsoft suggests, powerful as it is, might remain just a set of suggestions if it threatens the business models of AI companies or social media platforms.
“If the Mark Zuckerbergs and the Elon Musks of the world think that putting ‘AI generated’ labels on something will reduce engagement, then of course they’re incentivized not to do it,” Farid says. Platforms like Meta and Google have already said they’d include labels for AI-generated content, but an audit conducted by Indicator last year found that only 30% of its test posts on Instagram, LinkedIn, Pinterest, TikTok, and YouTube were correctly labeled as AI-generated.
More forceful moves toward content verification might come from the many pieces of AI regulation pending around the world. The European Union’s AI Act, as well as proposed rules in India and elsewhere, would all compel AI companies to require some form of disclosure that a piece of content was generated with AI.
One priority for Microsoft is, unsurprisingly, to play a role in shaping these rules. The company waged a lobbying effort during the drafting of California’s AI Transparency Act, which Horvitz said made the legislation’s requirements on how tech companies must disclose AI-generated content “a bit more realistic.”
But another is a very real concern about what could happen if the rollout of such content-verification technology is done poorly. Lawmakers are demanding tools that can verify what’s real, but the tools are fragile. If labeling systems are rushed out, inconsistently applied, or frequently wrong, people could come to distrust them altogether, and the entire effort would backfire. That’s why the researchers argue that in some cases it may be better to show nothing at all than to display a verdict that could be wrong.
Inadequate tools could also create new avenues for what the researchers call sociotechnical attacks. Imagine that someone takes a real image of a fraught political event and uses an AI tool to change only an inconsequential share of pixels in the image. When it spreads online, it could be misleadingly classified by platforms as AI-manipulated. But combining provenance and watermark tools would mean platforms could clarify that the content was only partially AI-generated, and point out where the changes were made.
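Here is a hedged sketch of that kind of localization, assuming (as the scenario requires) that a platform can retrieve the original pixels referenced in the provenance record. It compares the two versions tile by tile and reports only the regions that differ, rather than branding the whole image as AI-manipulated. The tile size and function names are illustrative, not drawn from any real system.

```python
import hashlib

TILE = 16  # tile width/height in pixels; a hypothetical granularity

def tile_hashes(pixels, width, height):
    """Hash each TILE x TILE block of a grayscale image stored as a flat byte string."""
    hashes = {}
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            block = bytearray()
            for y in range(ty, min(ty + TILE, height)):
                row_start = y * width + tx
                block += pixels[row_start:row_start + min(TILE, width - tx)]
            hashes[(tx, ty)] = hashlib.sha256(bytes(block)).hexdigest()
    return hashes

def changed_regions(original, edited, width, height):
    """Tiles where the edited image no longer matches the original."""
    before = tile_hashes(original, width, height)
    after = tile_hashes(edited, width, height)
    return [pos for pos in before if before[pos] != after[pos]]

# Toy example: a 64x64 image where only one small patch was altered.
w = h = 64
original = bytes(w * h)                 # all-black original
edited = bytearray(original)
for y in range(20, 24):
    for x in range(40, 44):
        edited[y * w + x] = 255         # the "inconsequential" edit
print(changed_regions(original, bytes(edited), w, h))
# -> [(32, 16)]: only the tile containing the edit is flagged, so a label
#    could say the image was partially altered and show where.
```

In practice the comparison would more likely run against a robust fingerprint or a decoded watermark than against raw pixels, but the idea is the same: the label can say what changed and where instead of issuing a blanket verdict.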
California’s AI Transparency Act will be the first major test of these tools in the US, but enforcement could be challenged by President Trump’s executive order from late last year seeking to curtail state AI regulations that are “burdensome” to the industry. The administration has also generally taken a posture against efforts to curb disinformation, and last year, via DOGE, it canceled grants related to misinformation. And, of course, official government channels in the Trump administration have shared content manipulated with AI (MIT Technology Review reported that the Department of Homeland Security, for example, uses video generators from Google and Adobe to make content it shares with the public).
I asked Horvitz whether fake content from this source worries him as much as that coming from the rest of social media. He initially declined to comment, but then he said, “Governments have not been outside the sectors that have been behind various kinds of manipulative disinformation, and this is worldwide.”