Meta is expanding the labelling of AI-generated imagery on its social media platforms — Facebook, Instagram and Threads — to cover some synthetic imagery that’s been created using rivals’ generative AI tools, at least where those rivals are using what it couches as “industry standard indicators” that the content is AI-generated and which Meta is in a position to detect.
The development means the social media giant expects to be labelling more AI-generated imagery circulating on its platforms going forward. However, it’s also not putting figures on any of this — i.e. how much synthetic vs. authentic content is routinely being pushed at users — so how significant a move this is in the fight against AI-fuelled dis- and misinformation (in a major year for elections, globally) is unclear.
Meta says it already detects and labels “photorealistic images” that have been created with its own “Imagine with Meta” generative AI tool, which launched last December. But, until now, it hasn’t been labelling synthetic imagery created using other companies’ tools. So that is the (baby) step it’s announcing today.
“[W]e’ve been working with industry partners to align on common technical standards that signal when a piece of content has been created using AI,” wrote Meta president Nick Clegg in a blog post announcing the expansion of labelling. “Being able to detect these signals will make it possible for us to label AI-generated images that users post to Facebook, Instagram and Threads.”
Per Clegg, Meta will be rolling out the expanded labelling “in the coming months”, and applying labels in “all languages supported by each app”.
A spokesman for Meta couldn’t provide a more specific timeline, nor any details on the order in which markets will get the additional labels, when we asked for more. But Clegg’s post suggests the rollout will be gradual — “through the next year” — and will see Meta focusing on election calendars around the world to inform decisions about when and where to launch the expanded labelling in different markets.
“We’re taking this approach through the next year, during which a number of important elections are taking place around the world,” he wrote. “During this time, we expect to learn much more about how people are creating and sharing AI content, what kind of transparency people find most valuable, and how these technologies evolve. What we learn will inform industry best practices and our own approach going forward.”
Meta’s approach to labelling AI-generated imagery relies upon detection powered by both visible marks that are applied to synthetic images by its generative AI tech and “invisible watermarks” and metadata the tool also embeds within image files. It’s these same kinds of signals, embedded by rivals’ AI image-generating tools, that Meta’s detection tech will be looking for, per Clegg — who notes it’s been working with other AI companies, via forums like the Partnership on AI, with the aim of developing common standards and best practices for identifying generative AI.
His blog post doesn’t spell out the extent of others’ efforts towards this end. But Clegg implies Meta will — in the coming year — be able to detect AI-generated imagery from tools made by Google, OpenAI, Microsoft, Adobe, Midjourney and Shutterstock, as well as its own AI image tools.
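The “industry standard indicators” in question are metadata standards of the kind developed by the C2PA and the IPTC. As a loose, hypothetical sketch — the exact signals and parsing Meta uses aren’t public — a detector could scan an image file’s embedded XMP metadata for the IPTC digital source type value that denotes AI-generated media:

```python
# Illustrative sketch only (not Meta's actual pipeline): the IPTC
# DigitalSourceType value "trainedAlgorithmicMedia" is one published
# indicator that an image was AI-generated. This naive helper just scans
# the raw file bytes for that marker; real systems parse the XMP/C2PA
# metadata properly and also check for invisible watermarks.
AI_MARKER = b"trainedAlgorithmicMedia"

def has_ai_metadata_marker(image_bytes: bytes) -> bool:
    """Return True if the raw bytes contain the IPTC AI-generation marker."""
    return AI_MARKER in image_bytes

# Hypothetical JPEG bytes with an embedded XMP snippet declaring the source type.
fake_xmp = (b"\xff\xd8\xff\xe1<x:xmpmeta>"
            b"<Iptc4xmpExt:DigitalSourceType>"
            b"http://cv.iptc.org/newscodes/digitalsourcetype/trainedAlgorithmicMedia"
            b"</Iptc4xmpExt:DigitalSourceType></x:xmpmeta>")
print(has_ai_metadata_marker(fake_xmp))                       # True
print(has_ai_metadata_marker(b"\xff\xd8ordinary photo"))      # False
```

The obvious caveat, which Clegg himself raises below, is that metadata of this kind is trivially removed by re-encoding the file.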
What about AI-generated video and audio?
When it comes to AI-generated videos and audio, Clegg suggests it’s generally still too challenging to detect these kinds of fakes — because marking and watermarking has yet to be adopted at enough scale for detection tools to do a good job. Moreover, such signals can be stripped out through editing and further media manipulation.
“[I]t’s not yet possible to identify all AI-generated content, and there are ways that people can strip out invisible markers. So we’re pursuing a range of options,” he wrote. “We’re working hard to develop classifiers that can help us to automatically detect AI-generated content, even if the content lacks invisible markers. At the same time, we’re looking for ways to make it more difficult to remove or alter invisible watermarks.
“For example, Meta’s AI Research lab FAIR recently shared research on an invisible watermarking technology we’re developing called Stable Signature. This integrates the watermarking mechanism directly into the image generation process for some types of image generators, which could be valuable for open source models so the watermarking can’t be disabled.”
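Stable Signature’s details are in FAIR’s research; as a much simpler illustration of why conventional post-hoc invisible watermarks are fragile — the weakness Clegg flags — here is a toy least-significant-bit scheme, where embedding, extracting and stripping the mark are all trivial (everything here is illustrative, not Meta’s actual method):

```python
# Toy illustration (NOT Stable Signature): a classic post-hoc invisible
# watermark hides one bit in each pixel's least-significant bit. It is easy
# to embed and extract -- and just as easy for an attacker to strip, which
# is the weakness generator-integrated schemes aim to address.
def embed(pixels, bits):
    """Overwrite each pixel's LSB with a watermark bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract(pixels, n):
    """Read back the first n watermark bits."""
    return [p & 1 for p in pixels[:n]]

def strip(pixels):
    """Trivial 'attack': zero every LSB, destroying the mark."""
    return [p & ~1 for p in pixels]

pixels = [200, 13, 77, 54, 129, 230]
mark = [1, 0, 1, 1, 0, 1]
marked = embed(pixels, mark)
print(extract(marked, len(mark)))         # [1, 0, 1, 1, 0, 1] -- recovered
print(extract(strip(marked), len(mark)))  # [0, 0, 0, 0, 0, 0] -- destroyed
```

By baking the watermark into the image generation process itself, rather than stamping it on afterwards, Stable Signature aims to make the mark much harder to separate from the content.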
Given the gap between what’s technically possible on the AI generation vs. detection side, Meta is changing its policy to require users who post “photorealistic” AI-generated video or “realistic-sounding” audio to tell it that the content is synthetic — and Clegg says it’s reserving the right to label the content if it deems it “particularly high risk of materially deceiving the public on a matter of importance”.
If the user fails to make this manual disclosure they could face penalties under Meta’s existing Community Standards. (So account suspensions, bans etc.)
“Our Community Standards apply to everyone, all around the world and to all types of content, including AI-generated content,” Meta’s spokesman told us when asked what kind of sanctions users who fail to make a disclosure could face.
While Meta is keenly heaping attention on the risks around AI-generated fakes, it’s worth remembering that manipulation of digital media is nothing new and misleading people at scale doesn’t require fancy generative AI tools. Access to a social media account and more basic media editing skills are all it can take to make a fake that goes viral.
On this front, a recent decision by the Oversight Board, a Meta-established content review body — which looked at its decision not to remove an edited video of President Biden with his granddaughter that had been manipulated to falsely suggest inappropriate touching — urged the tech giant to rewrite what it described as “incoherent” policies when it comes to faked videos. The Board specifically called out Meta’s focus on AI-generated content in this context.
“As it stands, the policy makes little sense,” wrote Oversight Board co-chair Michael McConnell. “It bans altered videos that show people saying things they don’t say, but doesn’t prohibit posts depicting a person doing something they didn’t do. It only applies to video created through AI, but lets other fake content off the hook.”
Asked whether, in light of the Board’s review, Meta is looking at expanding its policies to ensure non-AI-related content manipulation risks are not being ignored, its spokesman declined to answer, saying only: “Our response to this decision will be shared on our transparency centre within the 60 day window.”
LLMs as a content moderation tool
Clegg’s blog post also discusses the (so far “limited”) use of generative AI by Meta as a tool for helping it enforce its own policies — and the potential for GenAI to take up more of the slack here, with the Meta president suggesting it may turn to large language models (LLMs) to support its enforcement efforts during moments of “heightened risk”, such as elections.
“While we use AI technology to help enforce our policies, our use of generative AI tools for this purpose has been limited. But we’re optimistic that generative AI could help us take down harmful content faster and more accurately. It could also be useful in enforcing our policies during moments of heightened risk, like elections,” he wrote.
“We’ve started testing Large Language Models (LLMs) by training them on our Community Standards to help determine whether a piece of content violates our policies. These initial tests suggest the LLMs can perform better than existing machine learning models. We’re also using LLMs to remove content from review queues in certain circumstances when we’re highly confident it doesn’t violate our policies. This frees up capacity for our reviewers to focus on content that’s more likely to break our rules.”
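The queue-triage idea Clegg describes can be sketched in a few lines. This is a hedged illustration, not Meta’s implementation: `llm_violation_score` stands in for a real model call (an LLM trained or instructed on the Community Standards), and the threshold is invented for the example.

```python
# Hypothetical sketch of LLM-assisted review-queue triage: items the model
# is highly confident are benign get cleared automatically; everything else
# stays in the queue for human reviewers.
from typing import Callable, List

def triage(queue: List[str],
           llm_violation_score: Callable[[str], float],
           benign_threshold: float = 0.05) -> List[str]:
    """Keep only items the model can't confidently clear for human review."""
    return [item for item in queue
            if llm_violation_score(item) >= benign_threshold]

# Toy stand-in scorer; a real system would call an LLM here.
def toy_score(text: str) -> float:
    return 0.9 if "attack" in text else 0.01

queue = ["nice holiday photo", "coordinated attack plan", "cat meme"]
print(triage(queue, toy_score))  # ['coordinated attack plan']
```

The design trade-off is the one the quote implies: a low benign threshold frees up reviewer capacity, but every auto-cleared item is one no human ever sees, so false negatives are the cost of the efficiency gain.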
So we have Meta experimenting with generative AI as a supplement to its standard AI-powered content moderation efforts, in a bid to reduce the amount of toxic content that gets pumped into the eyeballs and brains of overworked human content reviewers, with all the trauma risks that entails.
AI alone couldn’t fix Meta’s content moderation problem — and whether AI plus GenAI can do it seems doubtful. But it might help the tech giant extract greater efficiencies at a time when the tactic of outsourcing toxic content moderation to low-paid humans is facing legal challenges across multiple markets.
Clegg’s post also notes that AI-generated content on Meta’s platforms is “eligible to be fact-checked by our independent fact-checking partners” — and may, therefore, also be labelled as debunked (i.e. in addition to being labelled as AI-generated; or “Imagined with AI”, as Meta’s current GenAI image labels have it). Which, frankly, sounds increasingly confusing for users trying to navigate the credibility of stuff they see on its social media platforms — where a piece of content may get multiple signposts applied to it, just one label, or none at all.
Clegg also avoids any discussion of the chronic asymmetry between the availability of human fact-checkers — a resource typically provided by non-profit entities that have limited time and money to debunk essentially limitless digital fakes — and all sorts of malicious actors with access to social media platforms, fuelled by myriad incentives and funders, who are able to weaponize increasingly widely available and powerful AI tools (including those Meta itself is building and providing to fuel its content-dependent business) to massively scale disinformation threats.
Without solid data on the prevalence of synthetic vs. authentic content on Meta’s platforms, and without data on how effective its AI fake detection systems actually are, there’s little we can conclude — beyond the obvious: Meta is feeling under pressure to be seen to be doing something in a year when election-related fakes will, undoubtedly, command a lot of publicity.