Cryptography may offer a solution to the massive AI-labeling problem
... There’s a big problem, though: identifying material that was created by artificial intelligence is a massive technical challenge. The best options currently available—detection tools powered by AI, and watermarking—are inconsistent, impermanent, and sometimes inaccurate. (In fact, just this week OpenAI shuttered its own AI-detecting tool because of high error rates.) ...
But another approach has been attracting attention lately: C2PA. Launched two years ago, it’s an open-source internet protocol that relies on cryptography to encode details about the origins of a piece of content, or what technologists refer to as “provenance” information.
The developers of C2PA often compare the protocol to a nutrition label, but one that says where content came from and who—or what—created it. ...
The project, part of the nonprofit Joint Development Foundation, was started by Adobe, Arm, Intel, Microsoft, and Truepic, which formed the Coalition for Content Provenance and Authenticity (from which C2PA gets its name). Over 1,500 companies are now involved in the project through the closely affiliated open-source community, Content Authenticity Initiative (CAI), including ones as varied and prominent as Nikon, the BBC, and Sony. ...
The major media platform Shutterstock has joined as a member and announced its intention to use the protocol to label all its AI-generated content, including its DALL-E-powered AI image generator. ...
What is C2PA and how is it being used?
Microsoft, Intel, Adobe, and other major tech companies started working on C2PA in February 2021, hoping to create a universal internet protocol that would allow content creators to opt in to labeling their visual and audio content with information about where it came from. (At least for the moment, this does not apply to text-based posts.)
Crucially, the project is designed to be adaptable and functional across the internet, and the base computer code is accessible and free to anyone. ...
More specifically, it works by encoding provenance information through a set of hashes that cryptographically bind to each pixel, says Jenks, who also leads Microsoft’s work on C2PA. ...
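The core idea of binding provenance claims to content can be sketched in a few lines. This is a simplified illustration only, not the actual C2PA format: real C2PA manifests use JUMBF containers and X.509 certificate-based signatures, whereas this sketch substitutes a content hash plus an HMAC to show how tampering with even one byte invalidates the record. All function and field names here are hypothetical.

```python
import hashlib
import hmac
import json

def make_manifest(content: bytes, creator: str, tool: str, key: bytes) -> dict:
    """Build a simplified, C2PA-inspired provenance manifest.

    The SHA-256 hash binds the claims to these exact content bytes;
    the HMAC stands in for a real certificate-based signature.
    """
    claims = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "creator": creator,
        "generator_tool": tool,  # e.g. which AI model produced the image
    }
    payload = json.dumps(claims, sort_keys=True).encode()
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "signature": signature}

def verify_manifest(content: bytes, manifest: dict, key: bytes) -> bool:
    """Check the signature, then check the content bytes are unchanged."""
    payload = json.dumps(manifest["claims"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False
    return manifest["claims"]["content_sha256"] == hashlib.sha256(content).hexdigest()
```

Any edit to the content changes its hash, so verification fails, which is what lets a viewer trust that the provenance label still describes the file in front of them.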
C2PA offers some critical benefits over AI detection systems, which use AI to spot AI-generated content; generative models can in turn learn to evade that detection. It’s also a more standardized and, in some instances, more easily viewable system than watermarking, the other prominent technique used to identify AI-generated content. The protocol can work alongside watermarking and AI detection tools as well, says Jenks. ...
That said, provenance information is far from a fix-all solution. C2PA is not legally binding, and without required internet-wide adoption of the standard, unlabeled AI-generated content will exist, says Siwei Lyu, a director of the Center for Information Integrity and professor at the University at Buffalo in New York. “The lack of over-board binding power makes intrinsic loopholes in this effort,” he says, though he emphasizes that the project is nevertheless important. ...
What’s more, since C2PA relies on creators to opt in, the protocol doesn’t really address the problem of bad actors using AI-generated content. And it’s not yet clear how much the availability of provenance metadata will improve the public’s media literacy. Provenance labels do not necessarily say whether the content is true or accurate. ...
Ultimately, the coalition’s most significant challenge may be encouraging widespread adoption across the internet ecosystem, especially by social media platforms. ...