Large language models such as ChatGPT come with filters designed to keep certain information from getting out. A new mathematical argument shows that systems built this way can never be completely safe.
...
The researchers made their argument in a precise and fully general way: they showed that if fewer computational resources are devoted to safety than to capability, then safety failures such as jailbreaks will always exist. “The question from which we started is: ‘Can we align [language models] externally without understanding how they work inside?’” said Greg Gluch, a computer scientist at Berkeley and an author on the time-lock paper. The new result, Gluch said, answers this question with a resounding no.
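In rough symbols (notation chosen here for illustration; the paper's formal statement is far more careful), the claim is that a compute gap between the filter and the model guarantees the existence of a jailbreak:

% Illustrative notation, not the paper's: t_filter and t_model stand for
% the computational budgets of the safety filter and the underlying model.
\[
  t_{\text{filter}} < t_{\text{model}}
  \quad\Longrightarrow\quad
  \exists\, p \;:\; \mathrm{Filter}(p) = \text{``safe''}
  \;\text{ and }\; \mathrm{Model}(p)\ \text{is harmful.}
\]

One natural reading of the paper's title suggests the mechanism: a harmful request can be sealed inside a time-lock puzzle that takes more sequential computation to open than the filter's budget allows, but less than the model's, so the filter sees only noise while the model recovers, and answers, the hidden instruction.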
That means the result holds for any filter-based alignment system, now and for any future technology. No matter what walls you build, it seems, there will always be a way to break through.