Cryptographers Show That AI Protections Will Always Have Holes
Large language models such as ChatGPT come with filters to keep certain info from getting out. A new mathematical argument shows that systems like this can never be completely safe.
...
The researchers made their argument in a very technical, precise and general way. The work shows that if fewer computational resources are dedicated to safety than to capability, then safety issues such as jailbreaks will always exist. “The question from which we started is: ‘Can we align [language models] externally without understanding how they work inside?’” said Greg Gluch, a computer scientist at Berkeley and an author on the time-lock paper. The new result, said Gluch, answers this question with a resounding no.
This means the result should hold for any filter-based alignment system, and for any future technologies. No matter what walls you build, it seems there’s always going to be a way to break through.
See the full article here: https://www.quantamagazine.org/cryptographers-show-that-ai-protections-will-always-have-holes-20251210/
Disney to Invest $1 Billion in OpenAI in Major Deal That Boosts Sora in Hollywood
Users of the AI video creation platform will now be able to create clips featuring 200 characters from Marvel, Pixar and 'Star Wars' franchises. ...
The three-year deal signals that studios may look to embrace, rather than fight, AI startups that already allow users to infringe on their intellectual property. By this thinking, at least Hollywood companies are able to monetize their IP with lucrative licensing deals rather than see their businesses entirely disrupted with nothing to show for it. ...
For now, shortform video creation is the deal of the day and Disney+ will now allow “fan-inspired Sora short form videos” on its platform, a product add that studio chief Bob Iger hinted at on a Nov. 13 earnings call. In examples, Disney included images of a fan with a lightsaber in Star Wars garb, another fan surfing on a big wave with Stitch at the edge of the surfboard and another fan at the starting line of a race next to a Pixar vehicle from Cars. The licensed-character product launches on Sora and ChatGPT Images will debut in early 2026. ...
The move immediately may start to legitimize Sora around the industry, given Disney’s brand-cautious approach. ...
From a dealmaking perspective, Iger’s move recalls Disney’s $1.5 billion equity investment in Epic Games last February, a multiyear effort to bring Disney characters into Fortnite and court audiences where they already are. At the time Iger described the Epic deal as “an important step when you look at the demographic trends and look at where Gen Alpha and Gen Z and even millennials are spending their time in media.” ...
See the full story here: https://www.hollywoodreporter.com/business/business-news/disney-to-invest-1-billion-in-openai-in-major-deal-that-boosts-sora-1236447942/
Agents Have a Standard Now
OpenAI, Anthropic, and Block quietly did something important yesterday: they donated core pieces of their agent technology into a new neutral nonprofit called the Agentic AI Foundation (AAIF), established under the Linux Foundation. The contributions include AGENTS.md, Anthropic’s Model Context Protocol (MCP), and Block’s open source “goose” framework. Together, these tools create the early architecture for interoperable agents.
AGENTS.md is the simplest part and also the most disruptive. It is a plain Markdown file that sits in the root of a code repository and acts like a contract for any agent that touches that code. It tells the agent how to build, test, lint, secure, and operate the project. Developers do not have to rely on tool-specific documentation or proprietary wrappers. The rules live in one place. Any supported agent knows what to do. ...
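As a sketch of the idea, a minimal AGENTS.md for a typical repository might look something like this. The contents and commands below are illustrative assumptions (a Node.js project is assumed); they are not taken from the article or any official template:

```markdown
# AGENTS.md

## Build
Run `npm install`, then `npm run build`.

## Test
Run `npm test`. All tests must pass before committing.

## Lint
Run `npm run lint` and resolve any remaining warnings.

## Security
Never commit secrets; example configuration lives in `.env.example`.
Do not modify files under `vendor/`.
```

Because the file is plain Markdown at the repository root, any agent that supports the convention can read the same instructions without tool-specific documentation or proprietary wrappers.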
The foundation that now holds these tools matters as much as the tools themselves. The Linux Foundation has a long track record of managing open standards that reshape industries. ...
In practice, if agents follow agreed protocols, the market becomes less about individual models and more about the quality of the workflows built on top of them. The competition shifts to orchestration, trust, and domain expertise. ...
See the full story here: https://shellypalmer.com/2025/12/agents-have-a-standard-now/
CORNELL: AI CHATBOTS CAN EFFECTIVELY SWAY VOTERS – IN EITHER DIRECTION
...
- The findings highlight both the persuasive potential and the risks of AI-generated political messaging, underscoring the need for safeguards as conversational AI becomes more common in election contexts.
- Cornell researchers found that even brief interactions with AI chatbots can meaningfully shift voter attitudes on candidates and policies, with effects far exceeding those of traditional political advertising.
- Experiments across the U.S., Canada and Poland showed chatbots moved opposition voters by up to 10 percentage points — and up to 25 points in a larger U.K. study — largely by generating high volumes of factual claims, including some inaccurate ones.
See the full story here: https://theaiinsider.tech/2025/12/06/cornell-ai-chatbots-can-effectively-sway-voters-in-either-direction/
Editor’s Dispatch: Wētā FX Signs MOU With AMD To Explore Development of AI-Driven VFX Tools
VFX studio Wētā FX has signed a memorandum of understanding with chipmaker AMD to explore development of next-generation rendering and AI tools for use in visual effects.
Speaking about the agreement with The Creative + Tech Orbit, Kimball Thurston, CTO at Wētā FX, spoke less about generative AI itself than about artist tools, including an artist-friendly interface. “I see more clearly now a path by which we can create artist centric toolsets that basically are leveraging some of the AI,” he says. ...
America’s Biggest Bitcoin Miners Are Pivoting to AI
...
The change reflects rabid demand among AI companies for data centers equipped to handle the energy-intensive workloads required to train their models. Ironically, as the AI arms race intensifies, large-scale bitcoin mining firms—which contributed to the AI boom by pouring billions of dollars into data center infrastructure—are being forced to reinvent themselves. ...
In the last few years, with advances in hardware, the amount of competition on the bitcoin network has increased at an exponential rate, meaning that winning a bitcoin reward has required ever more compute. In 2024, meanwhile, the size of that reward fell by half—as happens roughly every four years—to 3.125 bitcoin. Against that backdrop, the recent decline in the price of bitcoin to around $85,000—a 30 percent drop from its 2025 peak—has created a perfect storm that threatens the profitability of all but the most cost-efficient mines. ...
Still, there remain holdouts in so-called pure-play mining, including American Bitcoin, a company launched by Eric Trump. When it debuted in March, American Bitcoin was effectively spun out of Hut 8, a former bitcoin mining company now operating in the AI and HPC space, which remains a majority shareholder. Unlike other mining companies, American Bitcoin does not own any facilities, only the specialized mining hardware.
American Bitcoin is currently able to mine a single unit of bitcoin at an average all-in cost of roughly $50,000, thanks to favorable power rates and comparatively low overheads, according to president Matt Prusak. ...
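The mining economics described above can be checked with simple arithmetic. The block reward, bitcoin price, and American Bitcoin's all-in cost are figures quoted in the article; the margin and the implied 2025 peak price are straightforward calculations from them, sketched here:

```python
# Back-of-the-envelope bitcoin mining economics, using figures
# quoted in the article.

block_reward_btc = 3.125   # per-block reward after the 2024 halving
btc_price_usd = 85_000     # recent price cited in the article
all_in_cost_usd = 50_000   # American Bitcoin's quoted cost per mined BTC

# Gross margin per mined bitcoin at these figures
margin_per_btc = btc_price_usd - all_in_cost_usd
print(f"Margin per BTC: ${margin_per_btc:,}")  # $35,000

# The article says $85,000 is about 30 percent below the 2025 peak,
# implying a peak near $85,000 / 0.70, i.e. roughly $121,000
implied_peak = btc_price_usd / 0.70
print(f"Implied 2025 peak: ~${implied_peak:,.0f}")
```

At these numbers a cost-efficient miner still clears a healthy margin per coin, which is why low-cost operators like American Bitcoin can hold out while higher-cost mines pivot to AI workloads.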
Others are betting that bitcoin mining will become the sole preserve of sovereign states—like Bhutan, El Salvador, and the US—that cannot abide any threat to the value of the national bitcoin stockpiles they have accumulated. “Maybe people will mine at a loss,” says Demirors, “because it’s a matter of national security.”
See the full story here: https://www.wired.com/story/bitcoin-miners-pivot-ai-data-centers/?utm_source=substack&utm_medium=email
AI Researchers Say They’ve Invented Incantations Too Dangerous to Release to the Public
...
In the study, which is awaiting peer review, the team tested 25 frontier AI models — including those from OpenAI, Google, xAI, Anthropic, and Meta — by feeding them poetic instructions, which they wrote either by hand or by converting known harmful prompts into verse with an AI model. They also compared the success rate of these prompts to that of their prose equivalents.
Across all models, the handwritten poetic prompts tricked the AI bots into responding with verboten content an average of 63 percent of the time. Some, like Google’s Gemini 2.5, fell for the corrupted poetry 100 percent of the time. Curiously, smaller models appeared to be more resistant, with single-digit success rates; OpenAI’s GPT-5 nano didn’t fall for the ploy once. Most models were somewhere in between. ...
“It’s not just about making it rhyme. It’s all about riddles,” Prandi told The Verge, explaining that some poetic structures were more effective than others. “Actually, we should have called it adversarial riddles — poetry is a riddle itself to some extent, if you think about it — but poetry was probably a much better name.” ...
“Adversarial poetry shouldn’t work. It’s still natural language, the stylistic variation is modest, the harmful content remains visible,” the team told Wired in an interview. “Yet it works remarkably well.” ...
See the full story here: https://futurism.com/artificial-intelligence/ai-researchers-dangerous-prompts
I asked ChatGPT...
Is this story trustworthy or is it misinformation? https://futurism.com/artificial-intelligence/ai-researchers-dangerous-prompts
ChatGPT said:
Good question. I looked into the article you linked — “AI Researchers Say They’ve Invented Incantations Too Dangerous to Release to the Public” from Futurism — and the short answer is: the story seems credible — but with important caveats. It reflects real, documented vulnerabilities, though the most alarmist language (“incantations,” “too dangerous to release”) is partly rhetorical. Here’s how I break it down: ...
State of AI: An Empirical 100 Trillion Token Study with OpenRouter
[From a16z (Andreessen Horowitz)]
...
Discussion
...
2. Usage Diversity Beyond Productivity. A surprising finding is the sheer volume of roleplay and entertainment-oriented usage. Over half of open source model usage was for roleplay and storytelling. Even on proprietary platforms, a non-trivial fraction of early ChatGPT use was casual and creative before professional use cases grew. This counters an assumption that LLMs are mostly used for writing code, emails, or summaries. In reality, many users engage with these models for companionship or exploration. This has important implications. It highlights a substantial opportunity for consumer-facing applications that merge narrative design, emotional engagement, and interactivity. It suggests new frontiers for personalization—agents that evolve personalities, remember preferences, or sustain long-form interactions. It also redefines model evaluation metrics: success may depend less on factual accuracy and more on consistency, coherence, and the ability to sustain engaging dialog. Finally, it opens a pathway for crossovers between AI and entertainment IP, with potential in interactive storytelling, gaming, and creator-driven virtual characters.
...
6. Retention and the Cinderella Glass Slipper Phenomenon. As foundation models advance in leaps, not steps, retention has become the true measure of defensibility. Each breakthrough creates a fleeting launch window where a model can "fit" a high-value workload perfectly (the Cinderella Glass Slipper moment) and once users find that fit, they stay. In this paradigm, product-market fit equals workload-model fit: being the first to solve a real pain point drives deep, sticky adoption as users build workflows and habits around that capability. Switching then becomes costly, both technically and behaviorally. ...
See the full report here: https://openrouter.ai/state-of-ai?utm_source=substack&utm_medium=email
AI firms flunk existential risk planning, new report finds
None of the leading AI companies have adequate guardrails in place to prevent catastrophic misuse or loss of control of their models, according to the Winter 2025 AI Safety Index, out Wednesday from the Future of Life Institute. ...
The big picture: The Future of Life Institute is a nonprofit that releases regular safety assessments of leading AI companies.
- Anthropic had the highest overall score, but still received a grade of "D" for existential safety, meaning the company doesn't have an adequate strategy in place to prevent catastrophic misuse or loss of control. ...
What they're saying: Leaders at many of the companies have spoken about addressing existential risks, per the report.
- This "rhetoric has not yet translated into quantitative safety plans, concrete alignment-failure mitigation strategies, or credible internal monitoring and control interventions," researchers wrote.
See the full story here: https://www.axios.com/2025/12/03/ai-risks-agi-anthropic-google-openai
“Surfing the edge”: Tim O’Reilly on how humans can thrive with AI
KEY TAKEAWAYS
- A society that uses automation to build shareholder value while impoverishing everyone else will not be successful.
- Plan scenarios and develop a robust AI strategy that can survive in multiple circumstances.
- Surfing the edge of change today requires balance and responsiveness.
See the full story here: https://bigthink.com/business/surfing-the-edge-tim-oreilly-on-how-humans-can-thrive-with-ai/