philip lelyveld The world of entertainment technology

11 Apr 2025

MIT study finds that AI doesn’t, in fact, have values

... The co-authors of the MIT study say their work suggests that “aligning” AI systems — that is, ensuring models behave in desirable, dependable ways — could be more challenging than is often assumed.

“One thing that we can be certain about is that models don’t obey [lots of] stability, extrapolability, and steerability assumptions,” Stephen Casper, a doctoral student at MIT and a co-author of the study, told TechCrunch. “It’s perfectly legitimate to point out that a model under certain conditions expresses preferences consistent with a certain set of principles. The problems mostly arise when we try to make claims about the models, opinions, or preferences in general based on narrow experiments.” ...

According to the co-authors, none of the models was consistent in its preferences. Depending on how prompts were worded and framed, they adopted wildly different viewpoints. ...

“A model cannot ‘oppose’ a change in its values, for example — that is us projecting onto a system,” Cook said. “Anyone anthropomorphizing AI systems to this degree is either playing for attention or seriously misunderstanding their relationship with AI … Is an AI system optimizing for its goals, or is it ‘acquiring its own values’? It’s a matter of how you describe it, and how flowery the language you want to use regarding it is.”

See the full story here: https://techcrunch.com/2025/04/09/mit-study-finds-that-ai-doesnt-in-fact-have-values/
