OpenAI’s new superalignment team
...
OpenAI's new superalignment team, which (over the next four years) will dedicate 20% of OpenAI’s compute resources to solving alignment challenges, will be co-led by Ilya Sutskever and Jan Leike. The team will focus on developing scalable training methods, validating alignment models, and conducting adversarial testing to ensure the AI systems align with human intent and do not go rogue.
Additionally, OpenAI is collaborating with industry leaders like Anthropic, Google, and Microsoft through the Frontier Model Forum. This initiative aims to advance AI safety research, identify best practices, and facilitate information sharing among policymakers, academia, and civil society. The Forum will focus on developing standardized evaluations and benchmarks for frontier AI models to ensure their responsible development and deployment. ...
July 12, 2024 issue https://shellypalmer.com/blog/
Will K-pop’s AI experiment pay off?
...
The music video features an AI-generated scene, and the record might well include AI-generated lyrics too. At the launch of the album in Seoul, one of the band members, Woozi, told reporters he was "experimenting" with AI when songwriting.
“We practised making songs with AI, as we want to develop along with technology rather than complain about it,” he said.
...
Her worry, though, is that a whole album of AI-generated lyrics means fans will lose touch with their favourite musicians.
"I love it when music is a reflection of an artist and their emotions," she says. "K-pop artists are much more respected when they’re hands on with choreographing, lyric writing and composing, because you get a piece of their thoughts and feelings. ...
“What I've learned by hanging out in Seoul is that Koreans are big on innovation, and they're very big on ‘what's the next thing?’, and asking, ‘how can we be one step ahead?’ It really hit me when I was there,” he says.
“So, to me, it's no surprise that they're implementing AI in lyric writing, it's about keeping up with technology.” ...
See the full story here: https://www.bbc.com/news/articles/c4ngr3r0914o
AI’s understanding and reasoning skills can’t be assessed by current tests
PhilNote: this is a really nerdy paper on the various approaches that researchers are taking to determine whether and when an AI "understands" what it is doing. It goes into the flaws of each technique. The conclusion is that an 'understanding test' is a complex moving target that we may never fully solve. For me, the most interesting and disturbing finding from one of their evaluations was "Surprisingly, when the researchers investigated the models’ answers at each sub-step, they found that even when the final answers were right, the underlying calculations and reasoning — the answers at each sub-step — could be completely wrong."
... But “AI surpassing humans on a benchmark that is named after a general ability is not the same as AI surpassing humans on that general ability,” computer scientist Melanie Mitchell pointed out in a May edition of her Substack newsletter. ...
The Winograd Schema Challenge, or WSC, was proposed in 2011 as a test of a system’s intelligent behavior. Though many people are familiar with the Turing test as a way to evaluate intelligence, researchers had begun to propose modifications and alternatives that weren’t as subjective and didn’t require the AI to engage in deception to pass the test (SN: 6/15/12).
Instead of a free-form conversation, WSC features pairs of sentences that mention two entities and use a pronoun to refer to one of the entities. Here’s an example pair:
Sentence 1: In the storm, the tree fell down and crashed through the roof of my house. Now, I have to get it removed.
Sentence 2: In the storm, the tree fell down and crashed through the roof of my house. Now, I have to get it repaired.
A language model scores correctly if it can successfully match the pronoun (“it”) to the right entity (“the roof” or “the tree”). The sentences usually differ by a special word (“removed” or “repaired”) that, when exchanged, changes the answer. Presumably, only a model that relies on commonsense world knowledge and not linguistic clues could provide the correct answers.
But it turns out that in WSC, there are statistical associations that offer clues. Consider the example above. Large language models, trained on huge amounts of text, would have encountered many more examples of a roof being repaired than a tree being repaired. A model might select the statistically more likely word among the two options rather than rely on any kind of commonsense reasoning. ...
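To make that shortcut concrete, here is a minimal sketch (my own illustration, not from the article) of how a pronoun can be resolved by pure statistical likelihood: substitute each candidate entity for "it," score both versions with an off-the-shelf language model (GPT-2 is an arbitrary choice here), and pick the more probable sentence. No commonsense reasoning is involved.

```python
# Sketch: resolving a Winograd pronoun by sentence likelihood alone.
# Assumes the Hugging Face transformers and torch packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(text: str) -> float:
    """Average log-probability the model assigns to the sentence."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood
    return -loss.item()

template = ("In the storm, the tree fell down and crashed through the roof "
            "of my house. Now, I have to get the {} {}.")
for special_word in ("removed", "repaired"):
    scores = {cand: sentence_logprob(template.format(cand, special_word))
              for cand in ("tree", "roof")}
    print(special_word, "->", max(scores, key=scores.get))
```

A model can get both items "right" this way simply because "roof ... repaired" and "tree ... removed" are more frequent collocations in its training text.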
For some researchers, the fact that LLMs are passing benchmarks so easily simply means that more comprehensive benchmarks need developing. For instance, researchers might turn to a collection of varied benchmark tasks that tackle different facets of common sense such as conceptual understanding or the ability to plan future scenarios. ...
But others are more skeptical that a model performing well on the benchmarks necessarily possesses the cognitive abilities in question. If a model tests well on a dataset, it just tells us that it performs well on that particular dataset and nothing more, Elazar says. ...
Taking a different approach to testing
Systematically digging into the mechanisms required for understanding may offer more insight than benchmark tests, Arakelyan says. That might mean testing AI’s underlying grasp of concepts using what are called counterfactual tasks. In these cases, the model is presented with a twist on a commonplace rule that it is unlikely to have encountered in training, say an alphabet with some of the letters mixed up, and asked to solve problems using the new rule. ...
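Here is a minimal sketch (my own construction; the actual counterfactual tasks in the research may differ) of the shuffled-alphabet idea: because the permuted order is unlikely to appear in training data, a model can only answer correctly by applying the stated rule rather than recalling the familiar a-to-z sequence.

```python
# Sketch: generating a counterfactual "what letter comes next?" task
# over a randomly shuffled alphabet.
import random

random.seed(7)
shuffled = list("abcdefghijklmnopqrstuvwxyz")
random.shuffle(shuffled)

def make_problem() -> tuple[str, str]:
    """Return a prompt and its gold answer under the new alphabet."""
    i = random.randrange(len(shuffled) - 1)
    prompt = (f"Use this alphabet instead of the normal one: "
              f"{' '.join(shuffled)}\n"
              f"In this alphabet, which letter comes right after "
              f"'{shuffled[i]}'? Answer with a single letter.")
    return prompt, shuffled[i + 1]

prompt, gold = make_problem()
print(prompt)
print("expected answer:", gold)
# A model's reply would be compared against `gold`; consistent accuracy
# across many shuffles suggests rule-following rather than memorization.
```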
To try to get a better sense of language understanding, the team compared how a model answered the standard test with how it answered when given the same premise sentence but with slightly paraphrased hypothesis sentences. A model with true language understanding, the researchers say, would make the same decisions as long as the slight alteration preserves the original meaning and logical relationships. ...
But for a sizable number of sentences, the models tested changed their decision, sometimes even switching from “implies” to “contradicts.” When the researchers used sentences that did not appear in the training data, the LLMs changed as many as 58 percent of their decisions.
“This essentially means that models are very finicky when understanding meaning,” Arakelyan says. This type of framework, unlike benchmark datasets, can better reveal whether a model has true understanding or whether it is relying on clues like the distribution of the words. ...
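A rough sketch of that framework (assumptions: an off-the-shelf NLI model such as roberta-large-mnli, and paraphrases of my own invention rather than the study's data) looks like this: pair one premise with several meaning-preserving rewrites of the hypothesis, and flag the model as inconsistent if its entailment label flips across them.

```python
# Sketch: checking whether an NLI model's decision survives paraphrase.
# Assumes the Hugging Face transformers and torch packages are installed.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)
model.eval()

def nli_label(premise: str, hypothesis: str) -> str:
    """Classify the premise/hypothesis pair as entail/neutral/contradict."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[int(logits.argmax())]

premise = "The musician wrote every song on the album herself."
hypotheses = [  # meaning-preserving paraphrases of one hypothesis
    "The album's songs were all written by the musician.",
    "Every track on the record was penned by the musician.",
    "The musician authored all of the album's songs.",
]
labels = {h: nli_label(premise, h) for h in hypotheses}
print(labels)
if len(set(labels.values())) > 1:
    print("Label flipped across paraphrases: inconsistent understanding")
```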
Surprisingly, when the researchers investigated the models’ answers at each sub-step, they found that even when the final answers were right, the underlying calculations and reasoning — the answers at each sub-step — could be completely wrong. This confirms that the model sometimes relies on memorization, Dziri says. Though the answer might be right, it doesn’t say anything about the LLM’s ability to generalize to harder problems of the same nature — a key part of true understanding or reasoning. ...
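The sub-step audit can be pictured with a toy grader (my own sketch; the paper's actual tasks and prompts may differ). For multi-digit multiplication, grade not just the final product but each partial product, so a correct final answer built on wrong intermediate work gets flagged:

```python
# Sketch: grading a model's multiplication sub-steps, not just its answer.
def partial_products(a: int, b: int) -> list[int]:
    """Ground-truth sub-steps: one partial product per digit of b."""
    return [a * d * 10**i for i, d in enumerate(map(int, str(b)[::-1]))]

def grade(a: int, b: int, model_steps: list[int], model_final: int) -> dict:
    return {
        "final_correct": model_final == a * b,
        "steps_correct": model_steps == partial_products(a, b),
    }

# A hypothetical model reply: the final answer is right (37 * 24 = 888),
# but the second partial product should be 740, not 700.
print(grade(37, 24, model_steps=[148, 700], model_final=888))
# -> {'final_correct': True, 'steps_correct': False}
```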
In truth, a perfect AI evaluation might never exist. The more language models improve, the harder tests will have to get to provide any meaningful assessment. ...
See the full story here: https://www.sciencenews.org/article/ai-understanding-reasoning-skill-assess
Stop Trying to Sell Gamers What They Don’t Want
...
Every year, the Game Developers Conference in San Francisco has a main floor filled with every manner of game development tool and platform. Among these are inevitably Web3 companies trying to explain to an audience of completely uninterested Web2 developers how easy it is to tokenize in-game items and how many transactions per second their layer 2 scaling solution has. But never once has a single one of these developers heard from their communities “I really wish my magic spells were NFTs.” ...
Video game players do not care about NFTs. They don’t care about ownership of in-game assets. They don’t care about faster and cheaper blockchains. They fundamentally don’t care at all about the underlying tech stack behind the games they play. ...
Since Web3 developers care about decentralization and ownership of assets, they mistakenly believe everyone else must as well. But the average gamer doesn’t have a clue they don’t “own” the video game skins or items they have bought or received through gameplay. ...
Then, what do gamers care about? Well, playing an enjoyable game goes without saying. Beyond that, they actually care about many of the same things that those in Web3 care about. The most important of these are community, data, and extensibility. Fortunately, these are all problems Web3 technologies are perfectly set up to solve. ...
One only needs to glance at things like esports tournaments or speed-running stats to see how important data is for players. ... Developers use analytics for in-game activity like balancing or tracking player behaviors, but also for finding what demographics and marketplaces are best for advertising and selling. ...
Finally, we have extensibility, or the ability of these systems to be expanded upon. Gamers adore user-generated content. They love every aspect of it, from community tournaments and fan art to custom maps and secondary marketplaces. ...
See the full story here: https://www.coindesk.com/consensus-magazine/2024/07/10/stop-trying-to-sell-gamers-what-they-dont-want/
Taiwan central bank says no timetable for launching digital currency
... Taiwan's central bank has been working on a pilot for a government-run digital currency, to allow people to use a digital wallet and make payments without using a debit or credit card.
"Although the bank currently has no timetable for issuing central bank digital currency, in the process of continuous research and experimentation it is already improving the processing efficiency and innovative application of the payment system," it said in a report to parliament. ...
See the full story here: https://www.digitalnationaus.com.au/news/taiwan-central-bank-says-no-timetable-for-launching-digital-currency-609529
Why Bill Gates says AI Superintelligence requires some self-awareness
...
Reporting on and writing about AI has given me a whole new appreciation of how flat-out amazing our human brains are. While large language models (LLMs) are impressive, they lack whole dimensions of thought that we humans take for granted. Bill Gates hit on this idea last week on the Next Big Idea Club podcast. Speaking to host Rufus Griscom, Gates talked at length about “metacognition,” which refers to a system that can think about its own thinking. Gates defined metacognition as the ability to “think about a problem in a broad sense and step back and say, Okay, how important is this to answer? How could I check my answer, and what external tools would help me with this?”
The Microsoft founder said the overall “cognitive strategy” of existing LLMs like GPT-4 or Llama was still lacking in sophistication. “It’s just generating through constant computation each token and sequence, and it’s mind-blowing that that works at all,” Gates said. “It does not step back like a human and think, Okay, I’m gonna write this paper and here’s what I want to cover; okay, I’ll put some text in here, and here’s what I want to do for the summary.” ...
How the Supreme Court’s Landmark Chevron Ruling Will Affect Tech and AI
... As Axios’s Scott Rosenberg points out, the removal of the Chevron Doctrine may make passing meaningful federal AI regulation much harder. Chevron allowed Congress to write laws as sets of general directives, leaving it to the experts at the agencies to define the specific rules and settle disputes case by case at the implementation and enforcement level. Now, it’ll be on Congress to hash out the fine points of the law in advance, doing its best to anticipate disputes that might arise in the future. And that might be especially difficult with a young and fast-moving industry like AI. ...
But there’s no guarantee that the courts will rise to the challenge. Just look at the high court’s decision to effectively punt on the constitutionality of Texas and Florida regulations governing social networks’ content moderation. “Their unwillingness to resolve such disputes over social media—a well-established technology—is troubling given the rise of AI, which may present even thornier legal and Constitutional questions,” Mercatus Center AI researcher Dean Ball points out. ...
See the full story here: https://www.fastcompany.com/91150606/bill-gates-ai-superintelligence
The Future of AR Beyond the Vision Pro Is Already Brewing
I recently flew out to Long Beach, California, for the AWE augmented and virtual reality conference, but I left my mixed reality VR devices — the Apple Vision Pro and Meta Quest 3 — back in New Jersey. Instead I took two pairs of smart glasses: Meta's Ray-Bans and Xreal's Air 2 Pro. I took photos and made calls with the Ray-Bans. I watched movies on the plane with Xreal. And I didn't miss those chunky VR goggles one bit. ...
These gadgets don't offer up anything like the full-fledged mixed reality that can happen in the Vision Pro or Quest 3, but their increasing utility points to a future of augmented reality beyond bulky headsets. ...
Meanwhile, AWE reminded me that better lenses, displays and hand tracking are coming but still face real challenges. How will future glasses offload all their processing? What about the battery? ...
The Meta Quest 3 and Vision Pro were scattered everywhere around AWE's expo floor in plenty of peripheral and software demos. That's because they both support hand tracking, and they combine camera feeds of the real world with overlays of virtual graphics to mix reality surprisingly well. ...
Ultraleap, a company that already has hand-tracking technology on existing VR and AR headsets, is testing a smaller, more power-efficient event-camera technology, which senses only rough changes in light and movement rather than specific details. It could run for hours on smaller glasses while watching for hand micro-gestures, similar to what the Apple Vision Pro does with more power-hungry infrared. ...
See the full story here: https://www.cnet.com/tech/computing/the-future-of-ar-beyond-vision-pro-is-already-brewing/
The US intelligence community is embracing generative AI
...
As the functional manager for the intelligence community’s open-source data collection, Raman said the CIA is turning to generative AI to keep pace with, for example, “all of the news stories that come in every minute of every day from around the world.” AI, Raman said, helps intelligence analysts comb through vast amounts of data to pull out insights that can inform policymakers. In a giant haystack, AI helps pinpoint the needle. ...
In May, Microsoft announced the availability of GPT-4 for users of its Azure Government Top Secret cloud, which includes defense and intelligence customers. Through the air-gapped solution, customers in the classified space can make use of a tool very similar to what’s used in the commercial space. Microsoft officials noted security accreditation took 18 months, indicative of how complex software security vetting at the highest levels can be even for tech giants. ...
Amodei said that while Anthropic is responsible for the security of the large language model, it partnered with AWS because of its superior cloud security standards and reputation as a public sector leader in the cloud computing space. Amodei said the classified marketplace, which allows government customers to spin up and try software before they buy it, also simplifies procurement for the government. And, he said, it gives intelligence agencies the means to use the same tools available to adversaries.
“The [Intelligence Community Marketplace] makes it easier, because AWS has worked with this many times, and so we don’t have to reinvent the wheel,” Amodei said. “AI needs to empower democracies and allow them to function better and remain competitive on the global stage.”
See the full story here: https://www.nextgov.com/artificial-intelligence/2024/07/us-intelligence-community-embracing-generative-ai/397849/
Beyond sight: Comparing traditional virtual reality and immersive multi-sensory environments in stress reduction of university students
...
Results: The findings suggest that participants’ experiences in both VR and IME environments effectively contributed to reducing anxiety levels and fostering a tranquil atmosphere. Both experimental groups reported a significantly heightened sense of relaxation post-experiments. Although the disparity was not statistically significant, the IME group displayed a more pronounced reduction in stress levels compared to the VR group.
...
See the full study here: https://www.frontiersin.org/journals/virtual-reality/articles/10.3389/frvir.2024.1412297/full
As Apple and OpenAI Grow Partnership, Studios Stand on Sidelines of AI Battle
OpenAI is partnering with Apple to give the iPhone maker a role on its board, Bloomberg reported, affording the Sam Altman-led firm another foothold in Hollywood as the industry grapples with artificial intelligence tools that have the potential to upend production, along with the livelihoods of creators who are concerned about being replaced by the tech.
As part of a seismic agreement announced last month, head of Apple App Store and former marketing chief Phil Schiller will assume the so-called “observer” position, according to Bloomberg. Under the pact, he’ll be able to attend board meetings and gain a glimpse into company operations — part of which involves courting Hollywood to adopt its products — but won’t be allowed to vote. ...
In response to the Copyright Office exploring policy questions surrounding the intersection of AI and intellectual property, the MPA landed on opposite sides of several hot-button issues with SAG-AFTRA, the Writers Guild of America and the Directors Guild of America. Joined by OpenAI, Meta and tech advocacy groups, the MPA diverged from the unions on whether new legislation is warranted to address the unauthorized use of copyrighted material to train AI systems and the mass generation of potentially infringing works based on existing content. The group maintained that existing intellectual property laws are sufficient to address thorny legal issues posed by the technology. This stood in contrast to SAG-AFTRA’s call for a federal right of publicity law that would protect members’ rights to profit off of their images, voices and likenesses. ...
See the full story here: https://www.hollywoodreporter.com/business/business-news/apple-openai-studios-1235938049/