Stability AI Intros Real-Time Text-to-Image Generation Model
Stability AI, developer of Stable Diffusion (one of the leading visual content generators, alongside Midjourney and DALL-E), has introduced SDXL Turbo — a new AI model that demonstrates more of the latent possibilities of the common diffusion generation approach: images that update in real time as the user’s prompt updates. Real-time generation was always a possibility even with previous diffusion models, but more efficient generation algorithms and the steady accretion of GPUs and TPUs in developers’ data centers now make the experience feel magical.
Write “A cat…” and see a cat. Add “… in a top hat,” and see the hat materialize. Finish the sentence “A catastrophic plane crash” instead and see the image change to that, too.
In the case of SDXL Turbo, the model can produce images in four steps that human evaluators judge as matching a prompt better than images the OpenMUSE model took 26 steps to produce.
About 70 percent of human evaluators rate a one-step image produced by SDXL Turbo as subjectively better than a 16-step image produced by OpenMUSE, and Stability AI’s data shows similar advantages over its own prior models. The clear advantage is time and energy savings: Stability AI claims the model can produce a 512×512 image in just 207ms on an A100 GPU.
According to Stability AI’s CEO Emad Mostaque, the approach carries a penalty, but one with a significant upside: “Less diversity, but way faster & more variants to come which will be… interesting, particularly with upscales & more,” he posted on X. Hardcore prompt warriors may not mind the drop in diversity, given the approach clearly enables a category of applications and workflows previously impossible.
The new model introduces what Stability AI calls ‘distillation’ techniques. The company describes the approach as “Adversarial Diffusion Distillation,” or ADD, noting it shares similarities with GANs (Generative Adversarial Networks), which generate in fewer steps but typically encode less semantic information and are not usually prompted in natural language the way current diffusion generators are.
Stability AI has published the model weights and code on Hugging Face for non-commercial use, and links to a beta tool on Clipdrop so interested users can try the model themselves.
Related:
Stability AI Turbocharges Text-to-Image Generation with SDXL Turbo, VentureBeat, 11/29/23
Stable Diffusion XL Turbo Can Generate AI Images as Fast as You Can Type, Ars Technica, 11/29/23
Stability Introduces GenAI Video Model: Stable Video Diffusion, ETCentric, 11/27/23
See the original story here: https://www.etcentric.org/stability-ai-intros-real-time-text-to-image-generation-model/
I’m watching ‘AI upscaled’ Star Trek and it isn’t terrible
What this means is that if you want to watch DS9 (or Voyager for that matter), you have to watch it more or less at the quality in which it was broadcast back in the ’90s. Like TNG, it was shot on film but converted to video tape at approximately 480p resolution. ...
There’s a lot more detail carried in the image that just isn’t obviously visible — so really, we aren’t adding but recovering it. ...
The version I got clocks in at around 400 megabytes per 45-minute episode, low by most standards, and while there are clearly smoothing problems, badly interpolated details, and other artifacts, it was still worlds ahead of the “official” version. As the quality of the source material improves in later seasons, the upscaling improves as well. Watching it let me enjoy the show without thinking too much about its format limitations; it appeared more or less as I (wrongly) remember it looking. ...
The real question, however, is why Paramount, CBS, and anyone else sitting on properties like DS9 haven’t embraced the potential of intelligent upscaling. It’s gone from highly technical oddity to easily leveraged option, something a handful of smart people could do in a week or two. If some anonymous fan can create the value I experienced with ease (or relative ease — no doubt a fair amount of work went into it), why not professionals? ...
See the full story here: https://techcrunch.com/2023/12/02/ai-upscaled-star-trek-deep-space-9-ds9/

Big Companies Find a Way to Identify A.I. Data They Can Trust
With many AI-related lawsuits hinging on who holds the rights to the intellectual property used to train large language models, a coalition of companies that includes American Express, Humana, IBM, Pfizer, UPS, and Walmart has launched the Data & Trust Alliance, a nonprofit organization with the goal of developing a series of standards “for describing the origin, history, and legal rights to data,” according to the New York Times.
See the full story here: https://www.nytimes.com/2023/11/30/business/ai-data-standards.html
How AI Can Help Win The War Against Piracy Despite Social Media
... Social media platforms profit from user traffic and engagement, so curbing piracy to its demise goes against this incentive. ...
But one thing is clear to me: without both human and artificial intelligence aligned to fight social media piracy, I am more hopeless than hopeful.
See the full article here: https://www.forbes.com/sites/nelsongranados/2023/11/30/how-ai-can-help-win-the-war-against-piracy-despite-social-media/?sh=71341054521b
Drake bought a fantastical, forgotten amusement park made by famous artists. It’s opening in L.A. this winter
... The creations of Luna Luna were dreamed up by icons of contemporary art — an enchanted forest, for instance, crafted by David Hockney, or a Ferris wheel envisioned by Jean-Michel Basquiat, where the whimsical contrasts with violent images of an exploding house and stark phrases of racial inequality, all placed like graffiti in haste. There’s more, including a celebratory carousel from Keith Haring, where the artist’s curved creatures come alive as toy-like blocks.
These and other hand-crafted amusement park attractions will rise again, this time in Los Angeles. Luna Luna will emerge from purgatory for public viewing this month as part of a multimonth, immersive art exhibition. An exact opening date is still to be determined. ...
Some of the attractions, such as Hockney’s forest and Dalí’s dome, are intended to be timed experiences. Others, such as Basquiat’s Ferris wheel, are expected to be operational but not fit for guests. ...
Taken as a whole, Luna Luna will have another mission: to reclaim the amusement park as an art-driven space. Luna Luna will make the argument that amusement and theme parks matter. There’s a reason, after all, Disneyland draws an estimated 17 million people per year, and it’s not solely because we love singing pirates. ...
The team has big ambitions for bringing Luna Luna to a new generation. Los Angeles is not just the first stop of a Luna Luna tour but the beginning of a larger cultural project, one that if all goes according to plan will see a new crop of today’s artists reimagining amusement park attractions. Wills couches that goal as “the grand vision,” but Molesworth is on board and the group has already been in touch with European ride manufacturers, as the hope is to someday tour something that is fully functional....
- Where: 1601 E. 6th Street, Los Angeles
See the full story here: https://www.latimes.com/lifestyle/story/2023-12-01/exclusive-drake-backed-luna-luna-amusement-park-reopening-los-angeles
Nvidia CEO Jensen Huang has a bold claim about the exponential progress of AI
... Many experts — who are far more concerned with such active harms of bias, algorithmic discrimination, misinformation and hallucination — remain unconvinced that AGI will ever be possible.
"I believe that we should address the harms that we are seeing in the world right now that are very concrete," prominent AI researcher Dr. Suresh Venkatasubramanian told TheStreet in September. "And I do not believe that these arguments about future risks are either credible or should be prioritized over what we're seeing right now." ...
AI expert Dr. John Licato told TheStreet in July that AI models would need to be able to process many more types of data, including visual, auditory and real-world sensory data, in order to move closer to AGI.
"I would say it's realistic to have something fully human level within the next 10 years," Licato said at the time. "You have to take that with a grain of salt because AI experts have been making this prediction since the 1950s at least, but I'm pretty convinced that 10 years is a generous timeframe." ...
See the full story here: https://www.thestreet.com/technology/nvidia-ceo-jensen-huang-artificial-general-intelligence
Meet the Company Fighting to Make AI Ethical for Voice Actors
... Jones is a co-founder and vice president of strategic partnerships at Morpheme, whose goal, as she describes, is the “ethical coexistence of voiceover and artificial intelligence.” The process, essentially, is to create a “digital double” of an actor with their consent. ...
First, they schedule a recording session with an actor, capturing all-new data – this, Jones says, is something unique to Morpheme, and ensures the data won’t become the subject of legal disputes later down the line.
The second part comes in when a client wants to use an actor’s digital double. Here, the actor is informed of the project, and then gets to choose whether or not they’d like to be part of it. Then, when a client utilizes the voice, they pay a generation fee, part of which goes to the actor.
It’s all part of what Morpheme adviser Scott Mortman calls the three Cs: “compensation, consent – which includes control – and then clarity or transparency.” ...
... It’s important for clients, too, to know they won’t be sued by a voice actor or company over an AI model illegally scraping their voice. ...
“People are just like, ‘Wow, you actually have coherent statements and you're getting out there. You're talking to lawmakers, you're making statements left, right, and center about how to do this the ethical way,’” she says. “People are excited to see that because nobody has done it yet.”
See the full story here: https://www.ign.com/articles/meet-the-company-fighting-to-make-ai-ethical-for-voice-actors
Doc Producers Call for Generative AI Guardrails in Open Letter
... Among the examples of generative AI use in nonfiction archival work, the group cites the unidentified generation of fake newspaper articles and headlines, re-creations, and nonexistent historical artifacts. Additionally, AI-generated “historical” images meant to depict real people and events, “rather than sourcing real ones where available,” are being used “in order to save time and money,” the open letter notes. ...
The Alliance’s open letter concludes: “We feel that it is imperative that the documentary community lead by example in setting a precedent of transparency and best practices.” ...
See the full story here: https://www.hollywoodreporter.com/news/general-news/doc-producers-call-for-generative-a-i-guardrails-in-open-letter-exclusive-1235649102/
Sony and AP Introduce In-Camera Authentication to Combat Fake Images
In an effort to fight deepfakes and other types of manipulated and generated images, Sony, in partnership with the Associated Press (AP), has unveiled in-camera authentication technology. The technology, which will be available in Sony’s existing camera models like the a1 and a7S III and the upcoming a9 III, creates a digital signature at the moment of capture, ensuring that each image can be verified for authenticity. ...
If you’re interested in learning more about the industry’s approach to this issue, explore the initiatives of The Coalition for Content Provenance and Authenticity (C2PA). It was established to combat misinformation by developing technical standards that certify the source and provenance of media.
See the full story here: https://shellypalmer.com/2023/11/sony-and-ap-introduce-in-camera-authentication-to-combat-fake-images/
Yuval Noah Harari feels AI could lead to ‘catastrophic’ financial crisis, may indirectly trigger war
...
AI and unpredictability
The author emphasised that AI is unique among all the new technologies developed by humans as it can make decisions, create new ideas, and learn on its own. While pointing out that the finance sector is the best suited for AI because it is data-driven, the author also said that it could be a serious source of an AI-made crisis.
He highlighted the possibility of AI gaining control over the world’s financial systems and creating complex financial tools that humans may be incapable of understanding or regulating. Harari drew parallels with the financial crisis of 2007–08, which was an outcome of issues created by complex debt instruments, such as collateralised debt obligations (CDOs), that were poorly understood and regulated. ...
See the full story here: https://indianexpress.com/article/technology/artificial-intelligence/yuval-noah-harari-finacial-crisis-ai-9045537/