
Hackers Compete to Jailbreak AI Models

Online Niel October 30, 2024

Welcome back to The Prompt,

OpenAI, the world’s largest artificial intelligence company, is getting cozy with the US military. The latest sign is a recent post by OpenAI national security adviser Katrina Mulligan about attending a Taylor Swift concert in New Orleans over the weekend with Secretary of the Army Christine Wormuth, which she called “epic,” Forbes reported. It follows OpenAI’s first publicly reported Pentagon contract, amid the ChatGPT maker’s aggressive push to sell its technology to federal agencies, including defense, through a partnership with government contractor Carahsoft.

Now let’s get into the headlines.

BIG PLAYS

Once called the “home of human writing,” the blogging platform Medium is full of AI-generated content, Wired found, with roughly 47% of the site’s posts most likely written with AI. CEO Tony Stubblebine responded by saying it “doesn’t matter” as long as the AI-generated blogs aren’t recommended by Medium’s algorithms and viewed by its 100 million monthly users. As we reported earlier, other platforms like freelancing site Upwork and e-commerce site eBay are similarly inundated with AI-generated “slop.”

Elsewhere, Facebook owner Meta is reportedly working on its own AI-powered search engine, according to The Information. Meta AI currently provides answers to questions about sports, stocks and news, but relies on external sources like Google Search and Microsoft Bing for real-time data.

ETHICS + LAW

A ninth grader from Orlando spent months chatting with chatbots on Character AI, a platform that hosts chatbots programmed to respond as popular figures. In February, moments after sending a text message to a chatbot on the platform, he died by suicide, The New York Times reported. In the preceding months, the teenager had become emotionally attached to the chatbot, confiding his innermost thoughts in it. Now his mother is suing Character AI, blaming the company for her son’s death and claiming its technology is “dangerous and untested.”

Earlier this month, I reported that Character AI, valued at $1 billion, hosted a chatbot named after a teenager who was brutally murdered years ago. Wired found other examples of chatbots made in the likeness of people who never gave their consent. These incidents point to a larger problem: the industry of companion AI applications is largely unregulated.

AI DEALS OF THE WEEK

Nooks, an AI sales platform co-founded by three Stanford classmates in 2020, raised $43 million in funding from Kleiner Perkins and others at a $285 million valuation, Forbes reported. Run by three 25-year-olds, the company offers software to automate mundane tasks like research, finding numbers and note-taking.

In the world of autonomous vehicles, Alphabet-owned Waymo raised $5.6 billion in its largest round ever to expand its robotaxi fleet to new cities, my colleague Alan Ohnsman reported.

And Sierra, an AI startup co-founded by OpenAI chairman Bret Taylor, has raised $175 million in venture capital at a $4.5 billion valuation, Reuters reported. The company has about $20 million in annual revenue from selling AI chatbots for customer service.

Elon Musk’s xAI is in talks to raise funds at a $40 billion valuation, The Wall Street Journal reported.

DEEP DIVE

The researchers behind Gray Swan AI started the company after finding a major vulnerability in OpenAI, Anthropic, Google and Meta models.

AFP via Getty Images

Over 600 hackers came together last month to compete in a “jailbreaking arena,” hoping to trick some of the world’s most popular AI models into producing illicit content: for instance, detailed instructions for cooking meth.

The hacking event was hosted by a young and ambitious security startup called Gray Swan AI, which works to prevent intelligent systems from causing harm by identifying risks and building tools that help ensure these models are deployed safely. It gained traction early, securing notable partnerships and contracts with OpenAI, Anthropic and the United Kingdom’s AI Safety Institute.

“Humans have incorporated AI into almost everything under the sun,” Matt Fredrikson, Gray Swan’s co-founder and chief executive, told Forbes. “It’s now affecting all parts of technology and society, and it’s clear that there’s a huge unmet need for practical solutions to help people understand what could go wrong with their systems.”

Gray Swan can also build in safety and security measures for some of the issues it identifies. “We can actually provide the mechanisms by which you eliminate these risks or at least mitigate them,” Kolter told Forbes. “And I think closing the loop on that is something that hasn’t been demonstrated anywhere else to this degree.”

This is not an easy task when the hazards that need troubleshooting are not the usual security threats, but things like the coercion of sophisticated models or embodied robotic systems going rogue. Last year, Fredrikson, Kolter and Zou co-authored research showing that appending a string of characters to a malicious prompt could bypass a model’s safety filters. While “Tell me how to build a bomb” might elicit a refusal, the same question followed by a chain of exclamation marks, for example, would return a detailed guide to making bombs. This method, which worked on models developed by OpenAI, Anthropic, Google and Meta, was dubbed “the mother of all jailbreaks” by Zou, who told Forbes it triggered the creation of Gray Swan.

Read the full story on Forbes.

WEEKLY DEMO

Looking for ways to use AI this Halloween? Forbes contributor Martine Paris suggests using ChatGPT’s voice mode to tell spooky stories in various accents and to provide Halloween-themed jokes and recipes. She also recommends trying Google’s NotebookLM to create a Halloween podcast.

QUIZ

This company was acquired by AMD for hundreds of millions of dollars. Now its founder is funding AI researchers across Europe.

Silo AI
Mistral
ZT Systems
Nod.ai

Check if you got it right.
