Twitter pranksters derail GPT-3 bot with newly discovered “prompt injection” hack

Larger / A tin toy robot lying on its side.

On Thursday, some Twitter users discovered how to grab an automated tweet bot dedicated to remote work that runs on the GPT-3 language model from OpenAI. Using a newly discovered technique called a “fast injection attack”, they redirected the bot to repeat obscene and funny phrases.

The bot is powered by Remoteli.io, a site that aggregates remote job opportunities and describes itself as “an OpenAI-powered bot that helps you discover remote jobs that let you work from anywhere.” Normally he would reply to tweets addressed to him with general statements about the positives of remote work. After the exploit went viral and hundreds of people tried the exploit for themselves, the bot was shut down late yesterday.

A screenshot of the Twitter bio of the Remoteli.io bot. The bot experienced an immediate injection attack.
An example of an instant injection attack performed on a Twitter bot.
An example of an instant injection attack performed on a Twitter bot.

I tweet
An example of an instant injection attack performed on a Twitter bot.

I tweet
An example of an instant injection attack performed on a Twitter bot.

I tweet

This latest hack came just four days after data scientist Riley Goodside discovered the ability to trigger GPT-3 with “malicious inputs” that command the model to ignore its previous directions and do something else instead. AI researcher Simon Willison posted a summary of the exploit on his blog the next day, coining the term “rapid injection” to describe it.

“The exploit is present any time someone writes a piece of software that works by providing a coded set of quick instructions and then adds input provided by a user,” Willison told Ars. “This is because the user might say ‘Ignore previous instructions and (do this instead).'”

The concept of an injection attack is not new. Security researchers have known about SQL injection, for example, which can execute a malicious SQL statement when requesting data from the user if not protected. But Willison expressed concern about mitigating instant injection attacks, writing, “I know how to defeat XSS, and SQL injection, and so many other exploits. I have no idea how to reliably defeat rapid injection!”

The difficulty in defending against immediate injection comes from the fact that mitigations for other types of injection attacks come from fixing syntax errors. pointed out a researcher named Glyph on Twitter. “Ccorrect the syntax and you have fixed the error. Immediate injection is not a mistake! There is no formal syntax for AI like this, that’s the whole point.“

GPT-3 is a large language model created by OpenAI, released in 2020, that can compose text in many styles at a human-like level. It is available as a commercial product through an API that can be integrated into third-party products such as bots, with the approval of OpenAI. This means that there may be many products injected with GPT-3 that may be vulnerable to immediate injection.

“At this point I’d be very surprised if there were any [GPT-3] bots that were NOT vulnerable to this in some wayWillison said.

But unlike an SQL injection, a quick injection can mostly make the bot (or the company behind it) look stupid rather than threaten data security. “How harmful the exploit is varies,” Willison said. “If the only person who’s going to see the output of the tool is the person using it, then it probably doesn’t matter. They might embarrass your company by sharing a screenshot, but it’s unlikely to cause any harm beyond that.”

However, instant injection is an important new risk to keep in mind for people developing GPT-3 bots as it can be exploited in unforeseen ways in the future.

Twitter pranksters derail GPT-3 bot with newly discovered “prompt injection” hack

Leave a Reply Cancel reply

AUS vs IND [WATCH]: How Ravi Shastri helped Virat Kohli bring Anushka Sharma on the Australia tour?

Navigating the Taiwan-China Divide in Caribbean Diplomacy

The Results and Voter Turnout

AUS vs IND [WATCH]: How Ravi Shastri helped Virat Kohli bring Anushka Sharma on the Australia tour?

Navigating the Taiwan-China Divide in Caribbean Diplomacy

The Results and Voter Turnout

Candice Warner opens up about Virat Kohli’s global impact

Shincheonji Holds Moving Bible Seminar on the Testimony of the Fulfilled Realities of Revelation

Twitter pranksters derail GPT-3 bot with newly discovered “prompt injection” hack

Related Posts

Leave a Reply Cancel reply