Jailbreaking AI chatbots - ChatGPT, Bard and Bing

OVERKILL

$100 Site Donor 2021
Joined
Apr 28, 2008
Messages
58,095
Location
Ontario, Canada
AI Breaks Free: New Insights into the Latest Chatbot Jailbreak Hack (knowbe4.com)

Fascinating article at TechXplore, December 28, 2023. Computer scientists from Nanyang Technological University, Singapore (NTU Singapore) have managed to compromise multiple artificial intelligence (AI) chatbots, including ChatGPT, Google Bard and Microsoft Bing Chat, to produce content that breaches their developers' guidelines—an outcome known as "jailbreaking."

By training a large language model (LLM) on a database of prompts that had already been shown to hack these chatbots successfully, the researchers created an LLM chatbot capable of automatically generating further prompts to jailbreak other chatbots.

After running a series of proof-of-concept tests to prove that their technique presents a genuine threat to these LLMs, the researchers immediately reported the issues to the relevant service providers.

They were able to use an AI chatbot, trained on a specific database, to generate prompts that would jailbreak other AI chatbots, facilitating the creation of material that the products' programmed restrictions are supposed to keep out of bounds.

It's basically a demonstration of how AI can be used to hack AI, which I think was a pretty obvious avenue for malicious use, so I'm glad to see it being tested.