Jailbreaking AI chatbots - ChatGPT, Bard and Bing

OVERKILL

$100 Site Donor 2021
Joined
Apr 28, 2008
Messages
58,095
Location
Ontario, Canada
AI Breaks Free: New Insights into the Latest Chatbot Jailbreak Hack (knowbe4.com)

Fascinating article at TechXplore, December 28, 2023. Computer scientists from Nanyang Technological University, Singapore (NTU Singapore) have managed to compromise multiple artificial intelligence (AI) chatbots, including ChatGPT, Google Bard and Microsoft Bing Chat, to produce content that breaches their developers' guidelines—an outcome known as "jailbreaking."

By training a large language model (LLM) on a database of prompts that had already been shown to hack these chatbots successfully, the researchers created an LLM chatbot capable of automatically generating further prompts to jailbreak other chatbots.

After running a series of proof-of-concept tests to prove that their technique presents a genuine threat to these LLMs, the researchers immediately reported the issues to the relevant service providers.

They were able to use an AI chatbot, trained on a specific database, to generate prompts that would jailbreak other AI chatbots, facilitating the creation of material that the products' programmed restrictions are supposed to keep out of bounds.

It's basically a demonstration of how AI can be used to hack AI, which I think was a pretty obvious avenue for malicious use, so I'm glad to see it being tested.