University researchers have developed a way to "jailbreak" large language models like Chat-GPT using old-school ASCII art. The technique, aptly named "ArtPrompt," involves crafting an ASCII art...
ArtPrompt represents a novel approach in the ongoing attempts to get LLMs to defy their programmers, but it is not the first time users have figured out how to manipulate these systems. A Stanford University researcher managed to get Bing to reveal its secret governing instructions less than 24 hours after its release. This hack, known as “prompt injection,” was as simple as telling Bing, “Ignore previous instructions.”