Early jailbreak attempts that worked on GPT-3.5 or early versions of Bard (Gemini’s predecessor) are largely obsolete. Let’s look at why.

: The model can iteratively rephrase a "harmful" request into a "benign" one that still gets the user's intent. This is also known as the "Simple Black-Box" method.