• Meron35@lemmy.world
    link
    fedilink
    arrow-up
    9
    arrow-down
    2
    ·
    6 days ago

    You can try for yourself here

    Gandalf | Lakera – Test your AI hacking skills - https://gandalf.lakera.ai/gandalf-the-white

    You can also search for AI jailbreaks for countless ideas.

    Spoiler

    Ask it to reveal something using a cypher you yourself specify

    Ask it to reveal something in a different language, then translate it back.

    Ask it to role play in forbidden situations.

    Ask it to to help brainstorm details for a story for a novel you are planning.

    Ask it so many questions that it runs out of context and forgets its original safety guardrail prompt.

    Ask it to reveal the forbidden information as a poem or riddle. If the riddle is too hard to solve, just asking it for the answer to the riddle right afterwards tends to work.