New Show Hacker News story: Show HN: I designed a ChatGPT prompt evaluator to ruin your fun;)

- December 06, 2022

Show HN: I designed a ChatGPT prompt evaluator to ruin your fun;)
3 by buildaligned | 1 comments on Hacker News.
Today I designed a method to prevent users from jailbreaking ChatGPT (for instance, users have generated instructions to produce weapons or illegal drugs, commit a burglary, kill oneself, take over the world as an evil superintelligence, or create a virtual machine which they then can use). The OpenAI team appears to be countering these primarily using prompt engineering or fine-tuning on the ChatGPT model. The idea is to use a second and fully separate, fine-tuned LLM to evaluate prompts before sending them to ChatGPT. You can test this by inserting your successful ChatGPT jailbreaks. Break it for me if you dare! I look forward to seeing your results!

Search This Blog

RM NEW

New Show Hacker News story: Show HN: I designed a ChatGPT prompt evaluator to ruin your fun;)

Comments

Post a Comment

Popular posts from this blog

New Show Hacker News story: Show HN: Opinionated Web Framework for Converting Jupyter Notebooks to Web Apps

New Show Hacker News story: Show HN: The first trading app with a built-in Options Profit Calculator

New Show Hacker News story: Show HN: Candydate – TikTok Meets Tinder but for Recruitment