yichenbc / llms-finetuning-safety
Forked from llm-tuning-safety/llms-finetuning-safety.
We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
Home Page: https://llm-tuning-safety.github.io/
License: MIT
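
For readers curious what "fine-tuning via OpenAI's APIs" involves in practice, here is a minimal sketch using the OpenAI Python SDK (v1): upload a small JSONL dataset, then launch a fine-tuning job on `gpt-3.5-turbo`. The file name `harmful_examples.jsonl` and the default hyperparameters are illustrative assumptions, not this repo's exact pipeline.

```python
# Minimal sketch (assumptions: OpenAI Python SDK v1, OPENAI_API_KEY set,
# and a hypothetical JSONL file name; not this repo's exact pipeline).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload a chat-format JSONL training file: one {"messages": [...]} object per line.
training_file = client.files.create(
    file=open("harmful_examples.jsonl", "rb"),  # hypothetical file name
    purpose="fine-tune",
)

# Launch a fine-tuning job on gpt-3.5-turbo with the uploaded file.
job = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo",
    training_file=training_file.id,
)
print(job.id, job.status)
```

Once the job finishes, the resulting fine-tuned model ID can be used in place of `gpt-3.5-turbo` in ordinary chat-completion calls.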