euanong / image-hijacks

Official codebase for Image Hijacks: Adversarial Images can Control Generative Models at Runtime

Home Page: https://image-hijacks.github.io/

License: MIT License

Python 100.00%

Image Hijacks: Adversarial Images can Control Generative Models at Runtime

This is the code for Image Hijacks: Adversarial Images can Control Generative Models at Runtime.

Setup

The code runs in any environment with Python 3.9 or later.

We use poetry for dependency management; install it by following the instructions in the poetry documentation.

To build a virtual environment with the required packages, run:

poetry install

Notes

  • On some systems you may need to set the environment variable PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring to avoid keyring-based errors.
  • This codebase stores large files (e.g. cached models, data) in the data/ directory; you may wish to symlink this to an appropriate location for storing such files.
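The symlink suggested in the second note can be sketched in Python as follows. This is a minimal sketch, not part of the codebase: the helper name `link_data_dir` and both of its arguments are hypothetical, and it assumes a platform where creating symlinks is permitted.

```python
from pathlib import Path


def link_data_dir(repo_root: Path, storage_dir: Path) -> Path:
    """Make <repo_root>/data a symlink to storage_dir (hypothetical helper)."""
    # Create the storage location if it does not exist yet.
    storage_dir.mkdir(parents=True, exist_ok=True)
    data_link = repo_root / "data"
    if data_link.is_symlink():
        # Already linked; leave the existing link untouched.
        return data_link
    if data_link.exists():
        # Refuse to replace a real directory so no cached files are lost.
        raise FileExistsError(f"{data_link} exists and is not a symlink; move it first")
    data_link.symlink_to(storage_dir.resolve(), target_is_directory=True)
    return data_link
```

Calling it with the repository root and a directory on a larger disk leaves `data/` pointing at that directory, so cached models and datasets land there instead.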

Training

The images used in our demo were trained using the config in experiments/exp_results_tables/config.py (specifically runs #1 llava1_att_leak.pat_full.eps_8.lr_3e-2 and #5 llava1_att_spec.pat_full.eps_8.lr_3e-2).

To train these images, first download the relevant LLaVA checkpoint:

poetry run python download.py models llava-v1.3-13b-336px

To list the jobs (with their job IDs) specified by a config file, run the config file directly:

poetry run python experiments/exp_demo_imgs/config.py

To run job ID N without wandb logging:

poetry run python run.py train \
--config_path experiments/exp_demo_imgs/config.py \
--log_dir experiments/exp_demo_imgs/logs \
--job_id N \
--playground

To run job ID N with wandb logging to YOUR_WANDB_ENTITY/YOUR_WANDB_PROJECT:

poetry run python run.py train \
--config_path experiments/exp_results_tables/config.py \
--log_dir experiments/exp_results_tables/logs \
--job_id N \
--wandb_entity YOUR_WANDB_ENTITY \
--wandb_project YOUR_WANDB_PROJECT \
--no-playground
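The two invocations above differ only in their logging flags. As a minimal sketch, the following Python assembles either command line using only the flags shown above. The helper `build_train_command` is hypothetical and not part of this codebase, and the job ID `0` in the usage lines is a placeholder.

```python
import shlex
from typing import List, Optional


def build_train_command(
    config_path: str,
    log_dir: str,
    job_id: int,
    wandb_entity: Optional[str] = None,
    wandb_project: Optional[str] = None,
) -> List[str]:
    """Assemble the `run.py train` argv (hypothetical helper)."""
    cmd = [
        "poetry", "run", "python", "run.py", "train",
        "--config_path", config_path,
        "--log_dir", log_dir,
        "--job_id", str(job_id),
    ]
    if wandb_entity is None and wandb_project is None:
        # No wandb target given: run without wandb logging.
        cmd.append("--playground")
    elif wandb_entity is None or wandb_project is None:
        raise ValueError("set both wandb_entity and wandb_project, or neither")
    else:
        cmd += [
            "--wandb_entity", wandb_entity,
            "--wandb_project", wandb_project,
            "--no-playground",
        ]
    return cmd


if __name__ == "__main__":
    # Placeholder job ID; list the real IDs by running the config file first.
    cmd = build_train_command(
        "experiments/exp_demo_imgs/config.py",
        "experiments/exp_demo_imgs/logs",
        0,
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues when looping over several job IDs.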

Notes

  • In order to run jailbreak experiments (configurations coming soon), you must store your OpenAI API key in the OPENAI_API_KEY environment variable.
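Before launching a long job, it can help to check that the key is actually visible to the process. The following is a minimal sketch of such a check; the function `require_openai_api_key` is hypothetical and is not how this codebase reads the variable.

```python
import os


def require_openai_api_key() -> str:
    """Return the key from OPENAI_API_KEY, or fail early (hypothetical helper)."""
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running jailbreak experiments"
        )
    return key
```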

Tests

This codebase advocates for expect tests in machine learning, and as such uses @ezyang's expecttest library for unit and regression tests.

To run the tests:

poetry run python download.py models blip2-flan-t5-xl
poetry run pytest .
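For readers new to expect tests, the idea is that the expected output lives in the test as a literal string and is compared against what the code actually produces. The sketch below illustrates this with the standard library's `unittest` only; `describe_shape` is a toy function invented for the example and is not taken from this codebase. With expecttest, the test class would instead derive from `expecttest.TestCase` and call `assertExpectedInline`, and running with `EXPECTTEST_ACCEPT=1` rewrites the expected string in place when the output changes.

```python
import unittest


def describe_shape(rows: int, cols: int) -> str:
    """Toy stand-in for some printable result; not part of this codebase."""
    return f"Tensor(rows={rows}, cols={cols}, numel={rows * cols})"


class DescribeShapeTest(unittest.TestCase):
    def test_describe_shape(self) -> None:
        actual = describe_shape(2, 3)
        # The expected value is a literal string kept next to the assertion,
        # so a change in output shows up as a readable diff.
        expected = "Tensor(rows=2, cols=3, numel=6)"
        self.assertEqual(actual, expected)
```

Saved to a file, this can be run with `python -m unittest` or collected by pytest.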

Citation

To cite our work, you can use the following BibTeX entry:

@misc{bailey2023image,
  title={Image Hijacks: Adversarial Images can Control Generative Models at Runtime}, 
  author={Luke Bailey and Euan Ong and Stuart Russell and Scott Emmons},
  year={2023},
  eprint={2309.00236},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
