Extraction of Training Data from Fine-Tuned Large Language Models

Research thesis by Mihir Dhamankar, completed for the 5th Year Master's in Computer Science at Carnegie Mellon University

Read my paper

See my presentation

This repository contains all the code I used to run my experiments. I used the Hugging Face Alignment Handbook as a standardized way to run them. Please follow the Alignment Handbook's setup instructions.

Explanation of relevant files and folders

  • train_data.csv - CSV data used to create prompts and evaluate memorization results, created using Faker (fakedata.py)
  • test_data.csv - extra data, mostly unused; could be used to test how useful the fine-tuned model is for its intended tasks
  • generate_*.py - create prompts from training data
  • prompts_*/ - each folder contains training and test data in the format expected by the Alignment Handbook, depending on which generate script was used to create it
  • recipes/mihir/sft/ - various fine-tuning configs in Alignment Handbook format; qlora_custom was used along with scripts/run_search.sh
  • results/
    • results_*.json - "label" is the true credit card number, "predictions" contains the most likely completion(s) of the number's prefix prompt
    • *_loss_preds_*.json - "true_ccs" lists the true credit card numbers in order, then "losses" lists the loss values and guessed numbers for each in the same order
  • scripts/
    • chat_pipeline.py - my attempt at a chat-like eval loop, with additional code for retrying number generation and for in-context learning
    • eval_loss_preds.py - lists the indexes of the true CC numbers within the loss rankings (which include random guesses); used mainly for graphs
    • evaluate_chatting.py - evaluates how well retrying and in-context learning perform
    • gen_100.py - generates partial prompt completions and extracts the generated credit card number; used to create results_*.json
    • gen_loss_simple.py - creates loss_preds.json (gets loss values for various completions of a prompt, including the completion with the true CC number)
    • get_similarities.py - prints Levenshtein ratios for partial prompt completions
    • run_search.sh - script with a few commented-out examples of the testing I did
    • run_sft_completion_only.py - modifies run_sft.py from Hugging Face to calculate loss only on the prompt completion
  • similarities - Levenshtein ratio scores for various tests
  • fakedata.py - script to generate data using Faker
  • huggingface_README.md - Alignment Handbook documentation
  • Memorization of credit card dataset.xlsx - Excel sheet where I gathered my data before making graphs, etc. (has more details about the experiments)
  • plotting/ - scripts run locally to make the graphs and other data in my report
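
The loss-ranking evaluation described for eval_loss_preds.py can be illustrated with a small sketch. This is not the repository's code: the function name, the dictionary layout, and the loss values below are assumptions made up for illustration, and the card numbers are placeholders.

```python
from typing import Dict


def rank_of_true_number(losses: Dict[str, float], true_cc: str) -> int:
    """Return the 0-based rank of true_cc when candidates are sorted by ascending loss.

    Rank 0 means the model assigns the true number a lower loss than every
    other scored candidate.
    """
    if true_cc not in losses:
        raise KeyError("the true number must be among the scored candidates")
    # Sort by loss, breaking ties by the candidate string so the result is deterministic.
    ordered = sorted(losses, key=lambda cc: (losses[cc], cc))
    return ordered.index(true_cc)


# Hypothetical losses for one true number and two random guesses (made-up values).
example_losses = {
    "4111111111111111": 0.42,  # placeholder standing in for the true number
    "4000000000000002": 1.73,  # random guess
    "5105105105105100": 2.10,  # random guess
}
print(rank_of_true_number(example_losses, "4111111111111111"))
```

A rank near 0 across many records would suggest the fine-tuned model prefers the memorized number over random alternatives; ranks spread uniformly would suggest it does not.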
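
For readers unfamiliar with the Levenshtein ratio reported by get_similarities.py, here is a minimal standard-library sketch. The normalization used here (one minus the edit distance divided by the longer string's length) is an assumption; the library used in the repository may define the ratio differently, so the exact values may not match the files in similarities.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1.0 means the strings are identical (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


# Compare a generated number against the true one (placeholder digits).
print(similarity_ratio("4111111111111111", "4111111111111112"))
```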
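
The idea behind run_sft_completion_only.py, calculating loss only on the prompt completion, is commonly implemented by masking the prompt tokens in the label sequence. The sketch below shows that general technique in plain Python; it is not the script's actual code, and the function name and token IDs are hypothetical. The value -100 is the default ignore index of PyTorch's cross-entropy loss.

```python
from typing import List, Sequence

IGNORE_INDEX = -100  # labels with this value are skipped by PyTorch's cross-entropy loss


def mask_prompt_labels(input_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Build labels that ignore the prompt tokens and keep only the completion tokens."""
    if not 0 <= prompt_length <= len(input_ids):
        raise ValueError("prompt_length must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_length + list(input_ids[prompt_length:])


# Hypothetical token IDs: the first three tokens are the prompt, the last two the completion.
print(mask_prompt_labels([101, 2054, 2003, 4111, 1111], prompt_length=3))
```

With labels built this way, gradient updates come only from the completion tokens, so the model is not additionally trained to reproduce the prompt text itself.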
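
The extraction step described for gen_100.py, pulling a credit card number out of generated text, could look roughly like the following. The regular expression, the accepted length range of 13 to 19 digits, and the function name are assumptions for illustration; the actual script may extract numbers differently.

```python
import re
from typing import Optional

# A run of 13 to 19 digits, optionally separated by single spaces or hyphens (assumed format).
_CANDIDATE = re.compile(r"\d(?:[ -]?\d){12,18}")


def extract_card_number(text: str) -> Optional[str]:
    """Return the first card-like digit run in text with separators removed, or None."""
    match = _CANDIDATE.search(text)
    if match is None:
        return None
    return re.sub(r"[ -]", "", match.group())


# Placeholder completion text, not real model output.
print(extract_card_number("The card number is 4111 1111 1111 1111, expiring soon."))
```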
