
Official implementation of "When Machine Unlearning Jeopardizes Privacy" (ACM CCS 2021)

License: GNU General Public License v3.0

machine-unlearning membership-inference-attack machine-learning


Unlearning-Leaks

This repository contains the implementation for When Machine Unlearning Jeopardizes Privacy (CCS 2021).

To run the code, first download the dataset, then train the target and shadow models, and finally launch the attacks described in our paper.

Requirements

conda create --name unlearningleaks python=3.9
conda activate unlearningleaks
pip3 install scikit-learn pandas opacus tqdm psutil
pip3 install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1+cu111 -f https://download.pytorch.org/whl/cu111/torch_stable.html

Directory tree

.
├── LICENSE
├── __init__.py
├── config.py
├── data_prepare.py
├── exp.py
├── lib_unlearning
│   ├── attack.py
│   ├── construct_feature.py
│   └── record_split.py
├── main.py
├── models.py
├── parameter_parser.py
├── readme.md
├── temp_data
│   ├── attack_data
│   ├── attack_models
│   ├── dataset
│   ├── processed_dataset
│   ├── shadow_models
│   ├── split_indices
│   └── target_models
└── utils.py

Data Preparation

Toy examples

###### Step 1: Train Original and Unlearned Models ######
python main.py --exp model_train

###### Step 2: Membership Inference Attack under Different Settings ######

###### UnlearningLeaks in 'Retraining from scratch' ######
python main.py --exp mem_inf --unlearning_method scratch

###### UnlearningLeaks in 'SISA' ######
python main.py --exp model_train --unlearning_method sisa
python main.py --exp mem_inf --unlearning_method sisa

###### UnlearningLeaks in 'Multiple intermediate versions' ######
python main.py --exp mem_inf --samples_to_evaluate in_out_multi_version

###### UnlearningLeaks in 'Group Deletion' ######
python main.py --exp model_train --shadow_unlearning_num 10 --target_unlearning_num 10
python main.py --exp mem_inf --shadow_unlearning_num 10 --target_unlearning_num 10

###### UnlearningLeaks in 'Online Learning' ######
python main.py --exp model_train --samples_to_evaluate online_learning
python main.py --exp mem_inf --samples_to_evaluate online_learning

###### UnlearningLeaks against 'the remaining samples' ######
python main.py --exp mem_inf --samples_to_evaluate in_in
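
The commands above drive the full pipeline; the core attack intuition can be sketched in a few lines. The snippet below is an illustrative toy, not the repository's actual code: the synthetic posteriors, the absolute-difference feature, and the logistic-regression attack model are all assumptions made for demonstration. The idea it shows is the one from the paper: how much a sample's posterior changes between the original and the unlearned model leaks whether that sample was deleted.

```python
# Toy sketch of the posterior-difference membership inference idea.
# All data here is synthetic; the real pipeline trains shadow models instead.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, k = 1000, 10  # samples per class, number of output classes

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical posteriors: deleted (member) samples shift the model's output
# noticeably after unlearning, non-members barely change it.
orig = softmax(rng.normal(size=(2 * n, k)))
shift = np.concatenate([rng.normal(scale=1.0, size=(n, k)),    # members
                        rng.normal(scale=0.05, size=(n, k))])  # non-members
unlearned = softmax(np.log(orig + 1e-12) + shift)

# One simple feature choice: elementwise absolute posterior difference.
features = np.abs(orig - unlearned)
labels = np.r_[np.ones(n), np.zeros(n)]  # 1 = deleted sample (member)

# Attack model: a plain logistic regression stands in for the paper's
# attack classifier (trained on shadow-model data in the real setup).
attack = LogisticRegression(max_iter=1000).fit(features[::2], labels[::2])
acc = attack.score(features[1::2], labels[1::2])
print(f"attack accuracy: {acc:.2f}")
```

Even this linear attack separates members from non-members well on the toy data, because deleted samples produce systematically larger posterior differences.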

Citation

@inproceedings{chen2021unlearning,
  author    = {Min Chen and Zhikun Zhang and Tianhao Wang and Michael Backes and Mathias Humbert and Yang Zhang},
  title     = {When Machine Unlearning Jeopardizes Privacy},
  booktitle = {{ACM} {SIGSAC} Conference on Computer and Communications Security (CCS)},
  year      = {2021}
}

Related Work

[1] How to Combine Membership-Inference Attacks on Multiple Updated Models [Code]


Issues

The neg samples had exactly the same output on the original model and the unlearning model

I was testing the DT model on Adult and found that the negative samples had exactly the same output on the original model and the unlearned model, which is why the difference attack works so well. But shouldn't it be impossible for both models to produce exactly the same output for the same sample? Is there some detail I am overlooking?

This is the posterior-difference dataset used by the attack model in this project; many negative samples are duplicates whose posteriors are exactly the same.

Where is the implementation of the classical MIA in this paper?

Hi, I've checked the paper and the code thoroughly, and I cannot determine which classical MIA method you used for comparison. Could you please provide more details on the following:

  1. Does the classical MIA only determine <in,out>/<out,out> samples, i.e., the samples to unlearn?
  2. Can the classical MIA access the unlearned model, like the related work mentioned on the code page? For example, using LiRA to combine the two models?
  3. Which classical MIA approach exactly was used in the experiment?

I'm looking forward to your reply. Thanks!

It looks like some code is missing from the experiment.

I am sorry to bother you. I ran the MIA experiment, but some parts are missing. In line 345, the function _obtain_posterior() is incomplete. Could you please help to fix the error?

def _obtain_posterior(self, num_sample, num_shard, sample_name, save_path):
    pass

Some problems with the Insta-NY & Insta-LA datasets

Sorry to bother you. I've checked the paper, the related papers, and the code thoroughly, but I cannot figure out how or where to get these two datasets. The datasets were collected via the Instagram and Foursquare APIs, but I only get an invalid page now. Could you provide a URL for the datasets, or another way to download them? Thanks a lot.
