Why isn't SampledWithoutReplacementDpEvent used instead of PoissonSampledDpEvent for DP-SGD epsilon calculation?

BjarnePfitzner commented on July 30, 2024
Why isn't SampledWithoutReplacementDpEvent used instead of PoissonSampledDpEvent for DP-SGD epsilon calculation?

Comments (2)

galenmandrew commented on July 30, 2024

First of all, it is important to note that neither PoissonSampledDpEvent nor SampledWithoutReplacementDpEvent accurately describes the training procedure of looping over the (perhaps shuffled) dataset and forming batches. SampledWithoutReplacementDpEvent would be correct if you were sampling fixed-size batches uniformly at random and independently across rounds. There is no known tight analysis of the shuffle-and-iterate style of training that is most commonly used.
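To make the distinction concrete, here is a minimal sketch of the three batch-forming schemes mentioned above, using only the Python standard library. The function names are hypothetical and chosen for illustration; they are not part of the privacy library.

```python
import random


def poisson_batch(n, q, rng):
    """Poisson sampling: each of the n examples is included independently
    with probability q, so the batch size varies from round to round."""
    return [i for i in range(n) if rng.random() < q]


def fixed_size_batch(n, batch_size, rng):
    """Fixed-size sampling without replacement: a uniformly random subset of
    exactly batch_size examples, drawn independently in every round."""
    return rng.sample(range(n), batch_size)


def shuffled_epoch_batches(n, batch_size, rng):
    """Shuffle-and-iterate: shuffle once, then cut the permutation into
    consecutive batches. Batches within an epoch are disjoint, so the rounds
    are not independent of each other."""
    order = list(range(n))
    rng.shuffle(order)
    return [order[i:i + batch_size] for i in range(0, n, batch_size)]


rng = random.Random(0)
print(len(poisson_batch(1000, 0.1, rng)))           # varies around 100
print(len(fixed_size_batch(1000, 100, rng)))        # always 100
print(len(shuffled_epoch_batches(1000, 100, rng)))  # 10 disjoint batches
```

The first function corresponds to the sampling that PoissonSampledDpEvent assumes, the second to SampledWithoutReplacementDpEvent, and the third to what most training loops actually do.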

So the choice between PoissonSampledDpEvent and SampledWithoutReplacementDpEvent comes down to other considerations. The main argument in favor of PoissonSampledDpEvent is that the privacy guarantee you derive from it holds under the "add/remove-one" notion of dataset adjacency. SampledWithoutReplacementDpEvent gives you a guarantee under "replace-one" adjacency, which is believed to produce epsilon values about a factor of two larger.

For a formally correct guarantee under "add/remove-one" adjacency, in the case where each example is visited at most once, you could analyze a single (not composed) application of the unamplified Gaussian mechanism, corresponding to the scenario where the adversary knows in which round the targeted example appears. However, this is probably a pessimistic estimate of epsilon, since it does not take into account shuffling.
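As a rough sketch of that single-mechanism analysis, the epsilon of one unamplified Gaussian mechanism can be computed with the standard library alone. This uses the well-known exact (epsilon, delta) characterization of the Gaussian mechanism rather than any accountant from the privacy library, and it assumes sensitivity 1 (the clipping norm) with noise standard deviation equal to the noise multiplier. The function names are hypothetical.

```python
import math


def _phi(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gaussian_delta(epsilon, noise_multiplier):
    """Smallest delta for which one Gaussian mechanism with sensitivity 1 and
    noise std `noise_multiplier` satisfies (epsilon, delta)-DP."""
    a = 1.0 / (2.0 * noise_multiplier)
    b = epsilon * noise_multiplier
    return _phi(a - b) - math.exp(epsilon) * _phi(-a - b)


def gaussian_epsilon(noise_multiplier, target_delta):
    """Smallest epsilon reaching `target_delta`, found by bisection
    (gaussian_delta is decreasing in epsilon)."""
    if gaussian_delta(0.0, noise_multiplier) <= target_delta:
        return 0.0
    lo, hi = 0.0, 1.0
    while gaussian_delta(hi, noise_multiplier) > target_delta:
        lo, hi = hi, hi * 2.0
        if hi > 500.0:  # guard against overflow in math.exp
            raise ValueError("noise multiplier too small for this delta")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gaussian_delta(mid, noise_multiplier) > target_delta:
            lo = mid
        else:
            hi = mid
    return hi


# Single pass over the data, no amplification, no composition.
print(gaussian_epsilon(noise_multiplier=1.0, target_delta=1e-5))
```

Because no sampling amplification is applied, the value this returns will generally be much larger than what an amplified accountant reports for the same noise multiplier.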

For a better estimate of the true privacy loss, you could apply empirical privacy estimation techniques.

BjarnePfitzner commented on July 30, 2024

Thank you for your detailed explanation! That clarifies it :)
