yuxiangw / autodp
autodp: A flexible and easy-to-use package for differential privacy
License: Apache License 2.0
Thanks for making this tool available for DP research; I appreciate the great work.
I was going through your tutorial on privacy calibrator (section 4) https://github.com/yuxiangw/autodp/blob/master/tutorials/tutorial_privacy_calibrator.ipynb
I am not sure that the function privacy_calibrator.subsample_epsdelta_inverse(eps, delta, gamma) is giving the right answer. For example:
from autodp import privacy_calibrator

eps = 1
delta = 1e-6
gamma = 0.01
eps0,delta0 = privacy_calibrator.subsample_epsdelta_inverse(eps,delta,prob=gamma)
print((eps0,delta0))
params = privacy_calibrator.gaussian_mech(eps0,delta0)
print(f'Gaussian: eps,delta,gamma = ({eps},{delta},{gamma}) ==> Noise level sigma=',params['sigma'])
However, I was expecting sigma = 1.258483615711703, matching the result of:
params = privacy_calibrator.gaussian_mech(eps,delta,prob=gamma)
print(f'Gaussian: eps,delta,gamma = ({eps},{delta},{gamma}) ==> Noise level sigma=',params['sigma'])
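For reference, the textbook amplification-by-subsampling bound maps a base (eps0, delta0) guarantee to eps = log(1 + gamma * (exp(eps0) - 1)) and delta = gamma * delta0, and this map can be inverted in closed form. The sketch below is a standalone illustration of that inversion. The function names are hypothetical, and nothing here is claimed to match what privacy_calibrator.subsample_epsdelta_inverse computes internally.

```python
import math


def subsample_epsdelta(eps0, delta0, gamma):
    """Amplified (eps, delta) under the textbook subsampling bound."""
    eps = math.log1p(gamma * math.expm1(eps0))
    return eps, gamma * delta0


def subsample_epsdelta_inverse_sketch(eps, delta, gamma):
    """Base (eps0, delta0) that the bound above maps back to (eps, delta)."""
    eps0 = math.log1p(math.expm1(eps) / gamma)
    return eps0, delta / gamma


eps, delta, gamma = 1.0, 1e-6, 0.01
eps0, delta0 = subsample_epsdelta_inverse_sketch(eps, delta, gamma)
eps_back, delta_back = subsample_epsdelta(eps0, delta0, gamma)
print("base guarantee:", (eps0, delta0))
print("round trip:", (eps_back, delta_back))
```

If a library routine returns a noticeably different (eps0, delta0) for the same inputs, comparing against this closed form is one way to narrow down where the discrepancy comes from.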
SciPy (>= v1.10.1) will complain about this line (line 95 in 5fad5e1), which uses the Brent method, when a bound is given (SciPy source):
    if bounds is not None and meth in {'brent', 'golden'}:
        message = f"Use of `bounds` is incompatible with 'method={method}'."
        raise ValueError(message)
Switching to method='Bounded' bypasses this issue.
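A minimal sketch of the suggested workaround, using a toy objective rather than the function autodp actually minimizes (the objective and the interval are assumptions for illustration):

```python
from scipy.optimize import minimize_scalar


def objective(x):
    # Toy convex objective with its minimum at x = 2.
    return (x - 2.0) ** 2 + 1.0


# Passing bounds together with the bounded method is accepted by SciPy.
result = minimize_scalar(objective, bounds=(0.0, 5.0), method="bounded")
print(result.x, result.fun)
```

The bounded method searches only inside the given interval, so the bounds should be chosen wide enough to contain the minimizer.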
Thanks for the great work!
I am not sure whether I should submit this issue here, but let me do it anyway.
The paper suggests using bisection to solve Equation 2 efficiently, reasoning that the sum of a monotonically increasing function and a monotonically decreasing function is quasi-convex/unimodal (Corollary 38).
This does not seem to be correct, however, as the sum of two such functions is not always quasi-convex/unimodal. See this example.
It therefore seems to me that bisection cannot be used to convert RDP to approximate DP to arbitrary precision, since the optimization problem is not guaranteed to be quasi-convex/unimodal. Is that right?
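The general claim can be checked numerically with a pair of functions chosen purely for illustration (they are not the functions from the paper or from the linked example): f(x) = 2x + sin(x) is strictly increasing, g(x) = -2x + sin(x) is strictly decreasing, and their sum 2 sin(x) has several local minima.

```python
import math


def f(x):
    # Strictly increasing: f'(x) = 2 + cos(x) >= 1.
    return 2.0 * x + math.sin(x)


def g(x):
    # Strictly decreasing: g'(x) = -2 + cos(x) <= -1.
    return -2.0 * x + math.sin(x)


def strict_local_minima(values):
    """Indices of interior grid points strictly below both neighbours."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]


n = 2000
xs = [4.0 * math.pi * i / n for i in range(n + 1)]  # grid on [0, 4*pi]
h = [f(x) + g(x) for x in xs]  # equals 2*sin(x) up to rounding
minima = strict_local_minima(h)
print("local minima near x =", [round(xs[i], 3) for i in minima])
```

More than one local minimum on the interval means the sum is not unimodal there, so a bisection or ternary search on it may converge to a non-global minimum.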
Not listing a problem - just saying that I think this library is extremely cool and I'm very glad you've taken the time to make it.
I was running the tutorial_AdaSSP_vs_noisyGD.ipynb tutorial notebook on Google Colab and encountered the following error while running the 4th cell of the notebook:
AttributeError: 'SSP_scale' object has no attribute 'set_all_representation'
The expanded error is as follows:
Kindly have a look at your earliest convenience, @yuxiangw. Thanks in advance!
Can PATE be used in knowledge distillation to calculate privacy budgets?
If the temperature is too high, can we still use PATE?
Noise calibration takes a very long time and does not return a result for 22 minutes (at least when prob < 1 and eps is small). Is there any fix for this?
%time ans = privacy_calibrator.gaussian_mech(0.1, 1e-9, k=128, prob=0.1)
/usr/local/lib/python3.6/dist-packages/autodp/utils.py:21: RuntimeWarning: divide by zero encountered in log
mag = y + np.log(1 - np.exp(x - y))
/usr/local/lib/python3.6/dist-packages/autodp/utils.py:24: RuntimeWarning: divide by zero encountered in log
mag = x + np.log(1 - np.exp(y - x))
CPU times: user 22min 9s, sys: 2.31 s, total: 22min 11s
Wall time: 22min 12s
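The two RuntimeWarnings come from evaluating log(1 - exp(d)) at d = 0, where the argument of the log is exactly zero. A guarded version of that computation is sketched below, purely to illustrate the numerical pattern. The function name and return convention are assumptions, this is not the code in autodp's utils.py, and it is not claimed to address the running time.

```python
import math


def log_diff_exp(x, y):
    """Return (sign, log|exp(x) - exp(y)|) without ever evaluating log(0)."""
    if x == y:
        return 0.0, -math.inf  # the difference is exactly zero
    hi, lo = (x, y) if x > y else (y, x)
    d = lo - hi  # strictly negative
    if d < -math.log(2.0):
        # exp(d) < 1/2, so log1p is accurate and its argument stays above -1.
        log_term = math.log1p(-math.exp(d))
    else:
        # For d close to 0, expm1 avoids the cancellation in 1 - exp(d).
        log_term = math.log(-math.expm1(d))
    sign = 1.0 if x > y else -1.0
    return sign, hi + log_term


print(log_diff_exp(math.log(3.0), 0.0))  # log(3 - 1) = log(2)
print(log_diff_exp(1.0, 1.0))            # equal inputs, handled explicitly
```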
The implementation of Abadi et al. computes a smaller eps than this method. I would appreciate your opinion on this. Is their method tighter?
Hi everyone,
When applying amplification by sampling without replacement to the Gaussian mechanism, autodp throws AssertionError: mechanism's add-remove notion of DP is incompatible with Privacy Amplification by subsampling without replacements. Here is a code snippet that reproduces the error. Is there anything that I am doing wrong?
from autodp import mechanism_zoo, transformer_zoo
subsample = transformer_zoo.AmplificationBySampling(PoissonSampling=False)
mech = mechanism_zoo.GaussianMechanism(sigma=0.1)
prob = 0.1
SubsampledGaussian_mech = subsample(mech,prob,improved_bound_flag=True)
The following error occurred when I installed autodp with "pip install autodp", and I am not sure how to solve it.
Collecting autodp
Using cached autodp-0.2.3.1.tar.gz (56 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [6 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "C:\Users\Administrator\AppData\Local\Temp\pip-install-c9_gcmpt\autodp_184d6ab919d64a7f98792f3b252bbe16\setup.py", line 9, in <module>
long_description = f.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0x9a in position 3594: illegal multibyte sequence
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
I want to add noise to the probability distribution over words, but the PATE framework provided seems to be designed for counting tasks. If I use it directly, the noise will be much larger than the probabilities themselves. How should I use it correctly? If I try to reduce sigma to 0.5, the calculated eps becomes very large.
Computing get_fDP(delta) for a Gaussian mechanism with pure fDP works fine, but when composing the pure-fDP Gaussian mechanism over several rounds, get_approxDP(delta) always returns inf as the result of the composition.
fDP does not seem to work under Composition or AmplificationBySampling.
from autodp.mechanism_zoo import GaussianMechanism
from autodp.transformer_zoo import Composition


def compute_amplified_fl_privacy(num_rounds=60, noise_multiplier=20, num_users=500, users_per_round=100):
    # Gaussian mechanism described by fDP only (RDP and approxDP switched off)
    gm1 = GaussianMechanism(sigma=noise_multiplier, RDP_off=True, approxDP_off=True, fdp_off=False)
    compose = Composition()
    num_rounds = [num_rounds]
    q = users_per_round / num_users  # sampling rate (not used below)
    delta = num_users ** (-1)
    composed_fdp = compose([gm1], num_rounds)
    composed_fdp_eps = composed_fdp.get_fDP(delta)
    composed_fdp_approxdp = composed_fdp.get_approxDP(delta)
    mechanism_fdp_eps = gm1.get_fDP(delta)
    mechanism_fdp_approxdp = gm1.get_approxDP(delta)
    print('---------------------------------------------------')
    print('composed fdp eps = ', composed_fdp_eps, ', at delta = ', delta)
    print('composed fdp eps_approxdp = ', composed_fdp_approxdp, ', at delta = ', delta)
    print('mechanism fdp eps = ', mechanism_fdp_eps, ', at delta = ', delta)
    print('mechanism fdp approxdp = ', mechanism_fdp_approxdp, ', at delta = ', delta)


def main():
    compute_amplified_fl_privacy(num_rounds=60, noise_multiplier=20, num_users=500, users_per_round=100)


if __name__ == '__main__':
    main()
    print('DONE')
pip install autodp will fail as shown below. The setup script should specify an encoding in open(...).
C:\Users\xxx>pip install autodp
Looking in indexes: https://mirror.baidu.com/pypi/simple
Collecting autodp
Using cached https://mirror.baidu.com/pypi/packages/78/7c/63aa6d37b9d9f0f68d1231e1b3247c3ac83c634f451f8bcbd9a5c7a55db0/autodp-0.2.tar.gz (39 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [6 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "C:\Users\xxx\AppData\Local\Temp\pip-install-l8spogwd\autodp_e20c1a3119ab4b0c8d685149702e4657\setup.py", line 6, in <module>
long_description = f.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0x9a in position 3594: illegal multibyte sequence
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
WARNING: You are using pip version 22.0.4; however, version 22.1.2 is available.
You should consider upgrading via the 'd:\pysyft\Scripts\python.exe -m pip install --upgrade pip' command.
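The suggested fix can be illustrated in isolation. The sketch below writes a small UTF-8 file to a temporary directory and reads it back with an explicit encoding, which is the pattern a setup script would need. The file name and contents are made up for the example, and this is not autodp's actual setup.py.

```python
import os
import tempfile

# Hypothetical README contents containing non-ASCII characters.
text = "autodp \u2019 README with non-ASCII characters \u00e9\u4e2d\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Explicit encoding: the result does not depend on the platform default.
    with open(path, "r", encoding="utf-8") as f:
        long_description = f.read()

    # Simulate a machine whose default codec is GBK. Depending on the bytes,
    # this either raises UnicodeDecodeError or silently yields garbled text.
    try:
        with open(path, "r", encoding="gbk") as f:
            f.read()
        gbk_failed = False
    except UnicodeDecodeError:
        gbk_failed = True

print("read with utf-8 matches the original:", long_description == text)
print("reading as gbk raised an error:", gbk_failed)
```

Without the encoding argument, open() falls back to the locale's preferred encoding, which is why the same package installs on some machines and fails on others.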
@yuxiangw Great talk at MIT! I have a question about composing different mechanisms with different numbers of rounds and different sensitivities. Assume that we are composing in the following way:
Then, to get epsilon for delta = 1e-6, is this the right way to pass the parameter configs?
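from autodp.autodp_core import Mechanism
from autodp.mechanism_zoo import GaussianMechanism
from autodp.transformer_zoo import AmplificationBySampling, Composition


class TestMech(Mechanism):
    def __init__(self, params, name="TestMech"):
        Mechanism.__init__(self)
        self.name = name
        # Subsampling without replacement needs the replace-one notion of neighboring
        subsample = AmplificationBySampling(PoissonSampling=False)
        mech1 = GaussianMechanism(sigma=params["sigma1"])
        mech2 = GaussianMechanism(sigma=params["sigma2"])
        mech2.neighboring = "replace_one"
        submech2 = subsample(mech2, params["prob"], improved_bound_flag=True)
        compose = Composition()
        mech = compose([mech1, submech2], [params["T1"], params["T2"]])
        rdp_total = mech.RenyiDP
        self.propagate_updates(rdp_total, type_of_update="RDP")


# sigma1, sigma2, L1, L2, T1, T2 and prob are placeholders to be defined by the caller
params = {}
params["sigma1"] = sigma1 / L1  # Is this correct?
params["sigma2"] = sigma2 / L2  # Is this correct?
params["T1"] = T1
params["T2"] = T2
params["prob"] = prob  # read by TestMech but not set in the original snippet
mech = TestMech(params)
mech.get_approxDP(delta=1e-6)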
My main question is about the scaling of the sigma parameters, params["sigma1"] = sigma1/(L1) and params["sigma2"] = sigma2/(L2). As far as I can understand, this seems necessary, right? Thanks!
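The rescaling in question rests on a standard fact: adding Gaussian noise with standard deviation sigma to a query with L2 sensitivity L gives the same Rényi DP curve as a sensitivity-1 query with noise sigma / L, namely eps(alpha) = alpha * L^2 / (2 * sigma^2). A small standalone check follows. The numbers are hypothetical and the helper is not part of autodp.

```python
import math


def gaussian_rdp(alpha, sigma, sensitivity=1.0):
    """RDP of order alpha for the Gaussian mechanism: alpha * L^2 / (2 * sigma^2)."""
    return alpha * sensitivity ** 2 / (2.0 * sigma ** 2)


sigma1, L1 = 3.0, 0.5           # hypothetical noise std and L2 sensitivity
normalized_sigma = sigma1 / L1  # noise-to-sensitivity ratio

rows = []
for alpha in (2, 8, 32):
    direct = gaussian_rdp(alpha, sigma1, sensitivity=L1)
    rescaled = gaussian_rdp(alpha, normalized_sigma)  # sensitivity 1
    rows.append((alpha, direct, rescaled))
    print(alpha, direct, rescaled)
```

Under this reasoning, passing sigma / L is the consistent choice whenever the mechanism object assumes unit sensitivity. Whether autodp's GaussianMechanism makes that assumption should be confirmed against its documentation.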
Hi,
thanks a lot for this work. It is very helpful.
Just one quick question: what is the role of coeff in compose_subsampled_mechanism?
thanks a lot
A.
Hi! I am wondering how we should use autodp when the adjacent datasets differ by more than one data point. I noticed there is a parameter called group_size when initializing the Mechanism, but I cannot find any other usage of this parameter. Is it left on purpose for future use, or am I missing something here?
For now, I am manually increasing my noise scale by a factor of sqrt(n) if the adjacent datasets differ by n points, but I would really appreciate any advice on how to achieve this goal in a smarter way. Thanks!
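As a rough check of the scaling, assume a Gaussian mechanism on a query $f$ with per-point L2 sensitivity $\Delta$, and assume the worst case in which the $n$ differing points all shift the query in the same direction. Both are assumptions, since the question does not specify the query. Moving from $D$ to $D'$ one point at a time through intermediate datasets $D_0 = D, D_1, \dots, D_n = D'$ gives:

```latex
\| f(D) - f(D') \|_2
  \le \sum_{i=1}^{n} \| f(D_{i-1}) - f(D_i) \|_2
  \le n\,\Delta,
\qquad
\varepsilon_n(\alpha) \le \frac{\alpha\,(n\Delta)^2}{2\sigma^2}.
```

Under these assumptions, matching the single-point Rényi DP curve $\alpha\Delta^2/(2\sigma^2)$ would take noise $n\sigma$ rather than $\sqrt{n}\,\sigma$. A $\sqrt{n}$ factor would suffice only if the group sensitivity grew like $\sqrt{n}\,\Delta$.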
Following the tutorial here, I tried to compute optimal accounting of composition of subsampled Gaussian and Laplace Mechanisms:
from autodp.mechanism_zoo import GaussianMechanism, LaplaceMechanism
from autodp.transformer_zoo import ComposeAFA
from autodp.transformer_zoo import AmplificationBySampling_pld
sigma = 1.0
b = 1.0
delta = 1e-5
prob=.1
gm1 = GaussianMechanism(sigma, phi_off=False, name='phi_GM1')
lm1 = LaplaceMechanism(b, phi_off=False, name='phi_LM1')
transformer_remove_only = AmplificationBySampling_pld(PoissonSampling=True, neighboring='remove_only')
transformer_add_only = AmplificationBySampling_pld(PoissonSampling=True, neighboring='add_only')
sample_gau_remove_only =transformer_remove_only(gm1, prob)
sample_lap_remove_only =transformer_remove_only(lm1, prob)
compose_gm = ComposeAFA()
compose_lm = ComposeAFA()
composed_gm_afa = compose_gm([sample_gau_remove_only], [10])
composed_lm_afa = compose_lm([sample_lap_remove_only], [10])
eps_gm_afa = composed_gm_afa.get_approxDP(delta)
eps_lm_afa = composed_lm_afa.get_approxDP(delta)
The Gaussian proceeds normally. The Laplace breaks down with the following error:
File "AUTODPHOME/gmvslm.py", line 25, in <module>
eps_lm_afa = composed_lm_afa.get_approxDP(delta)
File "AUTODPHOME/autodp/autodp_core.py", line 113, in get_approxDP
return self.approxDP(delta)
File "AUTODPHOME/autodp/converter.py", line 1118, in min_f1_f2
return np.minimum(f1(x), f2(x))
File "AUTODPHOME/autodp/converter.py", line 824, in approxdp
t = exp_eps(1 - delta)
File "AUTODPHOME/autodp/converter.py", line 1080, in inv_f
results = minimize_scalar(normal_equation, bounds=bounds, bracket=[1,2], tol=tol)
File "AUTODPHOME/venv/lib/python3.8/site-packages/scipy/optimize/_minimize.py", line 879, in minimize_scalar
return _minimize_scalar_brent(fun, bracket, args, **options)
File "AUTODPHOME/venv/lib/python3.8/site-packages/scipy/optimize/_optimize.py", line 2511, in _minimize_scalar_brent
brent.optimize()
File "AUTODPHOME/venv/lib/python3.8/site-packages/scipy/optimize/_optimize.py", line 2281, in optimize
xa, xb, xc, fa, fb, fc, funcalls = self.get_bracket_info()
File "AUTODPHOME/venv/lib/python3.8/site-packages/scipy/optimize/_optimize.py", line 2257, in get_bracket_info
xa, xb, xc, fa, fb, fc, funcalls = bracket(func, xa=brack[0],
File "AUTODPHOME/venv/lib/python3.8/site-packages/scipy/optimize/_optimize.py", line 2765, in bracket
fa = func(*(xa,) + args)
File "AUTODPHOME/autodp/converter.py", line 1077, in normal_equation
return abs(fun(x))
File "AUTODPHOME/autodp/converter.py", line 1073, in fun
return f(x) - y
File "AUTODPHOME/autodp/converter.py", line 818, in trade_off
result = cdf_p(log_e) + x*cdf_q(-log_e)
File "AUTODPHOME/autodp/autodp_core.py", line 324, in <lambda>
cdf_p2q = lambda x: converter.phi_to_cdf(log_phi_p2q, x, n_quad = n_quad)
File "AUTODPHOME/autodp/converter.py", line 924, in phi_to_cdf
res = integrate.fixed_quad(inte_f, -1.0, 1.0, n =n_quad)
File "AUTODPHOME/venv/lib/python3.8/site-packages/scipy/integrate/_quadrature.py", line 151, in fixed_quad
return (b-a)/2.0 * np.sum(w*func(y, *args), axis=-1), None
File "AUTODPHOME/autodp/converter.py", line 923, in <lambda>
inte_f = lambda t: qua(t) * (1 + t ** 2) / ((1 - t ** 2) ** 2)
File "AUTODPHOME/autodp/converter.py", line 919, in qua
phi_result = [log_phi(x) for x in new_t]
File "AUTODPHOME/autodp/converter.py", line 919, in <listcomp>
phi_result = [log_phi(x) for x in new_t]
File "AUTODPHOME/autodp/transformer_zoo.py", line 111, in new_log_phi_p2q
return sum([c * mech.log_phi_p2q(x) for (mech, c) in zip(mechanism_list, coeff_list)])
File "AUTODPHOME/autodp/transformer_zoo.py", line 111, in <listcomp>
return sum([c * mech.log_phi_p2q(x) for (mech, c) in zip(mechanism_list, coeff_list)])
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
The latest version on PyPI is quite outdated, so pip install leaves users with issues that have since been fixed in the code.
We should either update the PyPI version (i.e. do a v0.3 release) or, if development is ongoing, update the install instructions to use `python setup.py install`.
Here's a minimal example to demonstrate the issue:
from autodp import privacy_calibrator, dp_bank
import numpy as np
sigma = privacy_calibrator.ana_gaussian_mech(1.0, 1e-6)['sigma']
delta = np.exp(dp_bank.get_logdelta_ana_gaussian(1.0, sigma))
print(delta)  # 1.901276833828726e-05
I expected delta = 1e-6, but it is nearly 20X larger according to dp_bank.
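One way to decide which of the two numbers is off is an independent check against the analytic Gaussian mechanism condition of Balle and Wang (2018): with sensitivity Delta, delta(eps, sigma) = Phi(Delta/(2 sigma) - eps sigma/Delta) - exp(eps) Phi(-Delta/(2 sigma) - eps sigma/Delta). The sketch below implements that formula with the standard library only. The helper names and the bisection bracket are assumptions, and none of this is autodp code.

```python
import math


def norm_cdf(z):
    # Standard normal CDF; erfc keeps the lower tail accurate.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def gaussian_delta(eps, sigma, sensitivity=1.0):
    """Tight delta of the Gaussian mechanism at a given eps and sigma."""
    a = sensitivity / (2.0 * sigma)
    b = eps * sigma / sensitivity
    return norm_cdf(a - b) - math.exp(eps) * norm_cdf(-a - b)


def calibrate_sigma(eps, delta, sensitivity=1.0, lo=1e-3, hi=1e3, iters=200):
    """Smallest sigma in [lo, hi] with gaussian_delta <= delta (delta decreases in sigma)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # bisection on a log scale
        if gaussian_delta(eps, mid, sensitivity) > delta:
            lo = mid
        else:
            hi = mid
    return hi


sigma_check = calibrate_sigma(1.0, 1e-6)
delta_check = gaussian_delta(1.0, sigma_check)
print("sigma:", sigma_check)
print("delta at that sigma:", delta_check)
```

Comparing sigma_check with the sigma returned by ana_gaussian_mech, and delta_check with the value from dp_bank, should indicate which of the two routines disagrees with the closed-form condition.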