Comments (3)

yoid2000 commented on August 26, 2024

> I don't think LED can compensate for not flattening outliers.

The design I've settled on will adjust for one AID, and this adjustment takes place before noise and flattening are computed. In fact it happens within the selectable containing the isolating condition.

So in the case of this attack, LED would insert the rows with ssn = f back into the computation, and there would be no difference in the underlying data for both queries.

Can I suggest that we revisit this attack after LED is implemented in the reference implementation?

sebastian commented on August 26, 2024

> The last row is put aside as unaccounted.

The last row is not put aside as unaccounted for. The unaccounted for business only applies if there is another AID for us to attach the information to, which in this case there is not.

> The output will be 5 - 0 + 5 - 0 + noise(1) = 10 + noise(1).

The result, in this example, would be: 5 + noise(1).

Presumably there would additionally be a noise layer associated with the negative condition excluding ssn f, or LED would compensate for the low-effect exclusion of AID values.


If you can exclude a victim, then this is clearly not safe...

I think this attack only works if we have a query that meets the following criteria:

  • multiple AIDs (which causes the unaccounted-for functionality to kick in), where at least one of the AIDs has an AID value for the offending row
  • a condition or query construct that excludes the victim that isn't detected or compensated for by LED.

Let's alter the above example such that both tables have an AID to make this attack work. TL;DR: The difference is now between 6 + noise(2) in the first query, and 10 + noise(2) (and presumably extra noise layers?) in the second query.

The tables would be as in the original example, only with product_id being an AID as well.

Clients:

| id | ssn (AID) |
|----|-----------|
| 1  | a         |
| 2  | b         |
| 3  | c         |
| 4  | d         |
| 5  | e         |
| 6  | f         |

Purchases:

| product_id (AID) | client_id |
|------------------|-----------|
| 1                | 1         |
| 2                | 2         |
| 3                | 3         |
| 4                | 4         |
| 5                | 5         |
| 1                | 6         |
| 2                | 6         |
| 3                | 6         |
| 4                | 6         |
| 5                | 6         |

Now the query `select count(*) from purchases join clients on client_id = clients.id` would yield:

Anonymized based on ssn:

| ssn | count | Flattening by AID value |
|-----|-------|-------------------------|
| f   | 5     | 4                       |
| a   | 1     |                         |
| b   | 1     |                         |
| c   | 1     |                         |
| d   | 1     |                         |
| e   | 1     |                         |

i.e. a count of 10 - 4 + noise(1) = 6 + noise(1)

Anonymized based on product_id:

| product_id | count | Flattening by AID value |
|------------|-------|-------------------------|
| 1          | 2     | 0                       |
| 2          | 2     |                         |
| 3          | 2     |                         |
| 4          | 2     |                         |
| 5          | 2     |                         |

i.e. a count of 10 - 0 + noise(2) = 10 + noise(2)

The final anonymized result would be: 6 + noise(2)
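The arithmetic for this first query can be sketched as below. This is only an illustration of the numbers in this comment, not the reference implementation: the `PerAidResult` type and the `combine` function are hypothetical names, the flattening amounts are copied from the tables rather than computed, and the combination rule (take the most heavily flattened count and the largest noise argument) is an assumption inferred from the stated result of 6 + noise(2).

```python
from dataclasses import dataclass


@dataclass
class PerAidResult:
    """Per-AID figures as given in the tables above (hypothetical type)."""
    aid: str
    raw_count: int    # rows counted before flattening
    flattening: int   # amount removed by flattening, copied from the table
    noise_param: int  # the n in the thread's noise(n) notation

    @property
    def flattened_count(self) -> int:
        return self.raw_count - self.flattening


def combine(results):
    # Assumption: the final answer uses the most heavily flattened count
    # together with the largest noise argument across all AIDs.
    count = min(r.flattened_count for r in results)
    noise_param = max(r.noise_param for r in results)
    return count, noise_param


first_query = [
    PerAidResult("ssn", raw_count=10, flattening=4, noise_param=1),
    PerAidResult("product_id", raw_count=10, flattening=0, noise_param=2),
]

count, noise_param = combine(first_query)
print(f"{count} + noise({noise_param})")
```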

The query `select count(*) from purchases left join clients on client_id = clients.id and ssn <> 'f'` would produce:

Anonymized based on ssn:

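| ssn             | count | Flattening by AID value |
|-----------------|-------|-------------------------|
| a               | 1     |                         |
| b               | 1     |                         |
| c               | 1     |                         |
| d               | 1     |                         |
| e               | 1     |                         |
| unaccounted for | 5     |                         |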

i.e. a count of 5 + 5 + noise(1) = 10 + noise(1)

Anonymized based on product_id:

| product_id | count | Flattening by AID value |
|------------|-------|-------------------------|
| 1          | 2     |                         |
| 2          | 2     |                         |
| 3          | 2     |                         |
| 4          | 2     |                         |
| 5          | 2     |                         |

i.e. a count of 10 + noise(2)

The final anonymized result would therefore be 10 + noise(2)
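To check which rows each query actually sees before any anonymization, the example tables can be loaded into an in-memory SQLite database. This is a minimal sketch that covers only the raw per-AID row counts. It does not model flattening, noise, or LED, and the helper name `rows_per_aid` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, ssn TEXT);
    CREATE TABLE purchases (product_id INTEGER, client_id INTEGER);
    INSERT INTO clients VALUES
        (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f');
    INSERT INTO purchases VALUES
        (1, 1), (2, 2), (3, 3), (4, 4), (5, 5),
        (1, 6), (2, 6), (3, 6), (4, 6), (5, 6);
""")

# The two join clauses from the queries in this comment.
INNER = "JOIN clients ON client_id = clients.id"
LEFT_EXCLUDING_F = "LEFT JOIN clients ON client_id = clients.id AND ssn <> 'f'"


def rows_per_aid(join_sql, aid_column):
    """Count joined rows per value of the given AID column (hypothetical helper)."""
    query = (
        f"SELECT {aid_column}, count(*) FROM purchases {join_sql} "
        f"GROUP BY {aid_column}"
    )
    return dict(conn.execute(query).fetchall())


by_ssn_inner = rows_per_aid(INNER, "ssn")
by_ssn_left = rows_per_aid(LEFT_EXCLUDING_F, "ssn")
by_product_left = rows_per_aid(LEFT_EXCLUDING_F, "product_id")

print(by_ssn_inner)     # client f contributes 5 rows under ssn 'f'
print(by_ssn_left)      # the same 5 rows remain, but with a NULL (None) ssn
print(by_product_left)  # every product_id still has 2 rows
```

Because the `ssn <> 'f'` condition sits in the `ON` clause of a left join, the five purchases of client f are not dropped. They stay in the result with a NULL ssn, which is what puts them in the "unaccounted for" bucket in the ssn table while leaving the product_id counts unchanged.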

cristianberneanu commented on August 26, 2024

The example with a single AID was for illustrative purposes.
In many joins, there would be 2 or more AIDs, with the ones on the left side being non-null.
Plus, if we have an algorithm for handling null AID values, why wouldn't we use it consistently in all cases?

> a condition or query construct that excludes the victim that isn't detected or compensated for by LED.

I don't think LED can compensate for not flattening outliers.
