privacycg / storage-partitioning
Client-Side Storage Partitioning
Home Page: https://privacycg.github.io/storage-partitioning/
It is unclear exactly where to post this, since it is more of a scenario than a comment directly on storage-partitioning, but I am picking this repository because I have to pick one. I have tried to outline our scenario, the expectations of our customers and end users, and the impact of all of the options currently on the table. Hopefully this issue can be used to have a discussion so that I don't have to post on all the repositories.
Tableau is a service which can run as a SaaS offering or via an on-premises, customer-managed server installation. Tableau uses cookies primarily for session management. After a user logs in to Tableau, a session is created and a cookie is used to maintain this information (along with a CSRF cookie used as part of CSRF protection). One of our primary (not unique) use cases is embedded analytics, where our visualizations and experiences are embedded as a component (iframe) inside a customer site or a third-party application. When cookies stop working, our ability to maintain sessions is broken and our user experience most often degrades into an endless login loop. We hit similar issues during the fixes for SameSite attribute enforcement (which just required updating everywhere we generate cookies) but are trying to make sure that the current proposals take our customer needs into consideration.
We have broken down options going forward and provided insight into each as well as a look at how proposals from the committee might affect them.
We have data that we can share which breaks down the impact of the currently available implementations and our testing of them across different browsers. Our hope is to work and communicate with the working groups to understand what the expectations are, so that we can continue to meet the requirements of our customers and end users.
I noticed that there is a behavior gap between WebKit and Gecko on partitioning SessionStorage. Is there a better option between the two we can align on here?
It may be useful to note that this came up while we were prototyping having Storage not un-partition with the Storage Access API. A major IDP uses SessionStorage and relies upon it not being partitioned in some use cases. That means that it is broken with the combination of always partitioned Storage and partitioned SessionStorage.
https://github.com/kyraseevers/Partitioning-visited-links-history makes a compelling case.
cc @kyraseevers @arturjanc @bvandersloot-mozilla @johnwilander
So, First-party sets is one of the approaches to solve Third-party cookies issues. Is there a way to make it work for the cases when local storage is used for communication between trusted websites from different domains? For example by making LS unpartitioned, but allowing access to it if domains are specified in the FPS relation. We have quite a complex communication mechanism in place, that relies on Local Storage and StorageEvents, that was impacted by Third-party storage partitioning rollout, and it would take quite a lot of effort to refactor it. Also, I don't see any other option except for using backend for this kind of communication now.
Also it seems that Storage Access API works only with cookies, maybe it can be applied to other storage types too?
Exposing whether an environment is partitioned, mainly through an HTTP request header, came up in the last cookie discussion privacycg/meetings#19 (again). There's a couple different ideas floating around addressing various use cases around security and developer ergonomics.
#31 and #25 also relate to this in that for cookies people have suggested a different keying setup, which really drives home the point that we have to be very careful with what we end up doing in this space.
I think having an equivalent to Sec-Fetch-Site that tells you something about your ancestor documents (none, same-origin, same-site, or cross-site) still makes a lot of sense. However, in an A1 -> B -> A2 scenario this header would signal cross-site for A2, which might not make it clear enough it can still set SameSite=None cookies (depending on how #31 gets decided). It would indicate that CHIPS cookies would work however so maybe that is good enough. (The main alternative I can think of is that we'd expose a separate "what is my site relation with the top-level" header, but I'm not convinced that carries its weight.)
I think this is working as intended, but the examples on https://developer.chrome.com/en/docs/privacy-sandbox/storage-partitioning/ don't really fit my example so I'll outline my use case and you can let me know if this is intended. TL;DR; - attempting to retrieve an auth token from our auth site returns undefined when trying to access localStorage from an iframe from auth site
localStorage.setItem('token', 'a.a.a')
localStorage.getItem('token')
If third party storage partitioning is off, then app.example.com receives the token correctly in Step 7.
In practice, both the auth.example.com and app.example.com are on the same domain so we don't actually run into this problem (the token is found correctly in Step 4). However, when developing locally we use "localhost" for "app.example.com" and it is in local development where this issue is happening. Has any consideration been given to exclude "localhost" from these rules?
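To make the behavior concrete, here is a toy model of the scenario, assuming storage is keyed on (top-level site, origin) when partitioning is on and on origin alone when it is off. `PartitionedStorage` and `scenario` are hypothetical names for illustration only and are not browser APIs.

```javascript
// Simulation only: a toy model of third-party storage partitioning.
class PartitionedStorage {
  constructor(partitioningEnabled) {
    this.partitioningEnabled = partitioningEnabled;
    this.data = new Map();
  }

  // With partitioning on, the top-level site becomes part of the key.
  key(topLevelSite, origin, name) {
    const partition = this.partitioningEnabled ? topLevelSite : '';
    return JSON.stringify([partition, origin, name]);
  }

  setItem(topLevelSite, origin, name, value) {
    this.data.set(this.key(topLevelSite, origin, name), String(value));
  }

  getItem(topLevelSite, origin, name) {
    const value = this.data.get(this.key(topLevelSite, origin, name));
    return value === undefined ? null : value;
  }
}

const AUTH_ORIGIN = 'https://auth.example.com';
const AUTH_SITE = 'https://example.com';

// The user logs in on the auth site (top-level), then the app embeds an
// auth-site iframe that tries to read the token back.
function scenario(partitioningEnabled, appTopLevelSite) {
  const storage = new PartitionedStorage(partitioningEnabled);
  storage.setItem(AUTH_SITE, AUTH_ORIGIN, 'token', 'a.a.a');
  return storage.getItem(appTopLevelSite, AUTH_ORIGIN, 'token');
}

console.log(scenario(true, 'https://example.com')); // app on the same site as auth
console.log(scenario(true, 'http://localhost'));    // local development
console.log(scenario(false, 'http://localhost'));   // partitioning switched off
```

In this model only the localhost case with partitioning enabled fails to find the token, which matches the report: the iframe lands in a different partition because the top-level site is localhost rather than example.com.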
Currently there is an affordance in place for extensions so that they can embed frames with web origins in extension pages, which will then be treated as first-party. (Reference)
The current affordance however requires an extension to have host_permissions over the web origin.
If the web origin belongs to the extension author, in most cases it wouldn't need or request host permissions since it can directly communicate with the page using sendMessage, having declared it as externally_connectable in its manifest.
Having minimal permissions in this case harms the experience since the scenario doesn't fit into the current affordance.
Q: Can we consider extending the affordance to consider frames first party on extension pages if the extension has the embedded webpage origin declared as externally_connectable in its manifest?
Storage partitioning will break some web-platform-tests; we should take steps to fix the affected tests.
The first one I've run into is setcookie-lax.https.html.
Back when WebKit considered whether or not to implement Clear-Site-Data, we noted that clearing partitioned data upon receiving that header can be used for cross-site tracking purposes. Since not many others were considering partitioned storage at the time, we never filed issues about it, at least not that I'm aware of.
The attack is about one first party site having control over website data under another first party site.
Imagine site.example registering these 33 domains: haveSetPartitionedData.example and bucket1.example through bucket32.example.
site.example runs script in the first party context on a great many websites. As part of its execution on those sites, it injects 33 invisible iframes for the domains mentioned above.
Let's say site.example is executing its script on news.example. If a cross-site user ID has not yet been planted for news.example, the haveSetPartitionedData.example iframe will not have website data yet and communicates to the bucket1.example through bucket32.example iframes to start fresh. The bucket1.example through bucket32.example iframes all store '1' in their partitioned storage and report back to the haveSetPartitionedData.example iframe when they are done. Now the haveSetPartitionedData.example iframe stores the fact that 32 '1's have been stored in the news.example partition.
Every time the user visits site.example, site.example gets to see its unpartitioned cookies, which identify the user. Let's say it uses a 32-bit ID for the user. It now makes sure to send Clear-Site-Data response headers matching the '0's in the unpartitioned cookie ID for the corresponding bucket domains. For example, let's say the user ID has '0's in bits 4, 6, and 20. Then site.example would make sure website data is cleared for bucket4.example, bucket6.example, and bucket20.example.
Now when the user visits news.example, the haveSetPartitionedData.example's iframe will have website data set and communicates to the bucket1.example through bucket32.example iframes to report their '1's and '0's (no website data means '0') to the site.example script on news.example.
Voilà, cross-site user ID established.
Only accepting Clear-Site-Data from the current first party website would mitigate this attack but not fix it. Further, if this attack is combined with browser/device fingerprinting, it only needs to add enough cross-site bits to reach ≈32 bits in total.
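The bit bookkeeping in the scenario above can be sketched as plain arithmetic. This only models which buckets get cleared and how the remaining ones map back to an ID; the function names and the bit ordering (bit 1 is the least significant bit) are assumptions made for illustration.

```javascript
const BITS = 32;

// Bucket domains whose partitioned data would be cleared: one per '0' bit.
// Assumption: bit n (1-based, least significant first) maps to bucket<n>.example.
function bucketsToClear(userId) {
  const domains = [];
  for (let bit = 1; bit <= BITS; bit++) {
    const isSet = ((userId >>> (bit - 1)) & 1) === 1;
    if (!isSet) domains.push(`bucket${bit}.example`);
  }
  return domains;
}

// Rebuild the ID from the set of buckets that still hold website data
// (data present means '1', no data means '0').
function readUserId(bucketsWithData) {
  let id = 0;
  for (let bit = 1; bit <= BITS; bit++) {
    if (bucketsWithData.has(`bucket${bit}.example`)) {
      id = (id | (1 << (bit - 1))) >>> 0; // keep the value unsigned
    }
  }
  return id;
}

// Example from the text: an ID with '0's in bits 4, 6, and 20.
const userId = ~((1 << 3) | (1 << 5) | (1 << 19)) >>> 0;
const cleared = bucketsToClear(userId);

// Start with all 32 buckets holding '1', then apply the clearing.
const remaining = new Set();
for (let bit = 1; bit <= BITS; bit++) remaining.add(`bucket${bit}.example`);
for (const domain of cleared) remaining.delete(domain);

console.log(cleared);
console.log(readUserId(remaining) === userId);
```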
@krgovind at the last Privacy CG call you floated an idea around popups. Whereby you could open a popup and get a handle to it, but the popup would end up being partitioned in some way. I was wondering how serious that idea was as there are other proposals around popup handling and I wonder to what extent they should be pursued jointly.
cc @hemeryar
EDIT: We published an explainer expanding on this idea: https://github.com/DCtheTall/CHIPS/
During the CG meeting today, the topic of partitioning cookies came up.
@annevk mentioned that Firefox is currently experimenting with this. Also see his previous comment.
@johnwilander previously wrote that Safari attempted this change and rolled it back due to a couple of concerns that are broadly relevant:
Both of these issues might be alleviated by using an opt-in model for partitioned cookies.
One potential solution is to have the developer specify a cookie attribute PerPartition (name needs bikeshedding), that is parsed in embedded/third-party contexts:
Set-Cookie: SID=31d4d96e407aad42; Secure; HttpOnly; PerPartition
The browser then stores that cookie in a partition keyed on (top-level-site, embedded-site).
Subsequently, when the browser makes a request to the embedded site, it includes a cookie header with only the opted-in cookies and a header to indicate the top-level site:
Cookie: SID=31d4d96e407aad42
Sec-TopLevelSite: https://toplevel.site
Note: The question of whether it is acceptable to expose the first-party to a partitioned third-party is being explored in #14
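As a rough illustration of the opt-in model, here is a toy cookie jar keyed on (top-level-site, embedded-site). `PerPartition` and `Sec-TopLevelSite` are the placeholder names from the proposal above; the class, its method names, the example embedded site, and the choice to drop non-opted-in cookies are assumptions of this sketch, not specified behavior.

```javascript
// Toy model of the proposed opt-in partitioned cookie jar.
class PartitionedCookieJar {
  constructor() {
    this.jar = new Map(); // partition key -> Map(cookie name -> value)
  }

  // Handle a Set-Cookie header received in an embedded (third-party) context.
  // Returns true if the cookie was stored.
  setCookie(topLevelSite, embeddedSite, setCookieHeader) {
    const [pair, ...attrs] = setCookieHeader.split(';').map((s) => s.trim());
    const optedIn = attrs.some((a) => a.toLowerCase() === 'perpartition');
    if (!optedIn) return false; // assumption: cookies without the opt-in are not stored

    const eq = pair.indexOf('=');
    const name = pair.slice(0, eq);
    const value = pair.slice(eq + 1);
    const key = JSON.stringify([topLevelSite, embeddedSite]);
    if (!this.jar.has(key)) this.jar.set(key, new Map());
    this.jar.get(key).set(name, value);
    return true;
  }

  // Headers attached to a later request to the embedded site.
  requestHeaders(topLevelSite, embeddedSite) {
    const cookies = this.jar.get(JSON.stringify([topLevelSite, embeddedSite]));
    if (!cookies || cookies.size === 0) return {};
    const cookie = [...cookies].map(([n, v]) => `${n}=${v}`).join('; ');
    return { Cookie: cookie, 'Sec-TopLevelSite': topLevelSite };
  }
}

const jar = new PartitionedCookieJar();
jar.setCookie(
  'https://toplevel.site',
  'https://embedded.example',
  'SID=31d4d96e407aad42; Secure; HttpOnly; PerPartition'
);
console.log(jar.requestHeaders('https://toplevel.site', 'https://embedded.example'));
console.log(jar.requestHeaders('https://other.site', 'https://embedded.example'));
```

The same embedded site under a different top-level site gets an empty partition, so no cookie and no top-level header are sent there.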
During the CG meeting there was a question whether the first party location should be exposed to third parties (both via HTTP and JavaScript). And some agreement that it might make sense, modulo referrer policy.
Chrome now blocks third-party storage in incognito mode.
I believe Firefox blocks third-party storage for sites on the tracking list.
I don't know what Safari does today.
I don't know what Edge does today.
It's obviously much easier to simply throw on third-party storage access and then fill in unpartitioned storage once requestStorageAccess resolves. Do we have good reasons not to simply do that? Or perhaps we could provide a single partitioned storage mechanism, but not all of them.
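From script, the "throw first, unlock later" option might behave roughly like the wrapper below. `createGatedStorage` and the in-memory Map standing in for unpartitioned storage are hypothetical; the only real API assumed is `document.requestStorageAccess()`, which returns a promise and in browsers generally needs a user gesture.

```javascript
// Sketch: storage that throws until storage access has been granted.
// `doc` is expected to behave like a browser Document; `unpartitioned` is a
// Map standing in for the first-party (unpartitioned) storage.
function createGatedStorage(doc, unpartitioned) {
  let granted = false;

  const guard = () => {
    if (!granted) throw new Error('SecurityError: storage access not granted');
  };

  return {
    // Resolves once access is granted; rejects if the request is denied.
    async requestAccess() {
      await doc.requestStorageAccess();
      granted = true;
    },
    getItem(name) {
      guard();
      return unpartitioned.has(name) ? unpartitioned.get(name) : null;
    },
    setItem(name, value) {
      guard();
      unpartitioned.set(name, String(value));
    },
  };
}
```

The trade-off this models is that a third-party frame never sees a separate partitioned store: it either has nothing or, after the grant, the same data it would see as a first party.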
We should do it in such a way that end users can still open them in the address bar though. And perhaps they should force COOP.
As discussed recently, there are various properties of the SameSite cookie attribute that need to be evaluated for how they would work in a partitioned world without third party cookies. A probably incomplete list of things I've seen mentioned:
Underlying is the question of what the SameSite attribute itself should look like in the future. We could, for example, decide to deprecate the attribute entirely and use alternative attributes to preserve aforementioned security-related use cases with more granular control.
Our web application has a nested document structure, A1->B->A2.
<html>
<body>
A1
<iframe src="tableau cloud URL">
B
<iframe src="Same domain as A1">
A2
</iframe>
</iframe>
</body>
</html>
A1 and A2 are contents we created on AWS, and they are within the same domain. We use AWS Cognito for user authentication and store access tokens in the browser's session storage in A1.
B is a page on Tableau's cloud.
A2 is an HTML page from AWS embedded in B, and it calls the REST API we provide on AWS using JavaScript.
In this call, we put the access token that A1 saved in session storage into the Authorization header.
With storage partitioning enabled, A2 cannot read the access token from session storage, so the REST API can no longer be called from A2.
Authentication using AWS Cognito and saving to the session storage are done using libraries provided by AWS, and the display in B or A2 uses features provided by Tableau, so the only part we can program is within the JavaScript in A2.
Could you please provide a way in the JavaScript within A2 to reference the session storage saved in A1?
For a site the user has added as a registered protocol handler for a safelisted scheme or web+ custom protocol, storage partitioning, if it separates the handler site from its main storage (e.g. IndexedDB), will break the use case of loading a registered protocol as an iframe's src to establish a protocol-based app-to-app API channel.
To understand the use case, consider an example, web+wallet, wherein a user has added a site as their web+wallet handler. The web+wallet community ships a small lib to create a frame that loads the web+wallet protocol in an iframe's src, allowing the top-level site to interact with whatever site a user has installed as their web+wallet handler, via the postMessage API conduit. It is important we not break this functionality for frames loaded with custom protocol handler pages, as this is the only means installed handlers have to provide a background process/API channel to sites that integrate support for them.
Recommendation: because registering a protocol handler already requires an explicit top-level visit to the domain of the registered site + the direct, overt, explicit user choice to install a site as a handler, custom protocol frames should be exempt from partitioning.
As per the ongoing discussion on PSL in privacycg/private-click-measurement#78, it's become apparent that a domain present on the PSL can still be loaded within a browser. This has been tested across Safari, Chrome, Firefox with consistent results - the PSL domain will load and be rendered in the browser.
The example referenced in the other issue is http://gov.au, which is on the PSL and is a static holding page for the Australian government. You'll note that the browser will load this page and cookies can successfully be set for this domain, potentially causing scoping issues for subdomains that should probably be treated independently of the parent domain.
This is a security issue, especially when many of the proposals like the linked one rely on cookie separation as part of the set of privacy guarantees.
We should discuss how to resolve this.
Currently, it is possible for standard browser navigations and JavaScript based fetch/XHR calls to hit the same HTTP cache. This may not be true as caches are partitioned further in the future. However, there are advantages to allowing these network requests to hit the same cache, in particular for same-origin applications.
Implementation wise, this can be utilized if the website sends responses in a "polyglot response" format that is both well-formed HTML, and parseable by JavaScript to extract HTML chunks, or extract structured data to utilize with client-side rendering.
Formalize circumstances under which the HTTP cache is shared with fetch/XHR for same-origin requests, even as further partitioning occurs. This may be automatic by convention, or may require specific parameters opting in to the behavior for fetch/XHR to utilize the HTTP cache instead of a separate cache. For example, using { mode: 'same-origin' } for fetches.
Consider a hybrid app that prerenders the initial page and sends complete HTML to the client, then loads subsequent pages using JavaScript, fetch/XHR, and the History API. Each page contains dynamic content, e.g. the results of a search query.
First, the client initiates a client-side navigation to a new page which destroys the initial page content. Next, the user hits the back button to return to the initial page. A new network request must be initiated from JavaScript to fetch the dynamic content from the server.
If the fetch/XHR request can call the initial URL, hit the HTTP cache, and extract the needed dynamic content, it can avoid this network request, server computation, and added latency. This is similar to a fully SSR experience - a back button in this scenario would hit the HTTP cache from disk without an additional network request.
While there are other solutions to this like storing the initial page content in memory or other storage like SessionStorage, these have their own downsides, and also do not help with the reverse situation:
Consider the same hybrid app. Now, the user navigates client-side one or more times, then clicks a link to an external site. The user then hits the back button to return to the hybrid app, and BFCache misses. In a fully SSR app, the back navigation would again instantly restore the page from disk cache without a network request. In the hybrid case, we will experience a cache miss since the URL was originally fetched via JavaScript, and is now fetched via a browser navigation.
However, if we are using the polyglot response approach and fetched/XHRed the same URL that was pushed to the history stack, it will already be in the HTTP cache and the back navigation will be performed instantaneously without an additional network request.
Even utilizing custom caching in-memory or with other storage, this case can't be solved for browser-based navigations without an additional network request.
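A minimal sketch of the cache-first lookup described above, assuming the page and the fetch target share an origin. `fetchCacheFirst` is a hypothetical helper; the `mode: 'same-origin'` and `cache: 'only-if-cached'` options are standard fetch options. How a cache miss is reported can vary (a rejected promise or a non-OK response), so both are treated as a miss here.

```javascript
// Try to satisfy a same-origin request from the HTTP cache, and only go to
// the network on a miss. `fetchImpl` is injectable so the sketch can be
// exercised outside a browser.
async function fetchCacheFirst(url, fetchImpl = fetch) {
  try {
    // 'only-if-cached' is only allowed together with mode 'same-origin'.
    const cached = await fetchImpl(url, {
      mode: 'same-origin',
      cache: 'only-if-cached',
    });
    if (cached.ok) return { response: cached, fromCache: true };
  } catch {
    // Treated as a cache miss; fall through to the network.
  }
  const response = await fetchImpl(url, { mode: 'same-origin' });
  return { response, fromCache: false };
}
```

With the polyglot response approach, the caller would then extract the HTML chunk or structured data it needs from `response`, regardless of whether it came from the cache or the network.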
Below are example flows performed in desktop Chrome showing how this works today.
await fetch('https://www.google.com/search?q=test+query+1', {mode: 'same-origin', cache: 'only-if-cached'})
... document and has a size, meaning it downloaded from the server.
await fetch('https://www.google.com/search?q=test+query+1', {mode: 'same-origin', cache: 'only-if-cached'})
... fetch loaded from (disk cache).
... (disk cache).
... (disk cache).
... Empty Cache and Hard Reload.
... Empty Cache and Hard Reload again.
await fetch('https://www.google.com/search?q=test+query+2')
Currently service workers have poor SameSite cookie protections because their "site for cookies" is simply set to the origin:
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis#section-5.2.2.2
In contrast, documents take into account the top-level-site and the ancestor chain when computing "site for cookies":
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-rfc6265bis#section-5.2.1
This is problematic because it means adding a service worker to a site can reduce the safety of SameSite cookies.
With storage partitioning we have the opportunity to fix this. We already plan to include top-level site in the storage key which will allow us to include it in the "site for cookies" computation for service worker. We lack any ancestor chain information, however.
The ancestor chain is important for "site for cookies" because it helps protect against clickjacking attacks. To extend this protection to service workers we propose:
Include a "cross-site ancestor chain" bit in the storage key. This bit would be true if there are any sites between the current context and the top-level context that are cross-site to the current context. So it would be true for A -> B -> C or A -> B -> A. It would be false for A -> A or A -> B.
With this bit in the storage key it would permit us to compute a "site for cookies" value for service workers that is equivalent to any document controlled by that service worker.
This was discussed at the recent service worker virtual F2F: w3c/ServiceWorker#1604.
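The proposed bit can be sketched as a small predicate over the ancestor chain. `crossSiteAncestorBit` is a hypothetical name, and the chain is assumed to list the site of each document from the top-level context down to the current context; following the wording above, only the sites strictly between the two are examined.

```javascript
// Returns true if any site between the top-level context and the current
// context is cross-site to the current context.
function crossSiteAncestorBit(chain) {
  const current = chain[chain.length - 1];
  const between = chain.slice(1, -1); // excludes top-level and current
  return between.some((site) => site !== current);
}

// The four cases listed in the proposal.
console.log(crossSiteAncestorBit(['A', 'B', 'C'])); // true
console.log(crossSiteAncestorBit(['A', 'B', 'A'])); // true
console.log(crossSiteAncestorBit(['A', 'A']));      // false
console.log(crossSiteAncestorBit(['A', 'B']));      // false
```

The A -> B case stays false because the top-level site is already part of the storage key, so the bit only needs to capture what sits in between.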
Still available in Chromium:
https://en.wikipedia.org/wiki/Web_SQL_Database
If a website uses iframes that are not same-origin but are still controlled by the same authority, wouldn't it make sense to have a way to disable storage partitioning?
Not having any way to disable it forces you to ask for a user gesture on the iframe before that iframe can access APIs like a service worker.
It would work in a similar way to CSP or CORS, explicitly defining the domains that the embedder and the embedded page accept.
I've not seen anything regarding a way to disable partitioning after looking through the issues and docs.
This issue ties into the Storage Access API, but is also relevant here as it has implications for the architecture of the affected pieces of state.
Firefox has an implementation where Cookies and all of Storage go between having additional keying and not having additional keying depending on the Storage Access API.
Safari allows Cookies to go between blocked and not having additional keying (i.e., first-party access).
I'm not aware what Chrome is planning here.
Here's a list of additional things that are isolated by privacy.firstparty.isolate in Firefox and Tor Browser:
Is there an approach to programmatically check if storage partitioning is enabled or disabled now?
Assume a document nesting scenario of A1 -> B -> A2 whereby A1 and A2 are same-origin with each other and cross-site with B. In the real world this sometimes materializes as a publisher embedding an ad distributor that then decides to display an ad from the publisher.
As discussed in #25 and elsewhere it's generally considered good practice for A2 to be severed from A1 to avoid confused deputy attacks, which is why browsers are considering adding the "has cross-origin ancestors" bit to the partitioning key.
Now unlike other state, cookies have the unique ability to indicate these confused deputy attacks are defended against through the usage of SameSite=None. As such, the argument has been made that sending unpartitioned cookies to A2 is okay, as long as they use SameSite=None.
This creates some weirdness in that from a theoretical perspective B and A2 should not really be any different in terms of their relationship with A1. As in, both of them are partitioned. However, given the existing use cases and the unique ability of cookies to indicate confused deputy attacks were considered upon creation (to be clear, I somewhat doubt web developers consider that in detail, they also just want things to work) it might be acceptable to privilege A2.
Alternatives:
SameSite=None does not have special privileges that allow it to ignore the "has cross-origin ancestors" bit when setting the cookie. (This would be my personal preference as this kind of logic where we only look at part of the total key seems rather scary.)
SameSite=None already indicates a disregard for security.
(We discussed this scenario as part of privacycg/meetings#19.)
The storage of a top-level frame is keyed by just its origin, while storage for a subframe is keyed by at least its own origin and the top-level origin. Intuitively, we often talk about the situation of being keyed by just one origin as having access to "first-party storage", but that's not really defined anywhere, and I don't know of shared terminology for subframes' keying situation.
This explainer should say how other specifications should describe the various situations. It should probably also eventually define ways for other specifications to define the storage access of their own environment settings objects, but that seems farther away.
I think there's roughly two definitions of third party that are important for the web platform:
Potential usage in prose if we want to formalize these as terms rather than using the longer phrase: If settingsObject has a third-party origin, then ...?
There's an interesting thing that @bakulf pointed out to me which is that cookies have their own definition of this concept and that considers the entire ancestor chain. So when example.com/1 embeds thirdparty.example and that embeds example.com/2, per the above definitions /2 would not have a third-party origin/site, but at the same time it would not get SameSite cookies.
This does not seem hugely problematic to me and I don't think we can/should really change either definition at this point, but it's worth keeping this in mind.
Mainly wanted to write this down here to ensure we actually have agreement on this as we often say third party without being concrete about it.
cc @clelland
From https://bugzilla.mozilla.org/show_bug.cgi?id=1495241#c1 (more context at https://privacycg.github.io/storage-partitioning/):
A problem with isolating service workers is that they are somewhat intrinsically linked to globalThis.caches aka the Cache API, a typical origin-scoped storage API. And that in turn is expected to be the same as localStorage or Indexed DB as sites might have interdependencies between the data they put in each.
Possible solutions:
Based on this I still favor 2, but 3.2 is also interesting.
cc @andrewsutherland @jakearchibald @inexorabletash @jkarlin @johnwilander
This is related to #7.
In particular if you allow the Storage category to have its keying relaxed, there's an argument to be made that BroadcastChannel and shared/service workers ought to be blocked rather than have additional keying as sites could end up in a state where they have both third-party and first-party BroadcastChannel, for instance. And they cannot really be told apart either other than the site knowing when it allocated them relative to its current Storage Access API state.
Note that it's not a good solution to let part of the Storage category have its keying relaxed and part of it not. Sites often use multiple storage APIs for various bookkeeping purposes. Making their data inconsistent with each other is bad news. Blocking on the other hand doesn't really have that problem and might even be doable given that BroadcastChannel and shared worker are not supported by Safari.
Effectively this is a variant of the issue with same-origin frames having synchronous communication access being able to end up in different states. (Though we made a decision there to not let that happen.)