Working out scenarios in detail
Let's say we have 2 paths, and each has a history of 3 documents (o):
/path1 o o o
/path2 o o o
           ^ heads: the rightmost column is the latest version in each path
We do a basic query that matches some of them...
/path1 x x o // "x" matched query, "o" did not
/path2 x o x
Here are ways we might want to filter further with a fancier query:
// [ ] means it was returned by the fancy query
/path1 x x o // A. only heads
/path2 x o [x]
/path1 x x o // B. only heads, then expand to entire history
/path2 [x] [o] [x]
/path1 x [x] o // C. latest individual match per path
/path2 x o [x]
/path1 [x] [x] o // D. all individual matches
/path2 [x] o [x]
/path1 [x] [x] [o] // E. every path that matched somehow. return all history.
/path2 [x] [o] [x]
/path1 x x [o] // F. every path that matched somehow. only return the head.
/path2 x o [x]
Use cases
Why do all these things?
Some apps will be doing custom conflict resolution so they'll generally want full histories. Some apps will just use the simple last-write-wins that's built into Earthstar.
Besides in-app searching and filtering, these might also be useful for sync queries.
Here are some examples for a wiki app, searching for pages authored by me:
- A: Pages where I'm the most recent editor (current version)
- B: Pages where I'm the most recent editor (plus full edit history)
- C: Edits I made (my most recent version per document)
- D: Edits I made (all my revisions)
- E: Every document I've touched (plus full edit history)
- F: Every document I've touched (current version only, even if not by me)
Which queries are fast vs slow?
Assuming we've already tagged the latest document with "isHead = true" in the database...
- A: ✅✅ document-wise + check isHead column
- B: ⌛ expand to all history (subquery or group-by?)
- C: ✅ group-by for latest, or iterate all matches & discard non-heads
- D: ✅✅✅ document-wise
- E: ⌛ expand to all history (subquery or group-by?)
- F: ⌛ expand to all history and group-by for latest
Note "head" means the latest overall, and "latest" means the latest of the matches.
The operations are:
- match: Do basic matching
- heads: Only keep heads?
- expand: Expand to all history?
- latest: Only keep latest doc in each path (might not be overall head)?
Expressed as those operations, the six query types are:
- A: ✅✅ match, heads
- B: ⌛ match, heads, expand
- C: ✅ match, latest
- D: ✅✅✅ match
- E: ⌛ match, expand
- F: ⌛ match, expand, heads (or match, find-heads-for-each-path)
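To make the composition concrete, here is a minimal in-memory sketch of the four operations and how A through F chain them. The `Doc` shape, the function names, and `runQuery` are all hypothetical and written only for illustration; this is not Earthstar's API and says nothing about how a real storage backend would index things.

```typescript
// Hypothetical in-memory document; not Earthstar's real document type.
interface Doc {
  path: string;
  author: string;
  timestamp: number;
}

type Mode = "A" | "B" | "C" | "D" | "E" | "F";

// Newest doc of each path among `docs` (paths kept in first-seen order).
function newestPerPath(docs: Doc[]): Doc[] {
  const newest = new Map<string, Doc>();
  for (const doc of docs) {
    const current = newest.get(doc.path);
    if (current === undefined || doc.timestamp > current.timestamp) {
      newest.set(doc.path, doc);
    }
  }
  return Array.from(newest.values());
}

// match: basic matching against every stored version.
const match = (all: Doc[], pred: (d: Doc) => boolean): Doc[] => all.filter(pred);

// heads: keep only docs that are the overall head of their path.
function heads(all: Doc[], docs: Doc[]): Doc[] {
  const headSet = new Set(newestPerPath(all));
  return docs.filter((d) => headSet.has(d));
}

// expand: return the full history of every path that appears in `docs`.
function expand(all: Doc[], docs: Doc[]): Doc[] {
  const paths = new Set(docs.map((d) => d.path));
  return all.filter((d) => paths.has(d.path));
}

// latest: newest doc per path among `docs` (might not be the overall head).
const latest = (docs: Doc[]): Doc[] => newestPerPath(docs);

function runQuery(mode: Mode, all: Doc[], pred: (d: Doc) => boolean): Doc[] {
  const matched = match(all, pred);
  switch (mode) {
    case "A": return heads(all, matched);
    case "B": return expand(all, heads(all, matched));
    case "C": return latest(matched);
    case "D": return matched;
    case "E": return expand(all, matched);
    case "F": return heads(all, expand(all, matched));
  }
}

// The example from the top of the thread: "x" is written by "me", "o" by "other".
const mk = (path: string, authors: string[]): Doc[] =>
  authors.map((author, i) => ({ path, author, timestamp: i + 1 }));
const allDocs = mk("/path1", ["me", "me", "other"]).concat(
  mk("/path2", ["me", "other", "me"]),
);
const byMe = (d: Doc) => d.author === "me";
const show = (docs: Doc[]) => docs.map((d) => `${d.path}@${d.timestamp}`);

console.log(show(runQuery("C", allDocs, byMe))); // ["/path1@2", "/path2@3"]
console.log(show(runQuery("F", allDocs, byMe))); // ["/path1@3", "/path2@3"]
```

The sketch scans every document for each step, so it says nothing about the fast/slow ratings above; those depend on what the database can do with an isHead column or a group-by.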
Turning that into query parameters
We could have a query parameter like this:
// in same order as above
historyMode:
'matching-heads'
| 'matching-heads-plus-all-history'
| 'latest-matching-versions'
| 'matching-versions'
| 'matching-versions-plus-all-history'
| 'any-heads-that-have-matches-in-history'
Or would it be better to break this into 2 or 3 separate query parameters?
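One possible way to split it, purely as a sketch: one parameter for which matching versions select a path, and one for what gets returned for each selected path. The names `select` and `include` and their values are made up here for illustration and are not part of any Earthstar release.

```typescript
type HistoryMode =
  | "matching-heads"
  | "matching-heads-plus-all-history"
  | "latest-matching-versions"
  | "matching-versions"
  | "matching-versions-plus-all-history"
  | "any-heads-that-have-matches-in-history";

// Hypothetical two-parameter form of historyMode.
interface SplitParams {
  // Which matching versions select a path: only heads, the latest match, or any match.
  select: "heads" | "latest" | "all";
  // What to return for each selected path: the selected versions themselves,
  // the path's full history, or only the path's head.
  include: "matches" | "history" | "heads";
}

// Same order as the historyMode list above (A through F).
const SPLIT: Record<HistoryMode, SplitParams> = {
  "matching-heads": { select: "heads", include: "matches" },
  "matching-heads-plus-all-history": { select: "heads", include: "history" },
  "latest-matching-versions": { select: "latest", include: "matches" },
  "matching-versions": { select: "all", include: "matches" },
  "matching-versions-plus-all-history": { select: "all", include: "history" },
  "any-heads-that-have-matches-in-history": { select: "all", include: "heads" },
};
```

A downside of this split is that the two parameters allow nine combinations but only six distinct results. For example, `select: "latest"` with `include: "history"` selects the same paths as `select: "all"`, so it returns the same thing as E.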
from earthstar.
Besides authors, we can do other operations on the set of document versions for a given path. For example, timestamps:
Timestamps
/path/1: m n o // three versions
/path/2: p q r
- To get o and r, the most recent edit of each doc, use query type A (match, heads).
- To get m and p, the oldest edit of each doc (creation time), use a variant of query type C that keeps the oldest match per path instead of the latest (match, oldest).
In the beta branch, v6, I decided what to do: asking for "latest" docs happens FIRST, and then filters are applied only to the results.
Using the language from previous comments above:
- { history: 'latest' } is type A -- get latest docs first, then apply filters to those.
- { history: 'all' } is type D -- all individual matching documents regardless of location in history.
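The order of operations can be sketched in memory like this. It is only an illustration of the "latest first, then filter" rule: `SketchDoc`, `SketchQuery`, and `queryDocs` are hypothetical names, and only the `history` and `author` fields mirror the real query object.

```typescript
// Hypothetical stand-ins for illustration; not the beta branch's real types.
interface SketchDoc {
  path: string;
  author: string;
  timestamp: number;
}

interface SketchQuery {
  history?: "latest" | "all"; // defaults to "latest"
  author?: string;
}

function queryDocs(docs: SketchDoc[], query: SketchQuery = {}): SketchDoc[] {
  let candidates = docs;
  if ((query.history ?? "latest") === "latest") {
    // Step 1: pick the latest doc per path BEFORE looking at any filter.
    const newest = new Map<string, SketchDoc>();
    for (const doc of docs) {
      const current = newest.get(doc.path);
      if (current === undefined || doc.timestamp > current.timestamp) {
        newest.set(doc.path, doc);
      }
    }
    candidates = Array.from(newest.values());
  }
  // Step 2: apply the filters to whatever survived step 1.
  return candidates.filter(
    (d) => query.author === undefined || d.author === query.author,
  );
}

// Same layout as the earlier diagrams: "me" wrote the x versions.
const sampleDocs: SketchDoc[] = [
  { path: "/path1", author: "me", timestamp: 1 },
  { path: "/path1", author: "me", timestamp: 2 },
  { path: "/path1", author: "other", timestamp: 3 },
  { path: "/path2", author: "me", timestamp: 1 },
  { path: "/path2", author: "other", timestamp: 2 },
  { path: "/path2", author: "me", timestamp: 3 },
];

// Type A: only /path2's head is by "me"; /path1 drops out entirely.
console.log(queryDocs(sampleDocs, { author: "me" }).length); // 1
// Type D: every version by "me", wherever it sits in history.
console.log(queryDocs(sampleDocs, { history: "all", author: "me" }).length); // 4
```

Note that with `history: 'latest'` the /path1 edits by "me" are not returned at all, because the head of /path1 is by someone else. That is the difference from type C, which would have returned the timestamp-2 version.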
Comments from the beta source code for further details:
https://github.com/earthstar-project/earthstar/blob/beta/src/storage/query.ts#L48-L57
/**
* Query objects describe how to query a Storage instance for documents.
*
* An empty query object returns all latest documents.
* Each of the following properties adds an additional filter,
* narrowing down the results further.
* The exception is that history = 'latest' by default;
* set it to 'all' to include old history documents also.
*/
export interface Query {
/**
* Document author.
*
* With history:'latest' this only returns documents for which
* this author is the latest author.
*
* With history:'all' this returns all documents by this author,
* even if those documents are not the latest ones anymore.
*/
author?: AuthorAddress,
https://github.com/earthstar-project/earthstar/blob/beta/src/storage/storageSqlite.ts#L272-L299
* If query.history === 'all', we can do an easy query:
*
* ```
* SELECT * from DOCS
* WHERE path = "/abc"
* AND timestamp > 123
* ORDER BY path ASC, author ASC
* LIMIT 123
* ```
*
* If query.history === 'latest', we have to do something more complicated.
* We don't want to filter out some docs, and THEN get the latest REMAINING
* docs in each path.
* We want to first get the latest doc per path, THEN filter those.
*
* ```
* SELECT *, MAX(timestamp) from DOCS
* -- first level of filtering happens before we choose the latest doc.
* -- here we can only do things that are the same for all docs in a path.
* WHERE path = "/abc"
* -- now group by path and keep the newest one
* GROUP BY path
* -- finally, second level of filtering happens AFTER we choose the latest doc.
* -- these are things that can differ for docs within a path
* HAVING timestamp > 123
* ORDER BY path ASC, author ASC
* LIMIT 123
* ```