usegraffy / graffy
Live queries for graph-shaped data
Home Page: https://graffy.org
License: Apache License 2.0
Create a special "filtering link" which, when traversed, modifies the keys immediately under the link to add filtering parameters.
Consider the schema
{
posts: { [pid]: Post },
posts$$createdAt: { [filter]: { [createdAt]: link(`/posts/${pid}`) } },
users: { [uid]: User }
}
where the posts$$createdAt index can be filtered by authorId and tag.
Imagine we want to query the last 3 posts of a user, with a particular tag, alongside their name. While this is already possible with a query containing both a users branch and a posts$$createdAt branch, such a query would be unintuitive, duplicate userIds, and require additional post-processing of results.
Ideally this query should work:
# Query
{
users: { '123': {
name: 1,
posts: { [key({ tag: 'example' })]: [{ last: 3 }, {
title: 1, createdAt: 1
}] }
} }
}
and Graffy should send the following query to the posts$$createdAt provider:
{ [key({ tag: 'example', authorId: '123' })]: {
title: 1, createdAt: 1
} }
This could be done if the user provider returned a "filtering link" for the posts property:
{
name: 'Example',
posts: link(['posts$$createdAt', { authorId: '123' }])
}
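One way the traversal of such a filtering link might work is sketched below. This is illustrative only: JSON.parse/JSON.stringify stand in for Graffy's actual key encoding and decoding, and applyFilteringLink is a hypothetical name, not an existing API.

```javascript
// Hypothetical sketch: when a filtering link is traversed, merge the
// link's parameters (e.g. { authorId: '123' }) into each key of the
// query subtree under the link. JSON is a stand-in for Graffy's real
// key encoding.
function applyFilteringLink(linkParams, subQuery) {
  const filtered = {};
  for (const [key, value] of Object.entries(subQuery)) {
    const keyParams = JSON.parse(key); // stand-in for decoding key()
    filtered[JSON.stringify({ ...keyParams, ...linkParams })] = value;
  }
  return filtered;
}

// A query under the link keyed by { tag: 'example' } becomes a query
// on the index keyed by { tag: 'example', authorId: '123' }.
const rewritten = applyFilteringLink(
  { authorId: '123' },
  { '{"tag":"example"}': { title: 1, createdAt: 1 } },
);
```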
It's more useful, more succinct, and protects the user from having to deal with \0 and \uffff characters.
The change should happen in decorate.js
Instead of:
arr.pageInfo = {
hasNext: false,
hasPrev: true,
start: '',
end: 'foobaq\uffff'
}
we should have:
arr.prevRange = null;
arr.nextRange = { first: 10, after: 'foobar' };
The null prevRange indicates that this is the first page. The first / last should match the current page size, and before / after should use the keyAfter / keyBefore helpers from @graffy/common.
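The proposed shape could look something like the following sketch. How the boundary keys are derived (via keyAfter / keyBefore from @graffy/common) is left out; they are passed in already computed, and the function name is illustrative, not the actual decorate.js code.

```javascript
// Minimal sketch of the proposed decorate.js output shape.
// A null prevRange marks the first page; first / last match the
// current page size.
function pageRanges({ hasPrev, hasNext, firstKey, lastKey, size }) {
  return {
    prevRange: hasPrev ? { last: size, before: firstKey } : null,
    nextRange: hasNext ? { first: size, after: lastKey } : null,
  };
}

const ranges = pageRanges({
  hasPrev: false,
  hasNext: true,
  lastKey: 'foobar',
  size: 10,
});
// First page: prevRange is null, nextRange asks for the next 10 items.
```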
Currently, every node has a version value. In practice, in most (but not all) graphs and queries, all nodes have the same version. There is some redundancy here.
In subscription caches, we need to update the version number of the entire cache whenever there is a new update. With the current data structure, this takes O(size of cache) time, while other operations only take O(size of change).
It might be beneficial to rethink how version is stored and manipulated in the internal representation.
Version storage might be simplified by keeping only one writeVersion and one readVersion per tree, and packing multiple trees into "layers".
To stop this getting out of hand, queries are immutable and can only have one layer, so all parts of a query must have the same minimum version requirement. When merging queries, we can take the max(readVersion) and min(writeVersion) to ensure that the data required by all constituent queries is requested.
This is inextricably linked to #2.
Primary goal: Add typings to the published NPM modules.
Secondary goal: Get type checks into the development workflow for Graffy itself.
The preferred approach is to use JSDoc-style function annotations (which TypeScript supports) rather than converting to TypeScript syntax.
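For illustration, an annotation in this style might look like the following; the function itself is a made-up example, not a Graffy API. TypeScript can check files like this with checkJs enabled, without any conversion.

```javascript
// Example of a JSDoc-style annotation that TypeScript understands.
// `tsc --checkJs` can type-check this file while it remains plain JS.

/**
 * Split a slash-separated path into its segments.
 * @param {string} path - e.g. 'users/123'
 * @returns {string[]} the non-empty path segments
 */
function splitPath(path) {
  return path.split('/').filter(Boolean);
}
```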
The consumer APIs (read, write and watch) could gain a path argument to avoid having to implement aliases.
By and large, Graffy encourages granular queries; if a component has the sort of data need that requires aliases, it might be better served by just making two queries.
However using dynamic keys in queries comes with a bit of boilerplate that could be eliminated.
const postId = get_post_id_somewhere();
result = await gs.read({ posts: { [postId]: { ... } } });
const what_i_really_want = result.posts[postId];
It feels even worse when using filter parameters:
const filter = encodeKey({ tags: ['tech', 'javascript'] }); // This is some opaque string.
result = await gs.read({ filteredPostsByTime: { [filter]: [ { first: 10 }, {...} ] } });
const what_i_really_want = result.filteredPostsByTime[filter];
I have to store the encoded filter into a variable even though it has no meaning or use outside that query.
I feel that a better API might be:
const postId = get_post_id_somewhere();
const just_the_post = gs.read( ['posts', postId], { ... });
or with the filter:
const filteredPosts = gs.read([ 'filteredPostsByTime', encodeKey(...) ], [ { first: 10 }, { ... } ]);
What say?
In read/write/watch, we would wrap the query in the path before passing to .call(), and unwrap the results before returning.
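The wrap/unwrap step described above is mechanical; a minimal sketch (function names are illustrative, not the actual internals):

```javascript
// Sketch of the proposed path handling: nest the query under the path
// before passing it to the core, and dig the result back out before
// returning it to the caller.

function wrapQuery(path, query) {
  // ['posts', 'p1'] + { title: 1 }  →  { posts: { p1: { title: 1 } } }
  return path.reduceRight((inner, key) => ({ [key]: inner }), query);
}

function unwrapResult(path, result) {
  // Walk down the result along the path, tolerating missing nodes.
  return path.reduce((node, key) => node && node[key], result);
}

const path = ['posts', 'p1'];
const wrapped = wrapQuery(path, { title: 1 });
const unwrapped = unwrapResult(path, { posts: { p1: { title: 'Hi' } } });
```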
Consider a watch query:
{
users: [{
name: true,
email: true
}]
}
Currently there are two modes for this watch: "values" mode, where every response contains all users, and "raw" mode, where responses contain only the changes. A common use case calls for a "raw+" mode where you receive only the changed users, but each changed user includes both name and email, even if only one of them changed.
This is convenient for watching processes that would otherwise need to watch changes and then load every entity.
An index provider might be able to retrieve the necessary information at the link, not just its location, allowing the provider to do, for example:
store.onRead('/posts$', query => {
const posts = getPostsFromDb(query);
return _.fromPairs(posts.map(post => [
key([post.createdAt, post.id]),
link(`/posts/${post.id}`, post),
]));
})
Hello, it seems a js.org subdomain that was requested to target this repository no longer works. The subdomain requested was graffy.js.org and had the target of aravindet.github.io/graffy.
It produced the following failures when tested as part of the cleanup:
To keep the js.org subdomain you should add a page with reasonable content within a month so the subdomain passes the validation.
Failure to rectify the issues will result in the requested subdomain being removed from JS.ORG's DNS and the list of active subdomains.
If you want to keep the js.org subdomain and have added reasonable content, YOU MUST reply to the main cleanup issue with the response format detailed at the top.
🤖 Beep boop. I am a robot and performed this action automatically as part of the js.org cleanup process. If you have an issue, please contact the js.org maintainers.
This is a nice-to-have for 1.0.
As a prerequisite, we should add a soft convention for naming indexes, e.g. '$<index_name>'. Then:
GET /posts?by=time&first=10&fields=slug,title,at,authors(first:1,name,avatar)
should become:
{
'posts$time': [ { first: 10 }, {
slug: 1, title: 1, at: 1,
authors: [ { first: 1}, {
name: 1, avatar: 1
} ]
} ]
}
GET /posts/123?fields=slug,title,at,author(name,avatar)
should become:
{
'posts': { 123: {
slug: 1, title: 1, at: 1,
author: { name: 1, avatar: 1 }
} }
}
TL;DR: Replace watch() with incremental read() polling
The current implementation of watch() is complex to implement in providers and doesn't support back-pressure or resumption.
https://repeater.js.org/docs/repeater
This can be a replacement for @graffy/stream (which can then be deprecated) and mergeIterators. mapStream can also be replaced with an async generator.
In the subscription provider of the example mock visitor list, pushing the initial state (rather than undefined) should improve performance slightly by not requiring a separate get. However it looks like it reduces performance drastically.
Requires investigation.
When using a final-mode cache:
{ foo: { "1": "34" } }
the query:
{ foo: [ { first: 3 }, 1 ] }
returns
{ foo: [ null, "34" ] }
(roughly).
TL;DR: Some watch() providers may handle { after: '', before: 'b' } but not { first: 15 }. How do they communicate this?
Graffy providers often have limitations around what queries they can fulfil. They need to be able to signal these limitations, so graffy-fill can figure out ways to work around them.
Currently, we use some ad-hoc mechanisms to signal limitations. Perhaps we could design these in a more systematic way.
Consider the posts and users example. Let's say the posts resolver cannot fetch user data: if author info was requested, it ignores the nested fields and simply returns a link as the author field.
Graffy-fill makes a new (live) query for the linked data.
Imagine a subscription provider that can provide change streams but not the initial result (current state). It signals this by yielding undefined as the first value.
Graffy-fill makes a separate fetch to get the initial value.
Imagine a change stream provider pushing updates for users. Say it does not have access to the current state, but can access an event stream of user updates where each update specifies the user_id.
Say the query is for the first 30 users.
In a scenario where there are thousands of users, MOST user updates will be irrelevant for this query. However, there is no way for this provider to know that, because it cannot know the range of IDs that match "first 30".
Perhaps there should be a way for the provider to signal that it cannot serve "counted" pages (i.e. that use first / last parameters) but can serve "bounded" ones (i.e. those that only have before AND after, but no first / last).
Graffy fill could use the fetch results to convert a "counted" page into a "bounded" one.
NOTE: If the pagination happens in an "index" (nodes where all the children are links), it will work fine if the change stream provider ignores the bounded queries and just pretends there are no updates. However, it seems like this is working "by accident".
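The counted-to-bounded rewrite graffy-fill might perform could look something like this sketch. The function name and the exact choice of boundary key are assumptions, not the actual graffy-fill internals.

```javascript
// Sketch: once a fetch has returned the current page, rewrite a
// "counted" range ({ first: N }) into a "bounded" one using the
// fetched boundary keys, so a bounds-only change stream provider
// can serve it.
function toBoundedRange(countedRange, fetchedKeys) {
  if (!('first' in countedRange) && !('last' in countedRange)) {
    return countedRange; // already bounded
  }
  return {
    after: countedRange.after || '',
    // The last fetched key bounds the page; real code would likely
    // use the keyAfter helper here to keep that key inside the range.
    before: fetchedKeys[fetchedKeys.length - 1],
  };
}

const bounded = toBoundedRange({ first: 3 }, ['a', 'b', 'c']);
```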
@baopham Thread to discuss what sort of APIs the query object should have to make it easy for providers that might want to (1) construct a query, like SQL or ES, or (2) identify topics to subscribe to.
Say you want to write a provider for /users that needs to serve both queries like:
// 1
{ users: [ { first: 10 }, { name: true } ] }
// 2
{ users: { user_id_1: { email: true } } }
The provider might need to construct SQL queries:
# 1
SELECT name FROM users ORDER BY ID ASC LIMIT 10;
# 2
SELECT email FROM users WHERE id="user_id_1";
How would the "ideal" code to get from the query objects to the SQL look?
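As a starting point for the discussion, here is one possible shape, working directly on the porcelain objects; every name here is hypothetical, and real code must use parameterized queries rather than string interpolation.

```javascript
// Sketch: translate the two porcelain query shapes into SQL.
// An array [range, shape] is a paginated query; a plain object maps
// ids to shapes. WARNING: interpolating ids like this is
// SQL-injection-prone; use parameterized queries in real code.
function usersQueryToSql(usersQuery) {
  if (Array.isArray(usersQuery)) {
    const [range, shape] = usersQuery;
    const fields = Object.keys(shape).join(', ');
    return `SELECT ${fields} FROM users ORDER BY id ASC LIMIT ${range.first};`;
  }
  return Object.entries(usersQuery).map(([id, shape]) => {
    const fields = Object.keys(shape).join(', ');
    return `SELECT ${fields} FROM users WHERE id='${id}';`;
  });
}
```

With a richer query API, the Array.isArray and Object.keys plumbing here would ideally be replaced by purpose-built helpers.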
The pure JS "porcelain" query format currently in use is fairly verbose. This is a proposal to mitigate that with a Graffy query language. It aims to be similar enough to GraphQL to be familiar for those using it, but is not necessarily compatible with it.
Here is an example query:
{
books {
( tags: {foo, bar}, publishedUntil: '2000-01-01' ) [
( first: 10, after: ('1998-03-23', 4398) ) {
author {
name
photo
}
title
cover
description
}
]
}
}
which is equivalent to the current porcelain:
{
books: {
[key({
tags: {foo: true, bar: true},
publishedUntil: '2000-01-01',
})]: [
{
first: 10,
after: key(['1998-03-23', 4398]),
}, {
author: { name: true, photo: true },
title: true,
cover: true,
description: true,
}
]
}
}
The transformations (to the current porcelain structure) are quite straightforward:
- (foo: 1) becomes key({ foo: 1 })
- ('foo', 'bar') becomes key(['foo', 'bar'])
- { foo, bar } becomes { foo: true, bar: true }
- before, after etc. within [...] get collected into an object, and : are added as needed

Currently, graffy fill makes extra queries for subscriptions when resolving links. However, it does not clean those up when the link is updated.
This is currently planned to be fixed by extending slice() to return extraneous as well.
Currently, the device timestamp is used blindly. This is not resilient to timestamp decreasing (due to adjustments etc) and duplicate changes within 1ms.
We need to append a sequence number, remember the last used version, and use the last version with incremented sequence number if the timestamp is unchanged or has decreased.
This change should be made in the graph builder.
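A minimal sketch of such a generator, assuming the version is a number packing the timestamp and a sequence counter (the packing scheme and function name are illustrative):

```javascript
// Sketch of a monotonic version generator for the graph builder:
// remember the last timestamp, and when the clock stalls or goes
// backwards, reuse it with an incremented sequence number.
let lastTimestamp = 0;
let sequence = 0;

function nextVersion(now = Date.now()) {
  if (now > lastTimestamp) {
    lastTimestamp = now;
    sequence = 0;
  } else {
    sequence++; // clock unchanged or decreased: bump the sequence
  }
  // Pack timestamp and sequence; real code might use more bits, or a
  // string, to avoid overflowing Number.MAX_SAFE_INTEGER.
  return lastTimestamp * 1000 + sequence;
}

const v1 = nextVersion(1700000000000);
const v2 = nextVersion(1699999999999); // clock went backwards
// v2 is still greater than v1: same timestamp, sequence incremented.
```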