Comments (15)
Many good replies here but I will add my solution too:
const ensureOrder = options => {
const {
docs,
keys,
prop,
error = key => `Document does not exist (${key})`
} = options;
// Put documents (docs) into a map where key is a document's ID or some
// property (prop) of a document and value is a document.
const docsMap = new Map();
docs.forEach(doc => docsMap.set(doc[prop], doc));
// Loop through the keys and for each one retrieve proper document. For not
// existing documents generate an error.
return keys.map(key => {
return (
docsMap.get(key) ||
new Error(typeof error === "function" ? error(key) : error)
);
});
};
Usage
new DataLoader(async ids => {
const users = await getUsersByIds(ids);
return ensureOrder({
docs: users,
keys: ids,
prop: "id",
error: id => `User does not exist (${id})`
});
});
Field will be null
if there is no document matching key/ID.
You can also pass error function/string to generate custom error that will be sent back to the client in the errors
field.
from dataloader.
We've faced similar issues with sorting based on keys of multiple types. The risk of returning incorrect data for the wrong key is too high to not be bullet proof. We've built on the previous examples and built out a utility function with tests that may be useful to others. Keys can be in the form of strings, numbers, or objects.
https://github.com/reachifyio/dataloader-sort
We also write tests for all our our dataloaders just to be 100% sure.
from dataloader.
I do it in Javascript like this:
static findByKeys(keys) {
return db.select(...this.fields)
.from('table_name')
.whereIn('id', keys)
.then(rows => keys.map(k => rows.filter(r => r.id === k)[0]));
}
from dataloader.
What about using an object for the batch return: { [key:any]: any }
, that is, when you return you return a hash of values keyed by their ID.
Granted, this may be less performant in cases where you can rely on your database to return a sorted set of values, with undefined values representing values that don't exist.
from dataloader.
I use:
db(tableName)
.whereIn("id", ids)
.orderByRaw(`FIELD(id, ${ids})`)
.then(this.normalize)
But I'm wondering the same thing: how to maintain key order for when the new results are merged with existing.
from dataloader.
From my point of view, I think there has been sufficient discussion in this issue to close it but I'll leave that up to leebyron.
I think the reworked documentation is sufficient to at least push people in the right direction and the suggestions provided have given me some reasonable options.
For me, any of the below will work fine for sorting
- writing my own helper method
- finding an NPM package
- leveraging database sorting as suggested by ericclemmons
Thanks all for your helpful and detailed comments
from dataloader.
Here is a helper function made by me and @jorgecuesta which we think could be useful for people facing the same problem. It also has better performance compared to the above sort solution.
Here is the gist: https://gist.github.com/JeffreySoriano5/05e8c66f3b1fbec1d20403a774b0c15d
Hope someone finds it useful.
from dataloader.
Great Observation! I also ran into this problem. I am working with MySQL, GraphQL and Sequelize and realized that when using dataloader, the key to the value returned from the promise has to be in sync or else the results are incorrect.(Not to mention the length of the Array has to be maintained with the same number of returned values)
Also faced a lot of problems with JOINS as the order was pretty random. It was totally dependent on the type of JOIN that I had marked. Later I realized that in some I needed INNER JOIN and in some LEFT OUTER JOIN. Finally I realized that the problem is not with the JOINS but how Dataloader treats the batch load function.
@leebyron If we could have a predictable behavior with respect to some sorting in place it would be a boon. This is my opinion.
@Solution
Currently I have used a helper function to maintain the output from the Database promise so that the returned values are maintained in the same order as the index of the input array and also maintain the number of returned values.
Apologies if this sounds confusing.
Look forward to some direction in this regard.
Thanks
from dataloader.
Thanks for the report. I just added a section in the Readme that made the constraints of the batch function more clear, hopefully that aids future confusion.
I took a look at your gist and it's definitely non performant, but I don't believe that it's inherent to the initial order of the batched keys, but just an artifact of the implementation you used.
Specifically the "push-filter" implementation is O(N^2) with respect to the number of keys being batched, it requires copying, and filter
is typically not optimized by the VM. As a result, it's not very fast compared to sorting. Also, because of the N^2 factor, if your set of batched keys is typically small, then even push-filter
could be fine. With your gist using 100 keys per batch (which seems pretty high to me) it's of course significantly slower, but with 5-10 keys, it's a little slower but not extremely so.
However that is also not the only possible implementation of re-ordering results, and others can perform much better, even beating that of sorting keys and values. I created a variation of your gist with one example (though certainly not the only possible one) which builds an index map to sort the values. It's a bit faster than simply sorting both.
However both of these techniques may fall short if your backend doesn't return anything for non-existent data.
Of course, if your backend returns results matching the order they were requested, then you wouldn't need to do anything at all.
I'm certainly open to ideas on how to improve the library, but the goal should be to provide as thin a solution as is reasonable so as not to create new performance burdens where they're not necessary and to keep the set of things you need to learn low.
from dataloader.
That might be tricky since not all keys can be used as object property names, for instance MongoDB keys are often represented as objects. Also, this return style wouldn't work for non-caching DataLoaders which could have duplicate keys.
However, I think there's ample npm territory for anyone to seek out a transform function that does this! It would only work in some circumstances, but it could be a nice utility for those cases
from dataloader.
@joanrodriguez Yes but then you would have to do that in almost every DataLoader which gets annoying pretty fast :|
@andrewreedy 's packaged solution could do the trick for now
from dataloader.
Great to see all the various approaches here! Ecosystems ftw!
from dataloader.
Many good replies here but I will add my solution too:
const ensureOrder = options => { const { docs, keys, prop, error = key => `Document does not exist (${key})` } = options; // Put documents (docs) into a map where key is a document's ID or some // property (prop) of a document and value is a document. const docsMap = new Map(); docs.forEach(doc => docsMap.set(doc[prop], doc)); // Loop through the keys and for each one retrieve proper document. For not // existing documents generate an error. return keys.map(key => { return ( docsMap.get(key) || new Error(typeof error === "function" ? error(key) : error) ); }); };Usage
new DataLoader(async ids => { const users = await getUsersByIds(ids); return ensureOrder({ docs: users, keys: ids, prop: "id", error: id => `User does not exist (${id})` }); });Field will be
null
if there is no document matching key/ID.
You can also pass error function/string to generate custom error that will be sent back to the client in theerrors
field.
Not sure why, but I had to toString() the key and the prop or else it keeps throwing me error.
from dataloader.
@qeehong does the error come from the ensureOrder
function or DataLoader
or just database? It's very unlikely that this code will have this kind of problem.
from dataloader.
@qeehong does the error come from the
ensureOrder
function orDataLoader
or just database? It's very unlikely that this code will have this kind of problem.
It's coming from the ensureOrder
. Or maybe it just me because I need to use ObjectId
for the key. Had to make following changes:
const ensureOrder = (options) => {
const { docs, keys, prop, error = (key) => `Document does not exist (${key})` } = options;
const docsMap = new Map();
docs.forEach((doc) => {
const stringKey = doc[prop].toString();
docsMap.set(stringKey, doc);
});
return keys.map((key) => {
const stringKey = key.toString();
return (
docsMap.get(stringKey) ||
new Error(typeof error === 'function' ? error(stringKey) : error)
);
});
};
from dataloader.
Related Issues (20)
- Move CI to GitHub Actions
- Setup publish token for CI
- Change default branch `master` to `main` HOT 1
- Setup Renovate Bot to keep dependencies up to date HOT 3
- [QUESTION] Why aren't keys being de-duplicated even when the cache is disabled? HOT 6
- ✨ [REQUEST]: Add examples for cacheKeyFn and cacheMap HOT 2
- [REQUEST]: How to pass auth headers to dataloader HOT 4
- [QUESTION] README.md doesn't show on npmjs.com HOT 2
- CacheMap.get calls should await the promised value HOT 1
- [QUESTION] Release changes? HOT 9
- [QUESTION] name is now required in options? HOT 4
- [BUG] Readme is missing from NPM HOT 2
- [BUG] `this` context not available in typescript definitions
- [BUG] Use caution with `jest.useFakeTimers()` HOT 1
- [REQUEST] support vercel edge functions by default HOT 4
- Ever considered the use of `queueMicrotask`? HOT 1
- Coalescing multiple load() calls HOT 3
- [REQUEST] Unify the way `load` and `loadMany` handle errors
- -
- [REQUEST] Add batch grouping
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from dataloader.