mster / fireferret
Node.js Read-through cache for MongoDB
Home Page: https://mster.github.io/fireferret/
License: MIT License
Large batch hash set crashes Redis client.
RedisError
at RedisClient.multihset (/home/m/code/node/fireferret/lib/redis.js:151:13)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
PROPOSAL:
We schedule caching events (during a CACHE_MISS) to be done after the FireFerret client returns the original copy from MongoDB.
Nothing is free: the next tick takes on the cache work in addition to normal request work. We could track pending cache requests and prevent redundant cache misses on the same query by freezing consecutive requests to the same resource while caching. We could even hold onto the data locally and return early with the local copy.
Is a few milliseconds per cache miss worth the memory overhead? 🤔
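The proposal above could be sketched roughly like this. This is a minimal illustration, not FireFerret's actual implementation; `fetchWithDeferredCache`, `fetchFromMongo`, and `writeToCache` are hypothetical names.

```javascript
// Pending cache writes, keyed by query. Consecutive requests for the
// same resource wait on the in-flight write instead of triggering a
// redundant cache miss.
const pending = new Map()

async function fetchWithDeferredCache (queryKey, fetchFromMongo, writeToCache) {
  if (pending.has(queryKey)) {
    // A cache write for this query is in flight; wait for it to settle.
    await pending.get(queryKey)
  }

  const docs = await fetchFromMongo(queryKey)

  // Schedule the cache write for a later tick; the caller gets the
  // Mongo copy back immediately.
  const done = new Promise((resolve) => {
    setImmediate(async () => {
      await writeToCache(queryKey, docs)
      pending.delete(queryKey)
      resolve()
    })
  })
  pending.set(queryKey, done)

  return docs
}
```

The memory-overhead question in the issue is the `pending` map (and, if we also kept the local copy, the documents themselves) living between ticks.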
Example Error
{"level":50,"time":1596248117731,"pid":16023,"hostname":"ubuntu","reqId":11,"req":{"method":"GET","url":"/api/plants?q={%22plantName%22:%22Derek%22}&pagination={%22page%22:1000,%22size%22:50}","hostname":"127.0.0.1:3371","remoteAddress":"127.0.0.1","remotePort":40798},"res":{"statusCode":500},"err":{"type":"FFError","message":"Keys are required for multihgetall","stack":"InvalidArguments: Keys are required for multihgetall\n at RedisClient.multihgetall (/home/m/code/planterdocs/fetch-api/node_modules/fireferret/lib/redis.js:111:13)\n at cacheHit (/home/m/code/planterdocs/fetch-api/node_modules/fireferret/lib/client.js:86:39)\n at FireFerret.fetch (/home/m/code/planterdocs/fetch-api/node_modules/fireferret/lib/client.js:125:12)\n at processTicksAndRejections (internal/process/task_queues.js:97:5)\n at async Object.module.exports.getPlants (/home/m/code/planterdocs/fetch-api/src/handlers/plants.js:32:20)","name":"InvalidArguments"},"msg":"Keys are required for multihgetall"}
Whenever a query contains:
\w
the subsequent SCAN operation performed during a wideMatch attempt will not include that query.
We need every valid query to show up during a scan. The current guess is that the stringified escape characters interfere with Redis' SCAN pattern matching.
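If the guess is right, one possible fix is to escape glob metacharacters before the stringified query is used as a SCAN MATCH pattern, so a literal `\w` isn't consumed by Redis' glob matching. A sketch (`escapeScanPattern` is a hypothetical helper, not part of FireFerret):

```javascript
// \ * ? [ ] are special in Redis glob-style MATCH patterns; escape
// them so the stringified query matches itself literally.
function escapeScanPattern (str) {
  return str.replace(/[\\*?[\]]/g, (ch) => '\\' + ch)
}
```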
Currently we pull all query-matched docs, regardless of pagination props.
We want to fetch only the page range's documents:
Implement pagination for the Mongo client, using either a bucket pattern or an _id sort.
AVOID USING A SKIP-LIMIT IMPLEMENTATION (so heckin' slow w/ large data sets)
Support batch sizes for subdividing Redis tasks.
Related to #1
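A rough sketch of the two wants above, with hypothetical helper names: `_id`-sorted range pagination (no skip/limit), plus batch subdivision for Redis tasks.

```javascript
// Build a filter for the next page. lastId is the _id of the final
// document on the previous page (null/undefined for the first page).
function buildPageFilter (query, lastId) {
  return lastId ? { ...query, _id: { $gt: lastId } } : { ...query }
}

// With the official MongoDB driver, a page fetch would then look like:
async function fetchPage (collection, query, lastId, size) {
  return collection
    .find(buildPageFilter(query, lastId))
    .sort({ _id: 1 })
    .limit(size)
    .toArray()
}

// Subdivide a large set of Redis operations into fixed-size batches,
// e.g. one MULTI per batch instead of a single giant one.
function chunk (items, batchSize) {
  const out = []
  for (let i = 0; i < items.length; i += batchSize) {
    out.push(items.slice(i, i + batchSize))
  }
  return out
}
```

Range pagination stays fast on large collections because each page is an index seek on `_id`, where skip/limit must walk and discard every skipped document.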
Example Error:
{"level":50,"time":1596248052800,"pid":16023,"hostname":"ubuntu","reqId":7,"req":{"method":"GET","url":"/api/plants?q={plantName:%22Derek%22}&pagination={%22page%22:1,%22size%22:50}","hostname":"127.0.0.1:3371","remoteAddress":"127.0.0.1","remotePort":40790},"res":{"statusCode":500},"err":{"type":"SyntaxError","message":"Unexpected token p in JSON at position 1","stack":"SyntaxError: Unexpected token p in JSON at position 1\n at JSON.parse (<anonymous>)\n at Object.module.exports.getPlants (/home/m/code/planterdocs/fetch-api/src/handlers/plants.js:17:18)\n at preHandlerCallback (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:120:28)\n at preValidationCallback (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:103:5)\n at handler (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:69:5)\n at handleRequest (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:18:5)\n at runPreParsing (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/route.js:358:5)\n at Object.routeHandler [as handler] (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/route.js:334:7)\n at Router.lookup (/home/m/code/planterdocs/fetch-api/node_modules/find-my-way/index.js:351:14)\n at Server.emit (events.js:311:20)"},"msg":"Unexpected token p in JSON at position 1"}
We currently have 6 different option subsets. Condense and clarify before v1.0.0.
ISSUE:
{ pagination: { page: 1, size: 1 } }
Support streaming documents (from Mongo or Redis) using Newline Delimited JSON.
Example Error:
Cannot destructure property 'page' of 'options.pagination' as it is undefined.
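A possible guard for the destructure error: default the pagination object before pulling `page`/`size` off of it. This is a sketch; the helper name and the default values shown are placeholders, not FireFerret's actual defaults.

```javascript
// Tolerate options without a pagination object, so destructuring
// never throws on undefined.
function normalizePagination (options = {}) {
  const { page = 1, size = 50 } = options.pagination || {}
  return { page, size }
}
```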
Users could benefit from allowing for automatic retries with timeout.
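One way such an opt-in might look (names and defaults are illustrative, not an API proposal): retry a failing operation up to `retries` times, giving up once an overall deadline passes.

```javascript
// Retry fn until it succeeds, the retry budget is spent, or the
// overall timeout elapses; rethrow the last error on failure.
async function withRetry (fn, { retries = 3, timeoutMs = 1000 } = {}) {
  const deadline = Date.now() + timeoutMs
  let lastErr
  for (let i = 0; i <= retries; i++) {
    if (Date.now() > deadline) break
    try {
      return await fn()
    } catch (err) {
      lastErr = err
    }
  }
  throw lastErr
}
```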
PROBLEM:
We know a query has N results from the queryList. However, in some cases Redis document hash keys may expire, dropping documents from the cache.
To the user, this means they get fewer than N documents back, even though they expected N.
PROPOSAL:
Allow users to enable:
(1) is fast*, while (2) is expensive yet dependable.
(*) the work still exists, but occurs on the next tick (this has been shown to slow down consecutive requests that result in cache misses).
ISSUE:
FireFerret is planned to support Node v10+.
However, Array.prototype.flatMap is not available in Node 10!
SOLUTION:
We'll need to whip up a quick custom implementation that is Dubnium-friendly.
https://tc39.es/ecma262/#sec-array.prototype.flatmap
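A depth-1 stand-in could be as small as the following. This is a sketch of one possible implementation, following the spec linked above, not necessarily the one FireFerret will ship:

```javascript
// Dubnium-friendly flatMap (depth 1): map each element, then splice
// array results inline instead of nesting them.
function flatMap (arr, fn) {
  const out = []
  for (let i = 0; i < arr.length; i++) {
    const mapped = fn(arr[i], i, arr)
    if (Array.isArray(mapped)) {
      for (const el of mapped) out.push(el)
    } else {
      out.push(mapped)
    }
  }
  return out
}
```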
Currently it is destructive and doesn't use multi.
aka a piece of 💩
Use an improved cached-length for-loop implementation.
https://github.com/kb-dev/sanic.js/blob/master/lib/array/reverse.js
To improve Redis (primarily memory) performance, it is suggested to use buckets to store documents. This will drastically reduce the number of keys FireFerret requires to cache said documents.
PROPOSAL:
Buckets will be Redis hashes. Buckets will contain at most N documents, each mapped by ObjectID -> Raw Document. Buckets will have the hash name bucketPrefix:firstID;lastID, with bucketPrefix = ff:DB_NAME::COLLECTION_NAME:bucket. First/last ID values will be the stringified document ObjectIDs of the respective documents.
A SCAN operation on bucketPrefix will yield all bucket hash names for the provided db and collection. From there, a cache hit would require filtering bucket hash names, fetching (and slicing) the documents from their buckets, then formatting the data according to user spec.
Failure to find ANY (>= 1) matching bucket will result in a cache miss. Documents will be looked up from Mongo as per usual, and then filtered into buckets.
DESIGN:
to-do
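The naming scheme from the proposal could be sketched as below; the prefix pieces follow the proposal's format, while the helper names are hypothetical.

```javascript
// ff:DB_NAME::COLLECTION_NAME:bucket, per the proposal above.
function bucketPrefix (dbName, collectionName) {
  return `ff:${dbName}::${collectionName}:bucket`
}

// Full bucket hash name: bucketPrefix:firstID;lastID, using the
// stringified ObjectIDs of the first and last docs in the bucket.
function bucketKey (dbName, collectionName, docs) {
  const firstId = String(docs[0]._id)
  const lastId = String(docs[docs.length - 1]._id)
  return `${bucketPrefix(dbName, collectionName)}:${firstId};${lastId}`
}
```

A SCAN with the pattern `bucketPrefix(db, coll) + ':*'` would then enumerate every bucket for that db/collection pair.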
Update algorithm to order query match choices in a more preferable manner, i.e. the strictest ranges first.
HERE: https://github.com/mster/fireferret/blob/master/lib/utils/page.js#L54
When making consecutive fetch or fetchById calls, any requests that require caching slow down the following call.
Properly queue events to the event loop.
The utility functions toFlatmap and fromFlatmap handle the storage and retrieval of arrays improperly.
Example of current return:
// before
regions: ['Canada', 'UK']
// after
regions: { '0': 'Canada', '1': 'UK' }
This needs to be resolved before v1.0.0
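One possible fix, assuming the flatmap helpers flatten nested values into hash fields: JSON-encode array values on the way into Redis and revive them on the way out, so `['Canada', 'UK']` round-trips intact instead of becoming `{ '0': 'Canada', '1': 'UK' }`. The helper names are illustrative.

```javascript
// Serialize arrays to a JSON string before they are flattened into
// hash fields.
function encodeValue (value) {
  return Array.isArray(value) ? JSON.stringify(value) : value
}

// Revive JSON-encoded arrays on retrieval; leave everything else as-is.
function decodeValue (value) {
  if (typeof value === 'string' && value.startsWith('[')) {
    try { return JSON.parse(value) } catch (_) { return value }
  }
  return value
}
```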
Use an internal cache for previously cached queries instead of storing them in Redis.
We can drastically speed up batch calls with sliced!
Use a single key for pagination ranges, globbing ranges together into a single list in Redis.
Delete old key upon update.
Would this add time to the normal operation?
let i = 0
const iMax = array.length
for (; i < iMax; i++) {}
From: https://medium.com/kbdev/voyage-to-the-most-efficient-loop-in-nodejs-and-a-bit-js-5961d4524c2e