mster / fireferret Goto Github PK

View Code? Open in Web Editor NEW

3.0 1.0 0.0 1.22 MB

Node.js Read-through cache for MongoDB 🔥🙈

Home Page: https://mster.github.io/fireferret/

License: MIT License

JavaScript 100.00%

mongodb redis nodejs cache client async redis-json pagination streaming ndjson

fireferret's People

Contributors

Stargazers

Watchers

fireferret's Issues

Bug: Redis timeout on large multi op

Large batch hash set crashes Redis client.

RedisError
    at RedisClient.multihset (/home/m/code/node/fireferret/lib/redis.js:151:13)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)

Improvement: Scheduled work manager

PROPOSAL:
We schedule caching events (during a CACHE_MISS) to be done after the FireFerret client returns the original copy from MongoDB.

Nothing is free, so the next tick takes on the cache event work on top of normal request work. It's possible to track pending cache requests and prevent redundant cache misses on the same query by freezing any consecutive requests to the same resource while caching. We could even hold onto the data locally, and return early with our local copy.

Is a few milliseconds per cache miss worth the memory overhead? 🤔

Bug: Pagination out of bounds

Example Error

{"level":50,"time":1596248117731,"pid":16023,"hostname":"ubuntu","reqId":11,"req":{"method":"GET","url":"/api/plants?q={%22plantName%22:%22Derek%22}&pagination={%22page%22:1000,%22size%22:50}","hostname":"127.0.0.1:3371","remoteAddress":"127.0.0.1","remotePort":40798},"res":{"statusCode":500},"err":{"type":"FFError","message":"Keys are required for multihgetall","stack":"InvalidArguments: Keys are required for multihgetall\n    at RedisClient.multihgetall (/home/m/code/planterdocs/fetch-api/node_modules/fireferret/lib/redis.js:111:13)\n    at cacheHit (/home/m/code/planterdocs/fetch-api/node_modules/fireferret/lib/client.js:86:39)\n    at FireFerret.fetch (/home/m/code/planterdocs/fetch-api/node_modules/fireferret/lib/client.js:125:12)\n    at processTicksAndRejections (internal/process/task_queues.js:97:5)\n    at async Object.module.exports.getPlants (/home/m/code/planterdocs/fetch-api/src/handlers/plants.js:32:20)","name":"InvalidArguments"},"msg":"Keys are required for multihgetall"}

Bug: Regex queries invalid for wideMatch

Whenever a query contains:

\w

The following scan operation done during a wideMatch attempt will not include said query.

We need every valid query to show up during a scan. The current guess is that the stringified escape characters mess with Redis' scan.

Feat: MongoDB query pagination

Currently pulling all query matched docs, regardless of pagination props.

We want to only fetch the page range's documents:
Implement pagination for the Mongo client, using either a bucket pattern or id sort.

AVOID USING SKIP-LIMIT IMPLEMENTATION (so heckin' slow w/ large data sets)

Feat: Batch operations

Support batch size for subdivision of Redis tasks.

Related to #1

Bug: Pagination deconstruction failure

Example Error:

{"level":50,"time":1596248052800,"pid":16023,"hostname":"ubuntu","reqId":7,"req":{"method":"GET","url":"/api/plants?q={plantName:%22Derek%22}&pagination={%22page%22:1,%22size%22:50}","hostname":"127.0.0.1:3371","remoteAddress":"127.0.0.1","remotePort":40790},"res":{"statusCode":500},"err":{"type":"SyntaxError","message":"Unexpected token p in JSON at position 1","stack":"SyntaxError: Unexpected token p in JSON at position 1\n    at JSON.parse (<anonymous>)\n    at Object.module.exports.getPlants (/home/m/code/planterdocs/fetch-api/src/handlers/plants.js:17:18)\n    at preHandlerCallback (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:120:28)\n    at preValidationCallback (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:103:5)\n    at handler (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:69:5)\n    at handleRequest (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/handleRequest.js:18:5)\n    at runPreParsing (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/route.js:358:5)\n    at Object.routeHandler [as handler] (/home/m/code/planterdocs/fetch-api/node_modules/fastify/lib/route.js:334:7)\n    at Router.lookup (/home/m/code/planterdocs/fetch-api/node_modules/find-my-way/index.js:351:14)\n    at Server.emit (events.js:311:20)"},"msg":"Unexpected token p in JSON at position 1"}

Refactor: Options

We currently have 6 different option-subsets. Condense and clarify before v1.0.0

Bug: Bucket pagination edge case

ISSUE:

{ pagination: { page: 1, size: 1 } }

Feat: Document Streaming

Support streaming documents (from Mongo or Redis) using Newline Delimited JSON.

Bug: empty query with empty options failure

Example Error:

Cannot destructure property 'page' of 'options.pagination' as it is undefined.

Feat: Retries

Users could benefit from allowing for automatic retries with timeout.

Feat: Back-fill null docs

PROBLEM:
We know a query has N results from the queryList. However, in some cases Redis document hash keys may expire, dropping the document from the cache.

To the user, this means they get < N documents back, even though they expected N.

PROPOSAL:
Allow users to enable:

Back-filling of the cache after a returning only what the cache contained at the time (Could be zero documents).
Force back-filling of data before returning to the user, and then filling the cache afterwards.

(1) is fast*, while (2) is expensive yet dependable.

(*) the work still exists, but will occur on the next tick (has shown to slow down consecutive requests that result in cache misses).

Compatibility: flatMap Node 10

ISSUE:
FireFerret is planned to support Node v10+

However, flatMap is not in Node10!!!

SOLUTION:
We're going to need to whip up a quick custom implementation that is Dubnium friendly.
https://tc39.es/ecma262/#sec-array.prototype.flatmap

Refactor: batchlpush

Currently is ~~destructive~~ and doesn't use multi.

aka a piece of 💩

Improvement: fast Array.reverse()

Use an improved cached for loop implementation.

https://github.com/kb-dev/sanic.js/blob/master/lib/array/reverse.js

Improvement: Document buckets

To improve Redis (primarily memory) performance, it is suggested to use buckets to store documents. This will drastically reduce the amount of keys FireFerret requires to cache said documents.

PROPOSAL:
Buckets will be Redis hashes. Buckets will contain at most N documents, each mapped by ObjectID -> Raw Document. Buckets will have hash name bucketPrefix:firstID;lastID with bucketPrefix = ff:DB_NAME::COLLECTION_NAME:bucket. First / last ID values will be stringified document ObjectIDs from the respective documents.

A SCAN operation on bucketPrefix will yield all bucket hash names for the provided db and collection. From there, a cache-hit would require filtering bucket hash names, fetching (and slicing) the documents from their buckets, then formatting the data according to user spec.

Failure to find ANY (> 1) matching bucket will result in a cache-miss. Documents will be looked up from mongo as per usual, and then filtered into buckets

DESIGN:
to-do

// before
regions: ['Canada', 'UK']

// after
regions: { '0': 'Canada', '1': 'UK' }

This needs to be resolved before v1.0.0

Would this add time to the normal operation?

Improvement: Use cached length for loops

let i = 0; const iMax = array.length; for(; i < iMax; i++) {}

From: https://medium.com/kbdev/voyage-to-the-most-efficient-loop-in-nodejs-and-a-bit-js-5961d4524c2e

mster / fireferret Goto Github PK

fireferret's People

Contributors

Stargazers

Watchers

fireferret's Issues

Recommend Projects

Recommend Topics

Recommend Org