fast-redact's Introduction

fast-redact

very fast object redaction

Default Usage

By default, fast-redact serializes an object with JSON.stringify, censoring any data at paths specified:

const fastRedact = require('fast-redact')
const fauxRequest = {
  headers: {
    host: 'http://example.com',
    cookie: `oh oh we don't want this exposed in logs in etc.`,
    referer: `if we're cool maybe we'll even redact this`,
    // Note: headers often contain hyphens and require bracket notation
    'X-Forwarded-For': `192.168.0.1`
  }
}
const redact = fastRedact({
  paths: ['headers.cookie', 'headers.referer', 'headers["X-Forwarded-For"]']
})

console.log(redact(fauxRequest))
// {"headers":{"host":"http://example.com","cookie":"[REDACTED]","referer":"[REDACTED]","X-Forwarded-For":"[REDACTED]"}}

API

require('fast-redact')({paths, censor, serialize}) => Function

When called without any options, or with a zero length paths array, fast-redact will return JSON.stringify or the serialize option, if set.

paths – Array

An array of strings describing the nested location of a key in an object.

The syntax follows that of the ECMAScript specification; that is, any JavaScript path is accepted, and both bracket and dot notation are supported. For instance, in each of the following cases the c property will be redacted: a.b.c, a['b'].c, a["b"].c, a[`b`].c. Since bracket notation is supported, array indices are also supported: a[0].b would redact the b key in the first object of the a array.

Leading brackets are also allowed, for instance ["a"].b.c will work.

Wildcards

In addition to static paths, asterisk wildcards are also supported.

When an asterisk is placed in the final position it will redact all keys within the parent object. For instance a.b.* will redact all keys in the b object. Similarly for arrays, a.b[*] will redact all elements of an array (in truth it doesn't matter whether b is an object or an array; in either case, both notation styles will work).

When an asterisk is in an intermediate or first position, the paths following the asterisk will be redacted for every object within the parent.

For example:

const fastRedact = require('fast-redact')
const redact = fastRedact({paths: ['*.c.d']})
const obj = {
  x: {c: {d: 'hide me', e: 'leave me be'}},
  y: {c: {d: 'and me', f: 'I want to live'}},
  z: {c: {d: 'and also I', g: 'I want to run in a stream'}}
}
console.log(redact(obj)) 
// {"x":{"c":{"d":"[REDACTED]","e":"leave me be"}},"y":{"c":{"d":"[REDACTED]","f":"I want to live"}},"z":{"c":{"d":"[REDACTED]","g":"I want to run in a stream"}}}

Another example with a nested array:

const fastRedact = require('fast-redact')
const redact = fastRedact({paths: ['a[*].c.d']})
const obj = {
  a: [
    {c: {d: 'hide me', e: 'leave me be'}},
    {c: {d: 'and me', f: 'I want to live'}},
    {c: {d: 'and also I', g: 'I want to run in a stream'}}
  ]
}
console.log(redact(obj)) 
// {"a":[{"c":{"d":"[REDACTED]","e":"leave me be"}},{"c":{"d":"[REDACTED]","f":"I want to live"}},{"c":{"d":"[REDACTED]","g":"I want to run in a stream"}}]}

remove - Boolean - [false]

The remove option, when set to true, will cause keys to be removed from the serialized output.

Since the implementation exploits the fact that undefined keys are ignored by JSON.stringify the remove option may only be used when JSON.stringify is the serializer (this is the default) – otherwise fast-redact will throw.

If supplying a custom serializer that has the same behavior (removing keys with undefined values), this restriction can be bypassed by explicitly setting the censor to undefined.
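
The behavior that remove relies on can be observed with JSON.stringify alone. The following is a minimal sketch that does not involve fast-redact at all; the record object is made up for illustration:

```javascript
// Plain JSON.stringify behavior, independent of fast-redact.
const record = { user: 'alice', password: undefined, token: 'abc' }

// Object keys whose value is undefined are skipped entirely.
const withUndefined = JSON.stringify(record)

// Inside arrays, undefined is serialized as null rather than dropped.
const inArray = JSON.stringify({ list: [1, undefined, 3] })

console.log(withUndefined) // {"user":"alice","token":"abc"}
console.log(inArray) // {"list":[1,null,3]}
```

A custom serializer that does not skip undefined values the same way would presumably leave the key visible, which is why the restriction described above exists.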

censor – <Any type>('[REDACTED]')

This is the value which overwrites redacted properties.

Setting censor to undefined will cause properties to be removed as long as this is the behavior of the serializer – which defaults to JSON.stringify, which does remove undefined properties.

Setting censor to a function will cause fast-redact to invoke it with the original value. The output of the censor function sets the redacted value. Please note that asynchronous functions are not supported.

serialize – Function | Boolean(JSON.stringify)

The serialize option may either be a function or a boolean. If a function is supplied, this will be used to serialize the redacted object. It's important to understand that for performance reasons fast-redact mutates the original object, then serializes, then restores the original values. So the object passed to the serializer is the exact same object passed to the redacting function.

The serialize option as a function example:

const fastRedact = require('fast-redact')
const redact = fastRedact({
  paths: ['a'], 
  serialize: (o) => JSON.stringify(o, null, 2)
})
console.log(redact({a: 1, b: 2}))
// {
//   "a": "[REDACTED]",
//   "b": 2
// }

For advanced usage the serialize option can be set to false. When serialize is set to false, instead of the serialized object, the output of the redactor function will be the mutated object itself (this is the exact same as the object passed in). In addition a restore method is supplied on the redactor function allowing the redacted keys to be restored with the original data.

const fastRedact = require('fast-redact')
const redact = fastRedact({
  paths: ['a'], 
  serialize: false
})
const o = {a: 1, b: 2}
console.log(redact(o) === o) // true
console.log(o) // { a: '[REDACTED]', b: 2 }
console.log(redact.restore(o) === o) // true
console.log(o) // { a: 1, b: 2 }

strict – Boolean - [true]

The strict option, when set to true, will cause the redactor function to throw if instead of an object it finds a primitive. When strict is set to false, the redactor function will treat the primitive value as having already been redacted, and return it serialized (with JSON.stringify or the user's custom serialize function), or as-is if the serialize option was set to false.

Approach

In order to achieve lowest cost/highest performance redaction, fast-redact creates and compiles a function (using the Function constructor) on initialization. It's important to distinguish this from the dangers of a runtime eval: no user input is involved in creating the string that compiles into the function. This is as safe as writing code normally and having it compiled by V8 in the usual way.

Thanks to changes in V8 in recent years, state can be injected into compiled functions using bind at very low cost (whereas bind used to be expensive, and getting state into a compiled function by any means was difficult without a performance penalty).

For static paths, this function simply checks that the path exists and then overwrites with the censor. Wildcard paths are processed with normal functions that iterate over the object redacting values as necessary.
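
To make the idea concrete, here is a rough sketch of compiling a redactor for static paths with the Function constructor and injecting state with bind. It is not fast-redact's actual generated code: compileRedactor is a hypothetical helper, it only understands dot-separated paths, and it does not record original values for restoring.

```javascript
// Hypothetical sketch: build source text for a fixed list of dot paths,
// compile it once, and inject the censor through `this` using bind.
function compileRedactor (paths, censor) {
  const body = paths.map((path) => {
    const keys = path.split('.')
    // Accessor for the first n keys, e.g. o["headers"]["cookie"].
    // JSON.stringify escapes each key so it is always a valid string literal.
    const access = (n) => 'o' + keys.slice(0, n).map((k) => `[${JSON.stringify(k)}]`).join('')
    // Guard every level so a missing parent never throws.
    const guards = keys.map((_, i) => `${access(i + 1)} != null`).join(' && ')
    return `if (${guards}) ${access(keys.length)} = this.censor`
  }).join('\n')
  return new Function('o', `${body}\nreturn o`).bind({ censor })
}

const redactHeaders = compileRedactor(['headers.cookie', 'body.secret'], '[REDACTED]')
const result = redactHeaders({ headers: { cookie: 'session=abc', host: 'example.com' } })
console.log(JSON.stringify(result))
// {"headers":{"cookie":"[REDACTED]","host":"example.com"}}
```

The path body.secret is skipped here because the guard on o["body"] fails, so nothing is written.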

It's important to note that the original object is mutated: for performance reasons a copy is not made. See rfdc (Really Fast Deep Clone) for the fastest known way to clone; even that is not nearly close enough in speed to editing the original object, serializing and then restoring values.

A restore function is also created and compiled to put the original state back on to the object after redaction. This means that in the default usage case, the operation is essentially atomic - the object is mutated, serialized and restored internally which avoids any state management issues.
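
The mutate, serialize, restore sequence can be sketched in a few lines. This is a simplified illustration for a single top-level key, assuming JSON.stringify as the serializer; redactSerializeRestore is a hypothetical name and not part of the library:

```javascript
// Hypothetical helper, not fast-redact's implementation:
// censor one top-level key in place, serialize, then put the value back.
function redactSerializeRestore (obj, key, censor = '[REDACTED]') {
  if (!Object.prototype.hasOwnProperty.call(obj, key)) return JSON.stringify(obj)
  const original = obj[key]
  obj[key] = censor
  try {
    return JSON.stringify(obj)
  } finally {
    // Restore even if the serializer throws.
    obj[key] = original
  }
}

const request = { cookie: 'session=abc', host: 'example.com' }
const line = redactSerializeRestore(request, 'cookie')
console.log(line) // {"cookie":"[REDACTED]","host":"example.com"}
console.log(request.cookie) // session=abc
```

Because the caller only ever sees the object before and after the call, the temporary mutation is not observable in synchronous code.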

Caveat

As mentioned in Approach, the paths array input is dynamically compiled into a function at initialization time. While the paths array is vigorously tested for any developer errors, it's strongly recommended against allowing user input to directly supply any paths to redact. It can't be guaranteed that allowing user input for paths couldn't feasibly expose an attack vector.

Benchmarks

The fastest known predecessor to fast-redact is the non-generic pino-noir library (which was also written by myself).

In the direct calling case, fast-redact is ~30x faster than pino-noir, however a more realistic comparison is overhead on JSON.stringify.

For a static redaction case (no wildcards) pino-noir adds ~25% overhead on top of JSON.stringify whereas fast-redact adds ~1% overhead.

In the basic last-position wildcard case, fast-redact is ~12% faster than pino-noir.

The pino-noir module does not support intermediate wildcards, but fast-redact does. The cost of an intermediate wildcard that results in two keys over two nested objects being redacted is about 25% overhead on JSON.stringify. The cost of an intermediate wildcard that results in four keys across two objects being redacted is about 55% overhead on JSON.stringify and ~50% more expensive than explicitly declaring the keys.

npm run bench 
benchNoirV2*500: 59.108ms
benchFastRedact*500: 2.483ms
benchFastRedactRestore*500: 10.904ms
benchNoirV2Wild*500: 91.399ms
benchFastRedactWild*500: 21.200ms
benchFastRedactWildRestore*500: 27.304ms
benchFastRedactIntermediateWild*500: 92.304ms
benchFastRedactIntermediateWildRestore*500: 107.047ms
benchJSONStringify*500: 210.573ms
benchNoirV2Serialize*500: 281.148ms
benchFastRedactSerialize*500: 215.845ms
benchNoirV2WildSerialize*500: 281.168ms
benchFastRedactWildSerialize*500: 247.140ms
benchFastRedactIntermediateWildSerialize*500: 333.722ms
benchFastRedactIntermediateWildMatchWildOutcomeSerialize*500: 463.667ms
benchFastRedactStaticMatchWildOutcomeSerialize*500: 239.293ms
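
Overhead relative to JSON.stringify can be derived from a run like the one above by dividing a serialize timing by the baseline. The sketch below uses three timings copied from that sample run. Timings vary by machine and Node.js version, and the percentages quoted in the prose were presumably taken from a different run, so the figures need not match.

```javascript
// Timings in milliseconds, copied from the sample run above.
const timings = {
  jsonStringify: 210.573,
  noirSerialize: 281.148,
  fastRedactSerialize: 215.845
}

// Overhead of `measured` over `baseline`, as a percentage.
function overheadPercent (baseline, measured) {
  return (measured / baseline - 1) * 100
}

const fastRedactOverhead = overheadPercent(timings.jsonStringify, timings.fastRedactSerialize)
const noirOverhead = overheadPercent(timings.jsonStringify, timings.noirSerialize)

console.log(fastRedactOverhead.toFixed(1)) // 2.5
console.log(noirOverhead.toFixed(1)) // 33.5
```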

Tests

npm test  
  224 passing (499.544ms)

Coverage

npm run cov 
-----------------|----------|----------|----------|----------|-------------------|
File             |  % Stmts | % Branch |  % Funcs |  % Lines | Uncovered Line #s |
-----------------|----------|----------|----------|----------|-------------------|
All files        |      100 |      100 |      100 |      100 |                   |
 fast-redact     |      100 |      100 |      100 |      100 |                   |
  index.js       |      100 |      100 |      100 |      100 |                   |
 fast-redact/lib |      100 |      100 |      100 |      100 |                   |
  modifiers.js   |      100 |      100 |      100 |      100 |                   |
  parse.js       |      100 |      100 |      100 |      100 |                   |
  redactor.js    |      100 |      100 |      100 |      100 |                   |
  restorer.js    |      100 |      100 |      100 |      100 |                   |
  rx.js          |      100 |      100 |      100 |      100 |                   |
  state.js       |      100 |      100 |      100 |      100 |                   |
  validator.js   |      100 |      100 |      100 |      100 |                   |
-----------------|----------|----------|----------|----------|-------------------|

License

MIT

Acknowledgements

Sponsored by nearForm

fast-redact's People

Contributors

72636c, davidmarkclements, dncrews, ethanresnick, fdawgs, feugy, gurjotkaur20, jsumners, karanssj4, kostya-luxuryescape, lrecknagel, lukehedger, matt-clarson, mcollina, meirkl, mikehw, msala, n4zukker, roggervalf, th3hunt

fast-redact's Issues

Cannot set property '<field_name>' of undefined

I'm passing a path in the paths array of fast-redact instance options, that is not present in the object which is being passed for the redaction.

const fastRedact = require('fast-redact');

const fauxRequest = {
  headers: {
    host: 'http://example.com',
    cookie: `oh oh we don't want this exposed in logs in etc.`,
    referer: `if we're cool maybe we'll even redact this`,
  },
};

const redact = fastRedact({
  paths: ['referer', 'headers.cookie', 'body.secret'],
  censor: '***REDACTED***',
});

const redactResult = redact(fauxRequest);

console.log(redactResult);

When I run the above snippet, I'm getting the below error:

undefined:10
      o.body.secret = secret["body.secret"].val
                    ^

TypeError: Cannot set property 'secret' of undefined
    at Object.eval (eval at compileRestore (/Users/harishkumarmatta/Desktop/Incubator/code-logs/node_modules/fast-redact/lib/restorer.js:16:20), <anonymous>:10:21)
    at Object.eval (eval at redactor (/Users/harishkumarmatta/Desktop/Incubator/code-logs/node_modules/fast-redact/lib/redactor.js:9:18), <anonymous>:66:10)
    at Object.<anonymous> (/Users/harishkumarmatta/Desktop/Incubator/code-logs/redact.js:16:22)
    at Module._compile (internal/modules/cjs/loader.js:1158:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1178:10)
    at Module.load (internal/modules/cjs/loader.js:1002:32)
    at Function.Module._load (internal/modules/cjs/loader.js:901:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:74:12)
    at internal/main/run_main_module.js:18:47

I'm trying to have a single redaction instance and pass all the redactable fields as paths, so that I can pass any object to the fast-redact instance.

Isn't this how the module is supposed to work? I checked with Pino logging and it works as expected, but when I use this module directly I get the error above.

Thank you.

Error is thrown for wildcard paths against object with nested null values

Hi,

Thanks a lot for the great lib.

I just wanted to raise this bug. An error is thrown in this edge case:

    const redact = fastRedact({
      paths: ['*.*.x'],
      serialize: false,
      censor: "[REDACTED]",
    });

    expect(redact({a: {b: null}})).toEqual({a: {b: null}});

Cannot use 'in' operator to search for 'x' in null
TypeError: Cannot use 'in' operator to search for 'x' in null
at specialSet (/workspace/lib-logger/node_modules/fast-redact/lib/modifiers.js:120:53)
at nestedRedact (/workspace/lib-logger/node_modules/fast-redact/lib/modifiers.js:73:7)

Code needs comments

As with #7, the whole codebase needs to be commented. It is very difficult to understand everything that is going on in this project. Without some explanations in the code, it is basically impossible for others to come along and contribute, e.g. #5.


With the non-descriptive variable names, this function is a mystery. A leading comment describing the parameters and what the function accomplishes would help:

function specialSet (o, k, p, v) {
  var i = -1
  var l = p.length
  var li = l - 1
  var n
  var nv
  var ov
  var oov
  var exists = true
  ov = n = o[k]
  if (typeof n !== 'object') return {value: null, parent: null, exists}
  while (n != null && ++i < l) {
    k = p[i]
    nv = v
    oov = ov
    if (!(k in n)) {
      exists = false
      break
    }
    ov = n[k]
    nv = (i !== li) ? ov : nv
    n[k] = (has(n, k) && nv === ov) || (nv === undefined && v !== undefined) ? n[k] : nv
    n = n[k]
    if (typeof n !== 'object') break
  }
  return {value: ov, parent: oov, exists}
}

It is difficult to understand at first that this is building an object of compiled paths (if I even understand it?). A description of what is going on along with sample inputs and outputs would be very helpful:

fast-redact/lib/parse.js

Lines 10 to 42 in 3bc8dbe

const secret = paths.reduce(function (o, strPath, ix) {
  var path = strPath.match(rx).map((p) => p.replace(/'|"|`/g, ''))
  const leadingBracket = ix === 0 && strPath[0] === '['
  path = path.map((p) => {
    if (p[0] === '[') return p.substr(1, p.length - 2)
    else return p
  })
  const star = path.indexOf('*')
  if (star > -1) {
    const before = path.slice(0, star)
    const beforeStr = before.join('.')
    const after = path.slice(star + 1, path.length)
    if (after.indexOf('*') > -1) throw Error('fast-redact – Only one wildcard per path is supported')
    const nested = after.length > 0
    wcLen++
    wildcards.push({
      before,
      beforeStr,
      after,
      nested
    })
  } else {
    o[strPath] = {
      path: path,
      val: null,
      precensored: false,
      circle: '',
      escPath: JSON.stringify(strPath),
      leadingBracket: leadingBracket
    }
  }
  return o
}, {})

Brief introductions to each function should be present:
https://github.com/davidmarkclements/fast-redact/blob/3bc8dbe3547542b22b44262067d7390d5efc37b8/lib/redactor.js

Explanations of when these functions get used and how:
https://github.com/davidmarkclements/fast-redact/blob/3bc8dbe3547542b22b44262067d7390d5efc37b8/lib/restorer.js

fast-redact seems to add significantly more than 1% overhead on top of JSON.stringify

It is mentioned in the Benchmarks section of the README file that:

In the direct calling case, fast-redact is ~30x faster than pino-noir, however a more realistic comparison is overhead on JSON.stringify.

For a static redaction case (no wildcards) pino-noir adds ~25% overhead on top of JSON.stringify whereas fast-redact adds ~1% overhead.

I attempted to measure the overhead firsthand. In this benchmark, fast-redact performed orders of magnitude slower than JSON.stringify (using the Default Usage example from the README file), which makes the overhead astronomical:

jsonStringify x 1,094,634 ops/sec ±0.74% (89 runs sampled)
fastRedactWithSerialization x 791 ops/sec ±5.98% (74 runs sampled)
fastRedactWithoutSerialization x 841 ops/sec ±4.94% (78 runs sampled)
Fastest is jsonStringify

The benchmarks were run on a MacBook Pro (15-inch, 2019) with the following specs:

  • Processor: 2.6 GHz 6-Core Intel Core i7
  • Memory: 16 GB 2400 MHz DDR4
  • Operating system: macOS Big Sur
  • Node.js version: 16.13.1 LTS

Could you please help me make sense of those results? You can find the benchmarking code here.

Thanks!

can't redact parameters that contains - character

Hi, I'm trying to use fast-redact to redact headers that come from an API gateway request.
In this request there's a header containing the customer API key, called x-api-key.
This code throws an exception in the validator:

const params = req.headers
const redact = fastRedact({
  paths: ['x-api-key'],
  serialize: false
});
const redacted = redact(params);

Throws this exception

ReferenceError: api is not defined
  at evalmachine.<anonymous>:4:17
    at evalmachine.<anonymous>:6:13
    at Script.runInContext (node:vm:139:12)
    at runInContext (node:vm:289:6)
    at /Users/tiziano_faion/BizAway/RiskAssessment/node_modules/bizlogger/node_modules/fast-redact/lib/validator.js:24:9

`createContext is not a function` error thrown in validator.js

Hi, I'm trying to use this package in a React app to redact sensitive tokens before logging the information to a remote server and have run into a problem. I'd like some help!

The error in the console is:

validator.js:34 Uncaught Error: fast-redact – Invalid path (headers.cookie)
    at validator.js:34
    at Array.forEach (<anonymous>)
    at validate (validator.js:14)
    at fastRedact (index.js:39)

However when I step into the code and look inside validator.js, the actual exception thrown is createContext is not a function

Right now I'm simply using the code provided in the Readme, but using an import instead of a require

import fastRedact from "fast-redact";
const redact = fastRedact({
    paths: ["headers.cookie", "headers.referer"],
});
const fauxRequest = {
    headers: {
        host: 'http://example.com',
        cookie: `oh oh we don't want this exposed in logs in etc.`,
        referer: `if we're cool maybe we'll even redact this`
    }
}
console.log("Redacted Payload", redact(fauxRequest));

Optional throw Error?

I'm using a redacted path like ['data.username']. fast-redact produces the following method signature for redaction:

(function(o
/*``*/) {

    if (typeof o !== 'object' || o == null) {
      throw Error('fast-redact: primitives cannot be redacted')
    }
    const { censor, secret } = this
    
      if (o.username != null) {
        const val = o.username
        if (val === censor) {
          secret["username"].precensored = true
        } else {
          secret["username"].val = val
          o.username = censor
          
      switch (true) {
        
      }
    
        }
      }
...

})

When using a library like axios, data can be either an object or a string when content type is application/json vs application/x-www-form-urlencoded. What are some thoughts about having a setting to just ignore redaction if an o is primitive vs throwing an error?

redaction is not working for wild cards after level 2

Issue: redaction works up to level 2 when we use wildcards like *.*.* but does not work beyond level 2.

Here is the code where I want to redact a field at level 3:

const fastRedact = require('fast-redact')
const redact = fastRedact({
  paths: [
    '*.*.entityPanNumber',
    '*.*.*.entityPanNumber'
  ],
  censor: '*****'
})

console.log(redact({ "1": { "2": { "entityPanNumber": "DEF" } } }))
console.log(redact({ "1": { "2": { "3": { "entityPanNumber": "DEF" } } }}))

The output is:

{"1":{"2":{"entityPanNumber":"*****"}}}
{"1":{"2":{"3":{"entityPanNumber":"DEF"}}}}

Redact fails when both the parent and child nodes are redacted

The following redaction pattern works on v3.1.2 and v3.2.0 but not on v3.3.0:

'use strict';

const fastRedact = require('fast-redact');

const thing = {
  prop: {
    top: {
      secret: 'top secret',
    },
  },
};

const redact = fastRedact({
  paths: ['*.*.secret', '*.top'],
  censor: '**SECRET**',
});

console.log(redact(thing));

On v3.3.0 it fails with the following error:

/Users/user/project/node_modules/fast-redact/lib/modifiers.js:54
    current[path[0]] = value
                     ^

TypeError: Cannot create property 'secret' on string '**SECRET**'
    at Object.nestedRestore (/Users/user/project/node_modules/fast-redact/lib/modifiers.js:54:22)
    at Object.eval (eval at compileRestore (/Users/user/project/node_modules/fast-redact/lib/restorer.js:15:20), <anonymous>:12:17)
    at Object.eval (eval at redactor (/Users/user/project/node_modules/fast-redact/lib/redactor.js:9:18), <anonymous>:24:10)
    at Object.<anonymous> (/Users/user/project/test.js:18:13)

The conditions to generate this failure seem to be:

  • You are redacting a node (e.g. top), but also one of its children (e.g. top.secret)
  • In the redaction paths, you specify the child redaction pattern before the parent redaction pattern (e.g ['*.*.secret', '*.top'] fails but ['*.top', '*.*.secret'] works)

Would appreciate some help to understand whether this is working as designed, a bug or known limitation.

Value of object in nested array gets overwritten to the array last index value

When providing a path that has a nested array, the values of all redacted keys in the array get overwritten with the value from the object at the array's last index.

const { inspect } = require("util");
const fastRedact = require("fast-redact");
const redact = fastRedact({
  paths: ["a[*].b[*].c"],
});

const obj = {
  a: [{ b: [{ c: 1 }, { c: 2 }, { c: 3 }] }],
};

console.log(redact(obj));
console.log(inspect(obj, undefined, null));

Output

{"a":[{"b":[{"c":"[REDACTED]"},{"c":"[REDACTED]"},{"c":"[REDACTED]"}]}]}
{
  a: [
    { b: [ { c: 3 }, { c: 3 }, { c: 3 } ] }
  ]
}

Expected result of c should be retained as 1, 2, 3 respectively.

Use user's serializer with `strict: false`

Looking at #15, it seems like perhaps a better fix would've been:

function strictImpl (strict) {
  return strict === true 
    ? `throw Error('fast-redact: primitives cannot be redacted')` 
    : `return this.serialize(o)`;
}

As long as the serializer is the default JSON.stringify, I believe the behavior above would match the behavior in PR #15. But, if the user has customized the serialize function to return something other than JSON, the assumption driving #15 (that the result should always be JSON) seems invalid. It would be more consistent imo to make the redacted output always go through serialize, with strict being orthogonal.

Thoughts?

Redaction is mutating objects in v3.4.0

When upgrading to v3.4.0, I noticed that redaction was causing the actual objects being redacted to be mutated, such that values are redacted beyond the scope of the logging.

To reproduce:

import pino from "pino";

const p = pino({
  level: process.env.LOG_LEVEL || "info",
  redact: [
    "items[*].name",
  ],
});

const toRedact = { items: [{ name: "John" }] };
const logger = p.child({ property: "value" });

logger.info(toRedact);

console.log('Printing response', toRedact);

logger.info(toRedact);

console.log('Printing response', toRedact);

This causes the final console.log() statement to print Printing response { items: [ { name: '[Redacted]' } ] }, showing that the actual value of name has been changed beyond the scope of pino logging. This behavior is not present in v3.3.0.

Multi-level wildcards redact things that should not be redacted

Hello, I wrote a test that shows what I mean.
Basically, if you provide a deep enough sequence of wildcards, a key only needs to match the last segment of the path to be redacted, even though the path also requires the segment before it to match.

test("Test with multiple levels of wildcards", ({ end, is }) => {
  const censor = "censored";
  const value = "value";

  const paths = [
    "a.x",
    "a.y",
    "*.a.x",
    "*.a.y",

    // These break it
    "*.*.a.x",
    "*.*.a.y",

    // These wont do it
    // "*.*.a.x2",
    // "*.*.a.y2"
  ];

  const redact = fastRedact({ paths, censor, serialize: false });
  const o = {
    a: {
      x: value,
      y: value,
    },
    b: {
      x: value,
      y: value,
    },
  };

  redact(o);
  is(o.a.x, censor);
  is(o.a.y, censor);
  is(o.b.x, value);
  is(o.b.y, value);
  redact.restore(o);
  is(o.a.x, value);
  is(o.a.y, value);
  is(o.b.x, value);
  is(o.b.y, value);
  end();
});

Wildcard key redaction breaks on strings

fast-redact is used in Pino to support redaction of log fields.

For example, we may want to redact a path that may contain PII:

const fastRedact = require('fast-redact');

const redact = fastRedact({ paths: ['pii.*'] });

const goodLog = { msg: 'this is fine', pii: { name: 'Jean Doe' } };

redact(goodLog);
// => { msg: 'this is fine', pii: { name: '[REDACTED]' } }

However, logs can be unstructured. What if the pii field is a string?

const fastRedact = require('fast-redact');

const redact = fastRedact({ paths: ['pii.*'] });

const log = { msg: 'this is fine', pii: '😈' };

redact(log);
// => TypeError: Cannot assign to read only property '0' of string '😈'

I think the desired behaviour here would be for fast-redact to pass through the string untouched rather than throwing a runtime error.

Publish ESM & CJS version of library

Given Node 18 fully supports ESM, and there may be ESM projects that want to consume this library, I thought I would raise this issue.

Happy to raise a PR with a vague sketch of how this might be achieved.

Support ** for arbitrary depth

Generally I want to redact **.authorization to make sure I don't leak credentials into logging, for any rare error cases where a request object might be buried in a field of an exception.

Multi-wildcard implementation broke paths passed to censor function

Hi! I love the new support for multiple wildcards in redaction paths. Unfortunately, when the PR for that feature was merged, it looks like it introduced some weird behavior.

As shown in this test and the one right below it, when a wildcard is used in a redaction path, and that path is exposed to the censor function, the prior contract was for the censor function to see a path based on what the wildcard actually matched. E.g., the path '*.b' would be passed to the censor function as ['a', 'b'] if the wildcard matched a key named 'a'.

With multiple wildcards, though, as shown in this test, the first wildcard in the path is exposed to the censor function using the actual key it matched, but subsequent wildcards are passed in as a literal '*'.

I implemented the original support for passing a path to the censor function, so I'd normally be open to fixing this ticket, but I really have no time right now. I've also completely forgotten everything I figured out about specialSet, and generally how all the pieces of this library fit together.

If @lukehedger is able to fix this, that would be amazing, since he's obviously touched the code more recently and wrote the multi-wildcard code. But, if he doesn't have time either, I figured I'd still open this issue just to document the problem.

Support multiple wildcards

Because only one wildcard per path is supported, there is no way to redact anything inside a doubly nested array.

E.g. in the following object, you can't redact anything underneath tokens unless you redact everything underneath tokens:

{
  creditcards: [
    {
      amount: 0,
      tokens: [
        {
          tokenValue: 'string',
          tokenType: 'string',
          cardType: 'string',
          expDate: 'string',
          cardHolderName: 'string',
          lastFour: 'string',
        },
      ],
    },
  ],
}

Unless I am missing something. Would be great to support.

Not redacting objects that have a toJSON function

When objects have a toJSON function redacting is not working:

const fastRedact = require('fast-redact');

const data = {
  a: 'asd',
  toJSON: () => ({
    a: 'well',
  }),
};

const s = fastRedact({
  paths: ['a'],
});

console.log(s(data));

Prints:

{"a":"well"}
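
A likely explanation is that JSON.stringify calls an object's toJSON method and serializes its return value instead of the object itself, so any in-place change to the object's own properties never reaches the output. The sketch below reproduces this without fast-redact by censoring the key manually:

```javascript
// Plain JavaScript, no fast-redact: simulate an in-place redaction.
const data = {
  a: 'asd',
  toJSON: () => ({ a: 'well' })
}

// Overwrite the property the way an in-place redactor would.
data.a = '[REDACTED]'

// JSON.stringify serializes the return value of toJSON, not `data` itself.
const serialized = JSON.stringify(data)
console.log(serialized) // {"a":"well"}
```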

Should restore properties matching multiple paths

The issue

Each path type (static, group, nested) uses an isolated way
of storing and restoring values on the original object.

In case of a property that matches multiple redaction paths,
this isolation leads to loss of information as it fails to restore the
original value on the original mutated object.

The issue can be reproduced with the following:

const redact = fastRedact({
  paths: ['a', '*.b', 'x.b'],
  serialize: false
})
const o = {
  x: {
    a: 'a',
    b: 'b'
  }
}
redact(o)

In this example x.b would match both the static x.b
path and the wildcard *.b path, resulting in the original
object being restored as:

{
  x: {
    a: 'a',
    b: '[REDACTED]'
  }
}

I prepared #34 that resolves this issue.
Haven't checked the performance impact yet but will do asap.

Support for deep-wildcard object redaction

I tried to set up fast-redact for redacting PII from logs. However, the logs are unstructured, and it's not obvious from the documentation that the package only supports wildcards across keys at a specific depth, not across arbitrary depths.
e.g.

import fastRedact from 'fast-redact';

describe('fastRedact', () => {
  test('deep wildcard redaction', () => {
    const redact = fastRedact({ paths: ['*.firstName'] });
    const obj = {
      x: { firstName: 'redactme' },
      y: { a: { firstName: 'redactme' } },
      z: { c: { h: { firstName: 'redactme' } } },
    };

    expect(redact(obj)).toStrictEqual(
      JSON.stringify({
        x: { firstName: '[REDACTED]' },
        y: { a: { firstName: '[REDACTED]' } }, // is 'redactme'
        z: { c: { h: { firstName: '[REDACTED]' } } }, // is 'redactme'
      }),
    );
  });
});

I can get the above to work if I add paths: ['*.firstName', '*.*.firstName', '*.*.*.firstName'], but as these are unstructured logs, I'll never know the depth of the object beforehand.

Have I missed something in the documentation to enable the depth wildcard traversal?
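
For reference, depth-independent redaction can be approximated outside of fast-redact with a recursive walk. The sketch below is a hypothetical helper (redactKeyDeep is not a fast-redact API). It returns a redacted copy rather than mutating the input, handles only plain objects and arrays, and will be considerably slower than compiled paths.

```javascript
// Hypothetical helper, not part of fast-redact: return a copy of `value`
// in which every property named `key` is replaced by `censor`, at any depth.
function redactKeyDeep (value, key, censor = '[REDACTED]', ancestors = new WeakSet()) {
  if (value === null || typeof value !== 'object') return value
  // Only the current chain of parents is tracked, so cycles are cut
  // while repeated (non-circular) references are still copied.
  if (ancestors.has(value)) return '[Circular]'
  ancestors.add(value)
  let out
  if (Array.isArray(value)) {
    out = value.map((item) => redactKeyDeep(item, key, censor, ancestors))
  } else {
    out = {}
    for (const [k, v] of Object.entries(value)) {
      out[k] = k === key ? censor : redactKeyDeep(v, key, censor, ancestors)
    }
  }
  ancestors.delete(value)
  return out
}

const logEntry = {
  x: { firstName: 'redactme' },
  y: { a: { firstName: 'redactme' } },
  z: { c: { h: { firstName: 'redactme' } } }
}
console.log(JSON.stringify(redactKeyDeep(logEntry, 'firstName')))
// {"x":{"firstName":"[REDACTED]"},"y":{"a":{"firstName":"[REDACTED]"}},"z":{"c":{"h":{"firstName":"[REDACTED]"}}}}
```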

`rx` needs comments

module.exports = /[^.[\]]+|\[(?:(-?\d+(?:\.\d+)?)((?:(?!\2)[^\\]|\\.)*?)\2)\]|(?=(\.|\[\])(?:\4|$))$/g

That needs to be thoroughly commented. My eyes cross and I black out for some indeterminate time when I see it.

Question: Redact dot notation paths

Hello, I want to redact Mongo queries in my logs, e.g.:

query: {
   'my.secret.token': 'cndqoncwqon',
}

I tried paths like paths: ["query['my.secret.token']"], but the validator complains:

undefined:9
       if (o.query && o.query['my && o.query['my.secret && o.query['my.secret.token'] != null) {
                                              ^^
 
 SyntaxError: Unexpected identifier

Any ideas?
Thanks in advance.

Original object is modified and not restored in any way

I want to be able to redact a key at multiple levels, so I wrote this code:

const fastRedact = require('fast-redact');
const paths = ['*.d', '*.*.d', '*.*.*.d'];

const redact = fastRedact({
  paths,
});

const obj = {
  x: { c: { d: 'hide me', e: 'leave me be' } },
  y: { c: { d: 'and me', f: 'I want to live' } },
  z: { c: { d: 'and also I', g: 'I want to run in a stream' } },
};

console.log(redact(obj));
console.log(obj);

The output is:

{"x":{"c":{"d":"[REDACTED]","e":"leave me be"}},"y":{"c":{"d":"[REDACTED]","f":"I want to live"}},"z":{"c":{"d":"[REDACTED]","g":"I want to run in a stream"}}}
{
  x: { c: { d: '[REDACTED]', e: 'leave me be' } },
  y: { c: { d: '[REDACTED]', f: 'I want to live' } },
  z: { c: { d: '[REDACTED]', g: 'I want to run in a stream' } }
}

You can see that the original object is modified.

Something seems to be wrong with the nestedRestore and nestedRedact functions.

This happens when I add wildcard paths for levels that do not exist in the object. It seems that nestedRedact, together with the specialSet function, doesn't build the store properly.
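Until the restore behaviour is fixed, one possible workaround is to hand the redactor a deep copy so that the caller's object cannot be affected. The sketch below is an assumption-laden illustration: redactCopy is a hypothetical wrapper, leakyRedact is a stand-in that mimics the reported behaviour rather than fast-redact itself, and structuredClone requires Node.js 17 or later.

```javascript
// Hypothetical wrapper: run any redact function on a deep copy so the
// caller's object is never touched, whatever the redactor does internally.
function redactCopy (redact, obj) {
  // structuredClone is a global in Node.js 17+ and modern browsers.
  return redact(structuredClone(obj))
}

// Stand-in for a redactor that mutates its input and fails to restore it.
const leakyRedact = (o) => {
  o.x.c.d = '[REDACTED]'
  return JSON.stringify(o)
}

const original = { x: { c: { d: 'hide me', e: 'leave me be' } } }
console.log(redactCopy(leakyRedact, original))
// {"x":{"c":{"d":"[REDACTED]","e":"leave me be"}}}
console.log(original.x.c.d)
// hide me
```

Cloning every object gives up much of the speed advantage of redacting in place, and structuredClone throws on values it cannot copy, such as functions, so this only suits plain data.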

Documentation and Case Sensitivity

One of the things that threw me for a loop while giving the redact feature a spin in pino was the fact that it's not documented whether or not the redaction paths are case sensitive.

It turns out that they are. That makes it extremely burdensome to construct a comprehensive list for redacting logs across a wide organization. Consider socialsecuritynumber and the variations possible:

socialsecuritynumber
socialSecuritynumber
socialsecurityNumber
socialSecurityNumber

The argument here might be "well, this is a code quality issue that should be caught or prevented," and that would be well justified. But across large orgs with varying teams and varying levels of skill, oversight, and caring, enforcing code quality isn't a realistic expectation.

First and foremost, the documentation needs to be updated to reflect the current state: paths are case-sensitive.

I'd love to see some consideration given to allowing for the search to be case-insensitive.
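To make the request concrete, here is a rough sketch of what case-insensitive key matching could look like. It is written in plain JavaScript and does not use or modify fast-redact; redactCaseInsensitive is a hypothetical helper that returns a redacted copy and assumes JSON-like data (plain objects and arrays, no cycles).

```javascript
// Hypothetical helper, independent of fast-redact: returns a redacted deep copy
// in which every key matching one of `names` (case-insensitively) is censored.
function redactCaseInsensitive (value, names, censor = '[REDACTED]') {
  const wanted = new Set(names.map((n) => n.toLowerCase()))
  const walk = (node) => {
    if (Array.isArray(node)) return node.map(walk)
    if (node === null || typeof node !== 'object') return node
    const out = {}
    for (const [key, child] of Object.entries(node)) {
      out[key] = wanted.has(key.toLowerCase()) ? censor : walk(child)
    }
    return out
  }
  return walk(value)
}

const log = {
  user: { socialSecurityNumber: '123-45-6789', name: 'Ada' },
  legacy: [{ SOCIALSECURITYNUMBER: '987-65-4321' }]
}
console.log(JSON.stringify(redactCaseInsensitive(log, ['socialsecuritynumber'])))
// {"user":{"socialSecurityNumber":"[REDACTED]","name":"Ada"},"legacy":[{"SOCIALSECURITYNUMBER":"[REDACTED]"}]}
```

A full traversal like this is much slower than compiled paths, which is presumably why the library matches exact keys; the sketch only illustrates the behaviour being asked for.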

Potential memory leak when using wildcard patterns

We've been using fast-redact to redact PII data from an event stream we get from a partner. These events are high volume and quite large (on the order of a few kilobytes). This service has been experiencing memory issues that (I think) I've traced back to fast-redact. Here is a minimal reproducible example:

import fastRedact from 'fast-redact';

const redactPaths = [ 'a', '*.a' ]

const event: Record<string, unknown> = {
  id: 1,
  a: {
    id: 2,
  },
};

const eventString = JSON.stringify(event)

const fr = fastRedact({
  paths: redactPaths,
  serialize: false,
});


fr({
  // MUST use a new object each time!
  payload: JSON.parse(eventString),
})
fr({
  // MUST use a new object each time!
  payload: JSON.parse(eventString),
})
fr({
  // MUST use a new object each time!
  payload: JSON.parse(eventString),
})
fr({
  // MUST use a new object each time!
  payload: JSON.parse(eventString),
})

Every call to the fr function accumulates a copy of the input object inside the this.secret object in the compiled Function against a key of '', the empty string. You can observe this by applying the following diff to this library at tag v3.3.0:

diff --git a/lib/redactor.js b/lib/redactor.js
index af58885..8f7d163 100644
--- a/lib/redactor.js
+++ b/lib/redactor.js
@@ -10,6 +10,7 @@ function redactor ({ secret, serialize, wcLen, strict, isCensorFct, censorFctTak
     if (typeof o !== 'object' || o == null) {
       ${strictImpl(strict, serialize)}
     }
+    console.log(["In generated function", this.secret['']])
     const { censor, secret } = this
     ${redactTmpl(secret, isCensorFct, censorFctTakesPath)}
     this.compileRestore()

Crudely, the above just dumps the contents of the this.secret[''] value, but it's enough to demonstrate that this array is appended to for every call to fr. Running the above reproducible example with the diff applied to the codebase yields the output:

[ 'In generated function', undefined ]
[
  'In generated function',
  [ { path: [Array], value: [Object], target: [Object] } ]
]
[
  'In generated function',
  [
    { path: [Array], value: [Object], target: [Object] },
    { path: [Array], value: [Object], target: [Object] }
  ]
]
[
  'In generated function',
  [
    { path: [Array], value: [Object], target: [Object] },
    { path: [Array], value: [Object], target: [Object] },
    { path: [Array], value: [Object], target: [Object] }
  ]
]

It's not clear what the fix is here. Is this the fault of the wildcard parser for having a beforeStr value of the empty string? Is it the fault of the compiled function that uses the wildcards? Or is it an issue with the specialSet function that actually appends to the array?

How to specify path containing lower level '-' character

If I have a structure:

{
  headers: {
    host: "xxx",
    "user-agent": "xxx",
    "x-authorization": "xxx"
  }
}

How do I specify a path that redacts headers.x-authorization?

Here's a test script I was using for something similar:

import * as F from 'fast-redact';
import { inspect } from 'util';

describe('Redact', () => {
    it('can redact', () => {
        const options: F.RedactOptions = {
            paths: ['authorization', 'abc."x-authorization"'],
            serialize: false
        };
        const original = {
            innocuous: 'public info',
            authorization: 'secret info',
            abc: {
                'x-authorization': 'another secret'
            }
        };
        const r = F(options);
        r(original);
        console.log(inspect(original, { depth: null }));
        expect(original.authorization).toEqual('[REDACTED]');
    });
});

I get:

    fast-redact – Invalid path (abc."x-authorization")

      15 |             }
      16 |         };
    > 17 |         const r = F(options);
         |                   ^

I tried other options - e.g. 'abc.x-authorization', but I couldn't come up with anything that worked.
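The README's default usage example redacts a hyphenated header with bracket notation (headers["X-Forwarded-For"]), which suggests that the same form, abc["x-authorization"], is what is needed here. The sketch below builds such paths from key segments. toRedactPath is a hypothetical helper, not part of fast-redact, and it assumes that segments contain no double quotes.

```javascript
// Hypothetical helper: join key segments into a path using bracket notation
// wherever a segment is not a plain identifier (for example, contains '-').
function toRedactPath (segments) {
  return segments
    .map((seg, i) => {
      if (/^[A-Za-z_$][\w$]*$/.test(seg)) return i === 0 ? seg : '.' + seg
      return '["' + seg + '"]'
    })
    .join('')
}

console.log(toRedactPath(['headers', 'x-authorization']))
// headers["x-authorization"]
console.log(toRedactPath(['abc', 'x-authorization']))
// abc["x-authorization"]
```

In the test script above, the paths option would then read ['authorization', 'abc["x-authorization"]'].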

"TypeError: Cannot set property 'xxx' of undefined" with null value

A TypeError: Cannot set property 'xxx' of undefined happens in fast-redact when a redact path starts with a wildcard and the object being redacted contains a property with a null value.

Here's a way to reproduce the error. "fr.js" contains this code:

const fastRedact = require('fast-redact')
const fauxRequest = {
  foo: null
}

const redact = fastRedact({ paths: ['*.password']})

console.log(redact(fauxRequest))

and this is what happens when the code is run:

$ node fr.js
.../node_modules/fast-redact/lib/modifiers.js:37
    target[key] = value
                ^

TypeError: Cannot set property 'password' of undefined
    at Object.nestedRestore (.../node_modules/fast-redact/lib/modifiers.js:37:17)
    at Object.eval (eval at compileRestore (.../node_modules/fast-redact/lib/restorer.js:16:20), <anonymous>:13:17)
    at Object.eval (eval at redactor (.../node_modules/fast-redact/lib/redactor.js:9:18), <anonymous>:24:10)
    at Object.<anonymous> (.../fr.js:8:13)
    at Module._compile (internal/modules/cjs/loader.js:654:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:665:10)
    at Module.load (internal/modules/cjs/loader.js:566:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:506:12)
    at Function.Module._load (internal/modules/cjs/loader.js:498:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:695:10)
