
JavaScript & Node.js open-source SAST scanner. A static analyser for detecting most common malicious patterns 🔬.

License: MIT License


js-x-ray's Introduction

@nodesecure/js-x-ray


JavaScript AST analysis. This package has been created to export the NodeSecure AST Analysis to enable better code evolution and allow better access to developers and researchers.

The goal is to quickly identify dangerous code and patterns for developers and security researchers. Interpreting the results of this tool still requires some security background.

Goals

The objective of the project is to detect all potentially suspicious JavaScript code. The target is, of course, code that has been added or injected for malicious purposes.

Most of the time, attackers will try to hide the behaviour of their code as much as possible to avoid being spotted or easily understood. The job of the library is to understand and analyze the patterns that allow us to detect malicious code.

Features Highlight

  • Retrieve required dependencies and files for Node.js.
  • Detect unsafe RegEx.
  • Get warnings when the AST analysis has a problem or is not able to follow a statement.
  • Highlight common attack patterns and API usages.
  • Follow the usage of dangerous Node.js globals.
  • Detect obfuscated code and, when possible, the tool that was used.

Getting Started

This package is available in the Node Package Repository and can be easily installed with npm or yarn.

$ npm i @nodesecure/js-x-ray
# or
$ yarn add @nodesecure/js-x-ray

Usage example

Create a local .js file with the following content:

try  {
    require("http");
}
catch (err) {
    // do nothing
}
const lib = "crypto";
require(lib);
require("util");
require(Buffer.from("6673", "hex").toString());

Then use js-x-ray to run an analysis of the JavaScript code:

import { AstAnalyser } from "@nodesecure/js-x-ray";
import { readFileSync } from "node:fs";

const scanner = new AstAnalyser();

const { warnings, dependencies } = await scanner.analyseFile(
  "./file.js"
);

console.log(dependencies);
console.dir(warnings, { depth: null });

The analysis will return: http (in try), crypto, util and fs. The last one is recovered by decoding the hex string "6673".

Tip

There are also many suspicious code examples in the ./examples directory. Feel free to try the tool on these files.

API

Warnings

This section describes how to use the warnings export.

type WarningName = "parsing-error"
| "encoded-literal"
| "unsafe-regex"
| "unsafe-stmt"
| "short-identifiers"
| "suspicious-literal"
| "suspicious-file"
| "obfuscated-code"
| "weak-crypto"
| "unsafe-import"
| "shady-link";

declare const warnings: Record<WarningName, {
  i18n: string;
  severity: "Information" | "Warning" | "Critical";
  experimental?: boolean;
}>;

The i18n token can be resolved into a translated message with the NodeSecure/i18n package.

import * as jsxray from "@nodesecure/js-x-ray";
import * as i18n from "@nodesecure/i18n";

console.log(i18n.getTokenSync(jsxray.warnings["parsing-error"].i18n));

Legends

This section describes all the possible warnings returned by JS-X-Ray.

name | experimental | description
parsing-error | ❌ | The AST parser threw an error.
unsafe-import | ❌ | Unable to follow an import (require, require.resolve) statement/expression.
unsafe-regex | ❌ | A RegEx has been detected as unsafe and may be used for a ReDoS attack.
unsafe-stmt | ❌ | Usage of a dangerous statement like eval() or Function("").
encoded-literal | ❌ | An encoded literal has been detected (it can be a hexadecimal value, a Unicode sequence or a base64 string).
short-identifiers | ❌ | All identifiers have an average length below 1.5.
suspicious-literal | ❌ | A suspicious literal has been found in the source code.
suspicious-file | ❌ | A suspicious file with more than ten encoded literals in it.
obfuscated-code | ✔️ | There is a very high probability that the code is obfuscated.
weak-crypto | ❌ | The code probably contains a weak crypto algorithm (md5, sha1...).
shady-link | ❌ | The code contains a shady/unsafe link.
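
To illustrate the short-identifiers rule, here is a minimal sketch of an average-length check. The helper name hasShortIdentifiers and its threshold parameter are hypothetical and not part of the js-x-ray API; the real probe may count identifiers differently.

```javascript
// Hypothetical helper, written only to illustrate the "average length below 1.5" rule.
function hasShortIdentifiers(identifiers, threshold = 1.5) {
  if (identifiers.length === 0) {
    return false;
  }

  // Sum the length of every identifier name, then compare the mean to the threshold.
  const totalLength = identifiers.reduce((sum, name) => sum + name.length, 0);

  return totalLength / identifiers.length < threshold;
}

console.log(hasShortIdentifiers(["a", "b", "c", "de"])); // true (average is 1.25)
console.log(hasShortIdentifiers(["request", "response"])); // false
```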

Workspaces

The repository hosts the following workspaces, each with its own documentation:

name | package
estree-ast-utils | @nodesecure/estree-ast-utils
sec-literal | @nodesecure/sec-literal
ts-source-parser | @nodesecure/ts-source-parser

These packages are available in the Node Package Repository and can be easily installed with npm or yarn.

$ npm i @nodesecure/estree-ast-utils
# or
$ yarn add @nodesecure/estree-ast-utils

Contributors ✨

Thanks goes to these wonderful people (emoji key):

  • Gentilhomme: 💻 📖 👀 🛡️ 🐛
  • Nicolas Hallaert: 📖
  • Antoine: 💻
  • Mathieu: 💻
  • Vincent Dhennin: 💻 ⚠️
  • Tony Gorez: 💻 📖 ⚠️
  • PierreD: ⚠️ 💻
  • Franck Hallaert: 💻
  • Maji: 💻
  • Michaël Zasso: 💻 🐛
  • Kouadio Fabrice Nguessan: 🚧 💻
  • Jean: ⚠️ 💻 📖
  • tchapacan: 💻 ⚠️
  • mkarkkainen: 💻
  • FredGuiou: 📖 💻
  • Madina: 💻
  • SairussDev: 💻
  • Abdou-Raouf ATARMLA: 💻

License

MIT

js-x-ray's Issues

Detect and throw warning for weak crypto hash algorithm

The goal of this task (issue) is to develop a new feature capable of detecting any usage of a weak hash algorithm such as md5.

For the sake of simplicity it is sufficient to look for the createHash method.

Example of code that should throw a new warning:

import crypto from "crypto";

crypto.createHash("md5");

We may have to answer a few questions for this issue:

  • Do we have to handle other APIs (like the WebCrypto API)? Maybe we could also deal with popular crypto libraries in some way?
  • Are there other algorithms that we consider "weak" besides md5? (I guess sha1 has to be considered weak too.)
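
As a rough sketch of the createHash lookup described above (not the project's actual probe), the following walks a hand-written ESTree-like object instead of a parsed file so that it needs no parser. The names walk, findWeakHashCalls and kWeakAlgorithms are hypothetical, and treating sha1 as weak is the assumption raised in the question above.

```javascript
// Assumption: md5 and sha1 are the algorithms treated as weak.
const kWeakAlgorithms = new Set(["md5", "sha1"]);

// Recursively yields every object that looks like an ESTree node.
function* walk(node) {
  if (node === null || typeof node !== "object") {
    return;
  }
  if (Array.isArray(node)) {
    for (const child of node) {
      yield* walk(child);
    }
    return;
  }
  if (typeof node.type === "string") {
    yield node;
  }
  for (const value of Object.values(node)) {
    yield* walk(value);
  }
}

// Returns the weak algorithm names passed as a literal to any *.createHash(...) call.
function findWeakHashCalls(ast) {
  const found = [];
  for (const node of walk(ast)) {
    if (node.type !== "CallExpression") {
      continue;
    }
    const { callee, arguments: args = [] } = node;
    const isCreateHash = callee?.type === "MemberExpression" &&
      callee.property?.type === "Identifier" &&
      callee.property.name === "createHash";
    const [first] = args;
    if (isCreateHash && first?.type === "Literal") {
      const algorithm = String(first.value).toLowerCase();
      if (kWeakAlgorithms.has(algorithm)) {
        found.push(algorithm);
      }
    }
  }

  return found;
}

// Hand-written AST for: crypto.createHash("md5");
const ast = {
  type: "ExpressionStatement",
  expression: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "Identifier", name: "crypto" },
      property: { type: "Identifier", name: "createHash" },
      computed: false
    },
    arguments: [{ type: "Literal", value: "md5" }]
  }
};

console.log(findWeakHashCalls(ast)); // [ 'md5' ]
```

This only covers a literal first argument; an algorithm name stored in a variable would need the kind of tracing the library already does for require.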

HTML Comment Parsing Error

Hey @NodeSecure!

We at @CybercentreCanada are loving all of the work that you've put into this tool!

A file that we came across recently in the wild has been causing JS-X-Ray to crash, unless we manually tweak the file content (CybercentreCanada/assemblyline-service-jsjaws#370). Obviously this is not ideal, and we would love to have the fix included in JS-X-Ray itself :)

I cannot share the entire file, but here is a screenshot of the initial HTML file:
[Screenshot: initial HTML file]

and a screenshot of the extracted JavaScript that is sent to JS-X-Ray:
[Screenshot: extracted JavaScript sent to JS-X-Ray]

There is an opening HTML comment prior to the obfuscated code, and this is ignored when emulated in Node, but crashes the JS-X-Ray tool when the Meriyah library attempts to parse it. Here is the crash log:

file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:182
    throw new ParseError(parser.index, parser.line, parser.column, type, ...params);
          ^

ParseError [SyntaxError]: [1:4]: Unexpected token
    at report (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:182:11)
    at skipSingleHTMLComment (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:708:9)
    at scanSingleToken (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:1825:41)
    at nextToken (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:1736:20)
    at parseModuleItemList (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:4836:5)
    at parseSource (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:4789:16)
    at Module.parseScript (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/meriyah/dist/meriyah.esm.mjs:8821:12)
    at parseScriptExtended (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/@nodesecure/js-x-ray/index.js:92:30)
    at runASTAnalysis (file:///home/<userpath>/assemblyline-service-jsjaws/tools/node_modules/@nodesecure/js-x-ray/index.js:29:16)
    at file:///home/<userpath>/assemblyline-service-jsjaws/tools/js-x-ray-run.js:17:22 {
  index: 4,
  line: 1,
  column: 4,
  description: '[1:4]: Unexpected token',
  loc: { line: 1, column: 4 }
}

When this opening comment is removed, JS-X-Ray works great, parses the file and identifies it as being obfuscated with Obfuscator.io.

Is there any way that HTML comments (<!--) could be removed prior to parsing? We found that closing HTML comments (-->) also cause Meriyah to crash, so it would be amazing if they could be ignored / removed once sent to JS-X-Ray.

Let me know what you think!

Kevin
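
One possible pre-processing step, sketched under the assumption that the markers should be treated the way browsers treat HTML-like comments in scripts: "<!--" comments out the rest of its line, and "-->" does the same when only whitespace precedes it. The function name stripHtmlLikeComments is hypothetical, and this is not necessarily how JS-X-Ray handles it.

```javascript
// Hypothetical helper: removes HTML-like comment markers before the source is parsed.
function stripHtmlLikeComments(source) {
  return source
    // "<!--" and everything after it on the same line
    .replace(/<!--.*$/gm, "")
    // "-->" and the rest of the line, only when nothing but whitespace precedes it
    .replace(/^[ \t]*-->.*$/gm, "");
}

const input = "<!--\nconst answer = 42;\n-->";

console.log(JSON.stringify(stripHtmlLikeComments(input)));
// "\nconst answer = 42;\n"
```

A regular expression cannot tell a comment marker from the same characters inside a string literal, so a sketch like this can alter valid code. Keeping line breaks in place at least leaves the reported line numbers unchanged.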

Rework SourceFile analysis strategy

Right now JS-X-Ray can only scan one source file at a time. The Scanner package is currently responsible for listing and iterating over all JavaScript files from a given npm tarball (for example).

What's the issue with that?

index.js
src/
  other.js
test/
  foobar.js

In the example above we will scan every file. But in reality there is a high probability that test/foobar.js will never be executed (and it will also be the biggest source of false positives).

My idea is to add a new strategy that will take entry files as input. We will then only scan files imported from these entry points.

Eventually, we could combine the two ways of doing things to ensure greater security while reducing false positives overall.
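
A minimal sketch of the entry-point strategy, assuming the imports of each file have already been extracted into a map. The function collectReachableFiles and the importGraph structure are hypothetical and only mirror the file tree above.

```javascript
// Breadth-first traversal: returns every file reachable from the given entry files.
function collectReachableFiles(entryFiles, importGraph) {
  const visited = new Set();
  const queue = [...entryFiles];

  while (queue.length > 0) {
    const file = queue.shift();
    if (visited.has(file)) {
      continue;
    }
    visited.add(file);

    // Files missing from the graph are treated as having no imports.
    for (const imported of importGraph.get(file) ?? []) {
      queue.push(imported);
    }
  }

  return [...visited];
}

// Hypothetical import graph for the example tree.
const importGraph = new Map([
  ["index.js", ["src/other.js"]],
  ["src/other.js", []],
  ["test/foobar.js", ["index.js"]]
]);

console.log(collectReachableFiles(["index.js"], importGraph));
// [ 'index.js', 'src/other.js' ]
```

Here test/foobar.js is never scanned because nothing reachable from index.js imports it.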

Add a severity property to warnings

As time goes by, I think that not all warnings are equivalent.

I want to add a new severity property for each warning:

  • Information (for things not related to security, with valid values or a lot of false positives).
  • Warning (the default).
  • Critical (Implies malicious code or intent).

Maybe the names are not right? Maybe we should call "Warning" "Alarm" instead?


I will classify the existing warnings this way:

  • Information: parsing-error, encoded-literal
  • Warning: ALL OTHERS
  • Critical: obfuscated-code
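
The classification above could be expressed as a small lookup with "Warning" as the fallback. This is only a sketch; kSeverityOverrides and getSeverity are hypothetical names.

```javascript
// Only the warnings that deviate from the default severity are listed.
const kSeverityOverrides = new Map([
  ["parsing-error", "Information"],
  ["encoded-literal", "Information"],
  ["obfuscated-code", "Critical"]
]);

// Every warning not listed above falls back to "Warning".
function getSeverity(warningName) {
  return kSeverityOverrides.get(warningName) ?? "Warning";
}

console.log(getSeverity("parsing-error")); // Information
console.log(getSeverity("unsafe-regex")); // Warning
console.log(getSeverity("obfuscated-code")); // Critical
```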

[WIP] Multiple-files / Multiple-steps analysis

The idea for the next major release is to find a way to build a new analysis workflow capable of storing information about multiple files and then iterating over them a second time.

A few things that could help:

  • Design and implement an option to specify when a probe should run (always, first iteration, others?).
  • Focus on solving #42

Add deprecation warning to runASTAnalysis and runASTAnalysisOnFile

Add a deprecation warning to our legacy APIs runASTAnalysis and runASTAnalysisOnFile to notify people still using them and push them to upgrade. We could easily do that using the Node.js process.emitWarning API.

An example message could be: "The runASTAnalysis API is deprecated and will be removed in v8. Please use the AstAnalyser class instead."
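
A minimal sketch of how this could look with process.emitWarning. The warning code JSXRAY_DEPRECATED and the stand-in function legacyRunASTAnalysis are made up for this example; only the message text comes from the proposal.

```javascript
// Builds the message suggested in this issue for a given legacy API name.
function deprecationMessage(apiName) {
  return `The ${apiName} API is deprecated and will be removed in v8. ` +
    "Please use the AstAnalyser class instead.";
}

function warnDeprecated(apiName) {
  process.emitWarning(deprecationMessage(apiName), {
    type: "DeprecationWarning",
    code: "JSXRAY_DEPRECATED" // hypothetical code, chosen for this example
  });
}

// Stand-in for a legacy entry point: it warns first, then would do its usual work.
function legacyRunASTAnalysis(source) {
  warnDeprecated("runASTAnalysis");

  return { source };
}

legacyRunASTAnalysis("const a = 1;");
```

Warnings of type DeprecationWarning are printed to stderr by default and can be silenced by the user with the --no-deprecation flag.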

Roadmap (critical features/issues)

Here is a roadmap of my ideas for future releases of JS-X-Ray (outside of fixing current issues).

  • Add new documentation about the architecture
  • #267
  • Improve the way to add new Dependency/Warning (we probably need a dedicated class, maybe with lazy probes). My idea is to allow probes to be more independent from the SourceFile implementation.

Fix morse detection

Morse detection is currently broken (and the test in obfuscated.spec.js is commented out).

Enhance exported Warnings CONSTANTS to include i18n code

Sub-issue related to: NodeSecure/ci#5 (please read it for initial context).

⚠️ This issue is blocked by: NodeSecure/i18n#17


The goal of this task is to enhance the current export of Warnings CONSTANTS.

js-x-ray/index.js

Lines 86 to 98 in f833d69

export const CONSTANTS = {
  Warnings: Object.freeze({
    parsingError: "ast-error",
    unsafeImport: "unsafe-import",
    unsafeRegex: "unsafe-regex",
    unsafeStmt: "unsafe-stmt",
    unsafeAssign: "unsafe-assign",
    encodedLiteral: "encoded-literal",
    shortIdentifiers: "short-identifiers",
    suspiciousLiteral: "suspicious-literal",
    obfuscatedCode: "obfuscated-code"
  })
};

This format does not match our needs, so we need to replace it with a new one. One idea could be:

unsafeImport: {
 code: "unsafe-import",
 i18n: "token"
}

The structure can be modified according to our needs.

Add anti-trojan detection

The goal of this task is to add a new warning to signal malicious characters in source code.

If you don't know what I'm talking about yet, here is some reading:

The verification itself is not that hard and can be inspired by the following code: https://github.com/lirantal/anti-trojan-source/blob/main/src/main.js#L4

Because the whole string is required for the analysis, I guess the code is going to be somewhere here: https://github.com/NodeSecure/js-x-ray/blob/master/index.js#L13
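
A minimal sketch of such a check, limited to the Unicode bidirectional control characters U+202A to U+202E and U+2066 to U+2069. That range is an assumption made for this example; the linked anti-trojan-source code may cover a different set. The function name is hypothetical.

```javascript
// Bidirectional control characters commonly used in "Trojan Source" style attacks.
// No "g" flag, so RegExp.prototype.test stays stateless between calls.
const kBidiControlChars = /[\u202A-\u202E\u2066-\u2069]/;

// Works on the raw source string, before any parsing.
function hasBidiControlCharacters(sourceText) {
  return kBidiControlChars.test(sourceText);
}

console.log(hasBidiControlCharacters("const access = 'user';")); // false
console.log(hasBidiControlCharacters("const access = 'user\u202E';")); // true
```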

Add a source property to Warnings

I would like to add a new source property to warnings to be able to identify who created the warning. Nowadays warnings can be generated by JS-X-Ray or by Scanner (take the invalid-semver warning).

The task consists of updating the TS interface and adding the source in the generateWarning function.

interface WarningDefault {
  kind: WarningName;
  file?: string;
  value: string;
  location: WarningLocation | WarningLocation[];
  i18n: string;
  severity: "Information" | "Warning" | "Critical";
  experimental?: boolean;
}

Inject custom probes in AstAnalyser class

The goal of the task is to allow package users (developers) to inject a new custom probe using the AstAnalyser class:

const { AstAnalyser, JsSourceParser } = require("@nodesecure/js-x-ray");

new AstAnalyser({
  parser: new JsSourceParser(),
  probes: [
    // Any valid probe object here
  ]
});

We probably need to instantiate the ProbeRunner class outside of SourceFile.

evaluate path.join for require?

The aim would be to successfully resolve dependency paths like in the following code example:

const bin = require(path.join('.', 'bin.js'))

For this given code we want:

  • Normalized dependency with ./bin.js path (normalized using UNIX path)
  • No unsafe-import warning

Identifiers like __dirname may also appear:

const bin = require(path.join(__dirname, 'bin.js'))

The challenge here is to compute the location of the file to dynamically inject the __dirname value (Maybe to handle in a second evolution).
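
For the first case (string literals only), a minimal sketch could evaluate the call statically with path.posix. The helper evaluateStaticPathJoin is hypothetical; it receives the already-extracted argument values and returns null as soon as one of them is not a plain string, which is where a __dirname identifier would end up.

```javascript
const path = require("node:path");

// Hypothetical helper: statically evaluates the arguments of a path.join(...) call.
function evaluateStaticPathJoin(segments) {
  if (!segments.every((segment) => typeof segment === "string")) {
    // A non-literal segment (for example __dirname) cannot be resolved here.
    return null;
  }

  // Normalize to UNIX separators before joining.
  const joined = path.posix.join(
    ...segments.map((segment) => segment.replaceAll("\\", "/"))
  );

  // path.posix.join drops a leading "./", so restore it for relative paths.
  const startsRelative = segments[0]?.startsWith(".") ?? false;
  if (startsRelative && !joined.startsWith(".")) {
    return `./${joined}`;
  }

  return joined;
}

console.log(evaluateStaticPathJoin([".", "bin.js"])); // ./bin.js
console.log(evaluateStaticPathJoin([undefined, "bin.js"])); // null
```

The "./" prefix is only restored when the first segment was relative, so a call such as path.join("lodash", "fp") still resolves to the package path lodash/fp.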

Implement a new way to count and store occurrences of Nodes

The current SourceFile class implements a bunch of counters, plus identifiersName, which keeps track of all kinds of declaration identifiers across the file.

varkinds = { var: 0, let: 0, const: 0 };
idtypes = { assignExpr: 0, property: 0, variableDeclarator: 0, functionDeclaration: 0 };
counter = {
  identifiers: 0,
  doubleUnaryArray: 0,
  computedMemberExpr: 0,
  memberExpr: 0,
  deepBinaryExpr: 0,
  encodedArrayValue: 0
};

Today, these "counters" are updated in probes that are often used for no other purpose.

function main(mainNode, options) {
  const { sourceFile } = options;
  sourceFile.varkinds[mainNode.kind]++;
  for (const node of mainNode.declarations) {
    sourceFile.idtypes.variableDeclarator++;
    for (const { name } of getVariableDeclarationIdentifiers(node.id)) {
      sourceFile.identifiersName.push({ name, type: "variableDeclarator" });
    }
  }
}

Or

function main(node, options) {
  const { sourceFile } = options;
  kIdExtractor(
    ({ name }) => sourceFile.identifiersName.push({ name, type: "params" }),
    node.params
  );
  if (node.id === null || node.id.type !== "Identifier") {
    return;
  }
  sourceFile.idtypes.functionDeclaration++;
  sourceFile.identifiersName.push({ name: node.id.name, type: "functionDeclaration" });
}

For me, these should not be managed by probes but by another mechanism/abstraction

const isIdentifier = (node) => node !== null && node.type === "Identifier";

const nc = new NodeCounter("FunctionDeclaration", (node) => isIdentifier(node.id));
console.log(nc.type); // FunctionDeclaration

nc.walk(astNode);

console.log(nc.nodes); // integer

Then we could probably create a CounterAggregator or something like that to get an Object at the end.

But we still need a way to also manage identifiersName.
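
To make the proposal more concrete, here is a minimal sketch of what NodeCounter and an aggregation step could look like. Neither exists in the codebase yet; the walking logic, the aggregate function and the hand-written program object are assumptions for illustration.

```javascript
// Counts the nodes of one ESTree type that also satisfy an optional filter.
class NodeCounter {
  nodes = 0;

  constructor(type, filter = () => true) {
    this.type = type;
    this.filter = filter;
  }

  // Recursively visits plain objects and arrays.
  walk(node) {
    if (node === null || typeof node !== "object") {
      return;
    }
    if (Array.isArray(node)) {
      node.forEach((child) => this.walk(child));
      return;
    }
    if (node.type === this.type && this.filter(node)) {
      this.nodes++;
    }
    Object.values(node).forEach((child) => this.walk(child));
  }
}

// Possible "CounterAggregator": turns a list of counters into one plain object.
function aggregate(counters) {
  return Object.fromEntries(counters.map((counter) => [counter.type, counter.nodes]));
}

const isIdentifier = (node) => node?.type === "Identifier";

// Hand-written AST with one named function, one anonymous function and one variable declaration.
const program = {
  type: "Program",
  body: [
    { type: "FunctionDeclaration", id: { type: "Identifier", name: "foo" }, params: [] },
    { type: "FunctionDeclaration", id: null, params: [] },
    { type: "VariableDeclaration", kind: "const", declarations: [] }
  ]
};

const counters = [
  new NodeCounter("FunctionDeclaration", (node) => isIdentifier(node.id)),
  new NodeCounter("VariableDeclaration")
];
for (const counter of counters) {
  counter.walk(program);
}

console.log(aggregate(counters)); // { FunctionDeclaration: 1, VariableDeclaration: 1 }
```

Each counter walks the tree on its own here, which keeps the sketch short; a real implementation would more likely feed every counter from a single shared traversal.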

SyntaxError: Illegal return statement

A static analysis of the following code: https://unpkg.com/[email protected]/bin/which.js throws ParseError [SyntaxError]: [15:8]: Illegal return statement

Here is a minimal reproduction:

const argv = process.argv.slice(2);

function foobar() {
  console.log("foobar");
}

if (!argv.length) {
  return foobar();
}

This code is legal in Node.js CJS because the module is wrapped in a function (I think).
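
The effect of the wrapper can be checked with node:vm, which compiles code without running it. This is only a demonstration of why the return is legal in CJS, not a proposed fix for the parser; the helper isValidScript is hypothetical, and the wrapper string follows the module wrapper described in the Node.js documentation.

```javascript
const vm = require("node:vm");

// Returns true when the code compiles as a script. The code is never executed.
function isValidScript(code) {
  try {
    new vm.Script(code);

    return true;
  }
  catch (error) {
    if (error.name === "SyntaxError") {
      return false;
    }
    throw error;
  }
}

const source = `
const argv = process.argv.slice(2);
if (!argv.length) {
  return;
}
`;

// Same shape as the function Node.js wraps around every CommonJS module.
const wrapped =
  `(function (exports, require, module, __filename, __dirname) {${source}\n});`;

console.log(isValidScript(source)); // false (top-level return is illegal in a script)
console.log(isValidScript(wrapped)); // true
```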

Remove mockedFunction for Node.js test runner mock fn

Use the Node.js test runner mock.fn instead of the current mockedFunction (we did that because mock was not available at the time).

export function mockedFunction() {
  return {
    called: 0,
    args: [],
    at(position) {
      return this.args[position];
    },
    haveBeenCalledTimes(count = 0) {
      return this.called === count;
    },
    haveBeenCalledWith(value) {
      return this.args.includes(value);
    },
    callback(...args) {
      this.args.push(...args);
      this.called++;
    }
  };
}
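
For comparison, a minimal sketch of the same bookkeeping with mock.fn from node:test (this needs a Node.js version that ships the test runner mocks). The variable name callback is arbitrary.

```javascript
const { mock } = require("node:test");

// mock.fn() returns a function that records every call it receives.
const callback = mock.fn();

callback("foo");
callback("bar", "baz");

// Equivalent of the old "called" counter.
console.log(callback.mock.calls.length); // 2

// Arguments are stored per call instead of in one flat array.
console.log(callback.mock.calls[0].arguments); // [ 'foo' ]
console.log(callback.mock.calls[1].arguments); // [ 'bar', 'baz' ]
```

Note the difference with mockedFunction: the old helper flattened all arguments into a single args array, so tests relying on at(position) would need to be adapted to the per-call structure.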

[AstAnalyser] Add initialize and finalize callback to manipulate SourceFile

The current AstAnalyser has a flaw when we want to manipulate the SourceFile class before or after walking the AST. Here is an example where we add new custom tracing:

import { AstAnalyser } from "@nodesecure/js-x-ray";

const scanner = new AstAnalyser();

scanner.analyse("const foo = 'bar';", {
 initialize(sourceFile) {
   sourceFile.tracer.trace("...");
 },
 finalize(sourceFile) {
   // do something
 }
});

Those callbacks should be triggered before and after walking the AST.

New npmjs.com Release?

Hey NodeSecure,

We at the @CybercentreCanada love JS-X-Ray and have it tightly integrated into our JavaScript analysis service JsJaws. However, it looks like there is work being done on this project that has not been released to npmjs.com? The release version there is 3.2.0 but it looks like you are on v5.1.0. Any chance you can update this on npmjs.com?

🇨🇦

Customizable SourceParser

The goal of the task is to be able to provide a custom SourceParser to the runASTAnalysis function (if not provided it will take the default JsSourceParser for example).

The idea behind that is to allow anyone to extend/add a new Parsing mechanism (to support TypeScript source for example).

In my mind I see two built-in classes:

  • SourceParser (with the current constructor code)

constructor(source, options = {}) {
  if (typeof source !== "string") {
    throw new TypeError("source must be a string");
  }
  const { removeHTMLComments = false } = options;
  this.raw = source;
  /**
   * if the file start with a shebang then we remove it because meriyah.parseScript fail to parse it.
   * @example
   * #!/usr/bin/env node
   */
  const rawNoShebang = source.charAt(0) === "#" ?
    source.slice(source.indexOf("\n") + 1) : source;
  this.source = removeHTMLComments ?
    this.#removeHTMLComment(rawNoShebang) : rawNoShebang;
}

  • JsSourceParser extending SourceParser (with the current parseScript code). The default parser for JS-X-Ray.

https://github.com/NodeSecure/js-x-ray/blob/master/src/SourceParser.js#L50

If someone wants to implement their own, it would look like this:

import { readFileSync } from "node:fs";
import { SourceParser, runASTAnalysis } from "@nodesecure/js-x-ray";
import { parse } from '@typescript-eslint/typescript-estree';

export class TsSourceParser extends SourceParser {
  parseScript() {
    const ast = parse(this.source, {});

    return ast;
  }
}

const { warnings, dependencies } = runASTAnalysis(
  readFileSync("./file.ts", "utf-8"),
  {
    sourceParser: TsSourceParser
  }
);

Detect shady links

My idea for this task is to implement a new warning responsible for detecting shady links in Literals. I have been inspired by one of the detections of guarddog from DataDog.

They use the following RegEx:
(http[s]?:\/\/.*\.(link|xyz|tk|ml|ga|cf|gq|pw|top|club|mw|bd|ke|am|sbs|date|quest|cd|bid|cd|ws|icu|cam|uno|email|stream))

This RegEx detects URLs such as:

https://foobar.xyz

The idea is to create a new probe or to update isLiteral probe to add that new warning. The probe will execute on every ESTree Literal.

const maliciousUrl = "https://foobar.xyz";

Maybe we need to conduct additional research on the subject (maybe there is some study we may want to read to improve the detection?).
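
A minimal sketch of the check, reusing the RegEx quoted above unchanged. The names kShadyLinkRegExp and isShadyLink are hypothetical; in a probe, the function would receive the value of each ESTree Literal.

```javascript
// RegEx quoted above, without the "g" flag so that test() keeps no state between calls.
const kShadyLinkRegExp = /(http[s]?:\/\/.*\.(link|xyz|tk|ml|ga|cf|gq|pw|top|club|mw|bd|ke|am|sbs|date|quest|cd|bid|cd|ws|icu|cam|uno|email|stream))/;

// Only string literals can contain a URL; everything else is ignored.
function isShadyLink(value) {
  return typeof value === "string" && kShadyLinkRegExp.test(value);
}

console.log(isShadyLink("https://foobar.xyz")); // true
console.log(isShadyLink("https://nodejs.org")); // false
console.log(isShadyLink(42)); // false
```

Because the pattern does not anchor the suffix to the end of the hostname, it also matches a listed suffix that appears in a path or subdomain (for example https://nodejs.org/file.xyz), which is one more source of false positives to consider.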

[AstAnalyser API] Implement analyseFileSync

The idea is to implement a synchronous version of analyseFile to simplify scenarios where we don't need asynchronous I/O. Recent API updates would otherwise force some sync APIs to switch to async (requiring ESM and top-level await).
