This project forked from guybedford/es-module-lexer

Low-overhead lexer dedicated to ES module parsing for fast analysis

License: MIT License

ES Module Lexer

A JS module syntax lexer used in es-module-shims.

Outputs the list of exports and locations of import specifiers, including dynamic import and import meta handling.

A very small single JS file (4KiB gzipped) that includes inlined Web Assembly for very fast source analysis of ECMAScript module syntax only.

As an example of the performance, Angular 1 (720KiB) is fully parsed in 5ms, whereas Acorn, the fastest JS parser, takes over 100ms.

Comprehensively handles the JS language grammar while remaining small and fast: ~10ms per MB of JS cold and ~5ms per MB of JS warm. See the benchmarks for more info.

Usage

npm install es-module-lexer

For use in CommonJS:

const { init, parse } = require('es-module-lexer');

(async () => {
  // either await init, or call parse asynchronously
  // this is necessary for the Web Assembly boot
  await init;

  const [imports, exports] = parse('export var p = 5');
  exports[0] === 'p';
})();

An ES module version is also available:

import { init, parse } from 'es-module-lexer';

(async () => {
  await init;

  const source = `
    import { name } from 'mod';
    import json from './json.json' assert { type: 'json' }
    export var p = 5;
    export function q () {

    };

    // Comments provided to demonstrate edge cases
    import /*comment!*/ ('asdf', { assert: { type: 'json' }});
    import /*comment!*/.meta.asdf;
  `;

  const [imports, exports] = parse(source, 'optional-sourcename');

  // Returns "mod"
  imports[0].n
  source.substring(imports[0].s, imports[0].e);
  // "s" = start
  // "e" = end

  // Returns "import { name } from 'mod'"
  source.substring(imports[0].ss, imports[0].se);
  // "ss" = statement start
  // "se" = statement end

  // Returns "{ type: 'json' }"
  source.substring(imports[1].a, imports[1].se);
  // "a" = assert

  // Returns "p,q"
  exports.toString();

  // Dynamic imports are indicated by imports[2].d > -1
  // In this case the "d" index is the start of the dynamic import
  // Returns true
  imports[2].d > -1;

  // Returns "asdf"
  imports[2].n
  // Returns "'asdf'"
  source.substring(imports[2].s, imports[2].e);
  // Returns "import /*comment!*/ ("
  source.substring(imports[2].d, imports[2].s);
  // Returns "import /*comment!*/ ('asdf', { assert: { type: 'json' }})"
  source.substring(imports[2].d, imports[2].se + 1);
  // Returns "{ assert: { type: 'json' }}"
  source.substring(imports[2].a, imports[2].se);
  // ss is the same as d
  // as, ae not used for dynamic imports

  // import.meta is indicated by imports[3].d === -2
  // Returns true
  imports[3].d === -2;
  // Returns "import /*comment!*/.meta"
  source.substring(imports[3].s, imports[3].e);
})();
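
One common use of the "s" and "e" offsets is rewriting static import specifiers in place. The sketch below is not part of es-module-lexer: rewriteSpecifiers and mapSpecifier are hypothetical names, and the records are hand-written stand-ins for what parse would return, so the snippet runs without the library installed.

```javascript
// Hypothetical helper: replaces each static specifier span [s, e) in `source`
// with mapSpecifier(originalText). Assumes `imports` is ordered by position.
function rewriteSpecifiers(source, imports, mapSpecifier) {
  let out = '';
  let last = 0;
  for (const { s, e, d } of imports) {
    // Only static imports (d === -1) are rewritten; dynamic imports (d > -1)
    // and import.meta (d === -2) are left untouched in this sketch.
    if (d !== -1) continue;
    out += source.slice(last, s) + mapSpecifier(source.slice(s, e));
    last = e;
  }
  return out + source.slice(last);
}

const source = "import { name } from 'mod';\nimport x from './x.js';";

// Hand-written records standing in for the first return value of parse(source).
const imports = [
  { s: source.indexOf('mod'), e: source.indexOf('mod') + 3, d: -1 },
  { s: source.indexOf('./x.js'), e: source.indexOf('./x.js') + 6, d: -1 },
];

// Illustrative mapping only: bare specifiers get an assumed URL-style path.
const rewritten = rewriteSpecifiers(source, imports, (spec) =>
  spec.startsWith('.') ? spec : `/node_modules/${spec}/index.js`
);
console.log(rewritten);
```

Copying unchanged text between spans, rather than editing the string repeatedly, keeps the original offsets valid for every record.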

CSP asm.js Build

The default version of the library uses Wasm and (safe) eval for performance and a minimal footprint.

Neither of these represents a security escalation possibility, since there are no execution string injection vectors, but they can still violate existing CSP policies for applications.

For a version that works with CSP eval disabled, use the es-module-lexer/js build:

import { parse } from 'es-module-lexer/js';

Instead of Web Assembly, this uses an asm.js build which is almost as fast as the Wasm version (see benchmarks below).

Escape Sequences

To handle escape sequences in specifier strings, the .n field of imported specifiers will be provided where possible.

For dynamic import expressions, this field will be empty if the argument is not a valid JS string.
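
To make the distinction concrete: the raw text between "s" and "e" keeps any escape sequence as written, while .n holds the unescaped name. The following is a minimal sketch, assuming a hand-written record rather than real parse output; specifierName is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: prefer the unescaped `n` field and fall back to the raw
// source text when `n` is missing (for example, a dynamic import whose
// argument is not a plain string).
function specifierName(source, imp) {
  return imp.n != null ? imp.n : source.slice(imp.s, imp.e);
}

// String.raw keeps the backslash so the module source really contains \u0061.
const source = String.raw`import './\u0061.js';`;
const start = source.indexOf('./');

// Hand-written record; real records come from parse().
const imp = { s: start, e: source.length - 2, n: './a.js' };

console.log(source.slice(imp.s, imp.e)); // raw text, escape sequence intact
console.log(specifierName(source, imp)); // unescaped name
```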

Facade Detection

Facade modules that only use import / export syntax can be detected via the third return value:

const [,, facade] = parse(`
  export * from 'external';
  import * as ns from 'external2';
  export { a as b } from 'external3';
  export { ns };
`);
facade === true;

Environment Support

Node.js 10+, and all browsers with Web Assembly support.

Grammar Support

  • Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
  • Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
  • Always correctly parses valid JS source, but may parse invalid JS source without errors.
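
The division / regex ambiguity can be illustrated with a toy check on the character preceding a slash. This is a deliberately simplified sketch and not the library's implementation: slashStartsRegex is a hypothetical name, and the function ignores comments, keywords such as return, and the closing brace or paren cases that require the backtracking described above.

```javascript
// Toy heuristic only: decide whether the "/" at `index` starts a regex literal
// by looking at the previous non-whitespace character.
function slashStartsRegex(source, index) {
  let i = index - 1;
  while (i >= 0 && /\s/.test(source[i])) i--;
  if (i < 0) return true; // start of input: nothing to divide

  const prev = source[i];
  // After an identifier, number, string, ")" or "]" an operand has just ended,
  // so this sketch treats "/" as division. Cases like `if (x) /re/.test(y)`
  // are misclassified here; handling them needs paren backtracking.
  if (/[\w$)\]'"`]/.test(prev)) return false;

  // After any other punctuator an operand is expected, so "/" opens a regex.
  return true;
}

console.log(slashStartsRegex('a = b / c', 6));        // false
console.log(slashStartsRegex('a = /re/.test(b)', 4)); // true
```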

Limitations

The lexing approach is designed to deal with the full language grammar including RegEx / division operator ambiguity through backtracking and paren / brace tracking.

The only limitation to the reduced parser is that the "exports" list may not correctly gather all export identifiers in the following edge cases:

// Only "a" is detected as an export, "q" isn't
export var a = 'asdf', q = z;

// "b" is not detected as an export
export var { a: b } = asdf;

The above cases are handled gracefully: the lexer keeps going without error, but it does not detect the missing export names.

Benchmarks

Benchmarks can be run with npm run bench.

Current results for a high spec machine:

Wasm Build

Module load time
> 5ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 20ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 2.12ms
test/samples/angular.min.js (188 KiB)
> 1ms
test/samples/d3.js (508 KiB)
> 3.04ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 4.04ms
test/samples/rollup.min.js (429 KiB)
> 2.16ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 14.4ms

JS Build (asm.js)

Module load time
> 2ms
Cold Run, All Samples
test/samples/*.js (3123 KiB)
> 35ms

Warm Runs (average of 25 runs)
test/samples/angular.js (739 KiB)
> 3ms
test/samples/angular.min.js (188 KiB)
> 1.08ms
test/samples/d3.js (508 KiB)
> 3.04ms
test/samples/d3.min.js (274 KiB)
> 2ms
test/samples/magic-string.js (35 KiB)
> 0ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (929 KiB)
> 5.04ms
test/samples/rollup.min.js (429 KiB)
> 3ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3123 KiB)
> 17ms
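
For a rough comparison with the per-MB figures quoted near the top, the all-samples totals can be converted to time per MiB. This is only arithmetic on the numbers listed above; msPerMiB is a hypothetical helper name, and results will differ on other machines.

```javascript
// Figures copied from the benchmark output above (all samples, 3123 KiB).
const totalKiB = 3123;
const runsMs = {
  'wasm cold': 20,
  'wasm warm': 14.4,
  'asm.js cold': 35,
  'asm.js warm': 17,
};

// Convert a total time in milliseconds into milliseconds per MiB of source.
function msPerMiB(ms, kib) {
  return ms / (kib / 1024);
}

for (const [name, ms] of Object.entries(runsMs)) {
  console.log(`${name}: ${msPerMiB(ms, totalKiB).toFixed(2)} ms/MiB`);
}
```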

Building

To build, download the WASI SDK 12.0 from https://github.com/WebAssembly/wasi-sdk/releases/tag/wasi-sdk-12.

The Makefile assumes the existence of "wasi-sdk-12.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via make lib/lexer.wasm, which can also be triggered via npm run build:wasm to create dist/lexer.js.

After the Web Assembly build, the CJS build can be triggered via npm run build.

License

MIT
