
Jalangi

We encourage you to switch to Jalangi2 available at https://github.com/Samsung/jalangi2. Jalangi2 is a framework for writing dynamic analyses for JavaScript. Jalangi2 does not support the record/replay feature of Jalangi1. Jalangi1 is still available from this website, but we no longer plan to develop it.

Introduction

Jalangi is a framework for writing heavyweight dynamic analyses for JavaScript. Jalangi provides two modes for dynamic program analysis: an online mode (a.k.a. direct or in-browser analysis mode) and an offline mode (a.k.a. record-replay analysis mode). In both modes, Jalangi instruments the program-under-analysis to insert callbacks to methods defined in Jalangi. An analysis writer implements these methods to perform a custom dynamic program analysis. In the online mode, Jalangi performs the analysis during the execution of the program. An analysis in online mode can use shadow memory to attach meta-information to every memory location. The offline mode of Jalangi incorporates two key techniques: 1) selective record-replay, a technique which enables recording and faithfully replaying a user-selected part of the program, and 2) shadow values and shadow execution, which enable easy implementation of heavyweight dynamic analyses. Shadow values allow an analysis to attach meta-information to every value. In the distribution you will find several analyses:

  • concolic testing,
  • an analysis to track origins of nulls and undefined,
  • an analysis to infer likely types of object fields and functions,
  • an analysis to profile object allocation and usage,
  • a simple form of taint analysis,
  • an experimental pure symbolic execution engine (currently undocumented)
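The shadow-value idea described above can be illustrated with a small standalone sketch. This is a conceptual illustration only, not Jalangi's actual ConcolicValue implementation; the class name, fields, and the taint-tracking example are assumptions made for the sketch.

```javascript
// Conceptual sketch of shadow values: each wrapped value carries the
// concrete value the program computes plus arbitrary analysis metadata.
// NOT Jalangi's implementation; all names here are illustrative.
class ShadowValue {
  constructor(concrete, meta) {
    this.concrete = concrete; // the value the program actually uses
    this.meta = meta;         // analysis-specific metadata (e.g. taint, origin)
  }
}

// An instrumented binary operation would unwrap the operands, compute
// the concrete result, and let the analysis combine the metadata.
function shadowAdd(a, b) {
  const ca = a instanceof ShadowValue ? a.concrete : a;
  const cb = b instanceof ShadowValue ? b.concrete : b;
  const ta = a instanceof ShadowValue && !!a.meta.tainted;
  const tb = b instanceof ShadowValue && !!b.meta.tainted;
  // e.g. a taint analysis marks the result tainted if either operand is
  return new ShadowValue(ca + cb, { tainted: ta || tb });
}

const secret = new ShadowValue(42, { tainted: true });
const result = shadowAdd(secret, 8);
console.log(result.concrete, result.meta.tainted); // 50 true
```

The instrumented program computes with `result.concrete`, while the analysis propagates `result.meta` alongside it; this is the essence of shadow execution.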

A very old demo of Jalangi integrated with the Tizen IDE is available at http://srl.cs.berkeley.edu/~ksen/jalangi.html. Note that the IDE plugin is not open-source. Slides describing the internals of Jalangi are available at http://srl.cs.berkeley.edu/~ksen/slides/jalangi-jstools13.pdf and our first paper on Jalangi is available at http://srl.cs.berkeley.edu/~ksen/papers/jalangi.pdf.

Requirements

We tested Jalangi on Mac OS X 10.8 with the Chromium browser. Jalangi should work on Mac OS X 10.7, Ubuntu 11.0 and higher, and Windows 7 or higher. Jalangi will NOT work with IE.

  • Latest version of Node.js available at http://nodejs.org/. We have tested Jalangi with Node v0.10.25.
  • Sun's JDK 1.6 or higher. We have tested Jalangi with Java 1.6.0_43.
  • Command-line git.
  • libgmp (http://gmplib.org/) is required by cvc3. Concolic testing uses cvc3 and automaton.jar for constraint solving. The installation script checks if cvc3 and automaton.jar are installed properly.
  • Chrome browser if you need to test web apps.
  • Python (http://python.org), version 2.7 or higher.

On Windows you need the following extra dependencies:

  • Install Microsoft Visual Studio 2010 (the free Express version is fine).
  • If on 64-bit, also install the Windows 7 64-bit SDK.

If you have a fresh installation of Ubuntu, you can install all the requirements by invoking the following commands from a terminal.

sudo apt-get update
sudo apt-get install python-software-properties python g++ make
sudo add-apt-repository ppa:chris-lea/node.js
sudo apt-get update
sudo apt-get install nodejs
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer
sudo update-java-alternatives -s java-7-oracle
sudo apt-get install git
sudo apt-get install libgmp10
sudo apt-get install chromium-browser

Installation

python ./scripts/install.py

If installation succeeds, you should see the following message:

---> Installation successful.
---> run 'npm test' to make sure all tests pass

A Lubuntu virtual machine with a pre-installed (very old) version of Jalangi can be downloaded from http://srl.cs.berkeley.edu/~ksen/jalangi4.zip. You need VirtualBox, available at https://www.virtualbox.org/, to run the virtual machine. The login and password for the jalangi account on the machine are jalangi and jalangi, respectively. Open a terminal, go to the jalangi directory, and try ./scripts/testsym.

Run Tests

Check whether record and replay executions produce the same output on some unit tests located under tests/unit/.

./node_modules/.bin/mocha --reporter spec node_test/unitTests.js

The above runs Jalangi both with no analysis and with a trivial analysis that wraps all values. To run the same tests over the SunSpider benchmarks, use the following command:

./node_modules/.bin/mocha --reporter spec node_test/sunspiderTests.js

Run the concolic testing tests:

python ./scripts/sym.py

To run the entire test suite, simply run:

npm test

Run Browser Tests

Some automated tests can be run in the browser, using Selenium. To run these tests, first install the relevant dependencies:

sudo python scripts/install-dev.py

(sudo is needed to install the Python Selenium bindings.) Then, to run all browser tests, do:

python scripts/runbrowsertests.py

You should see Chrome windows opening and closing as the tests are run.

What's new?

Introducing analysis2.js, a new API for performing direct or in-browser analysis. It is cleaner, more efficient, and less error-prone than analysis.js. You can find more documentation in docs/analysis2.md.
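The general shape of a callback-based analysis can be sketched as follows. The callback name `binary` and its signature below are illustrative assumptions, not the documented analysis2.js API; see docs/analysis2.md for the actual callback names and parameters.

```javascript
// Sketch of the shape of a callback-based dynamic analysis: the
// instrumented program invokes analysis callbacks at key operations,
// and the analysis accumulates information across the execution.
// Callback names here are assumptions; see docs/analysis2.md.
const counts = Object.create(null);

const analysis = {
  // hypothetically called after every binary operation
  binary: function (op, left, right, result) {
    counts[op] = (counts[op] || 0) + 1; // count operator occurrences
    return result;                      // pass the value through unchanged
  }
};

// Simulate what instrumented code would do for: (1 + 2) * 3
const sum = analysis.binary('+', 1, 2, 1 + 2);
const prod = analysis.binary('*', sum, 3, sum * 3);
console.log(prod, counts); // 9 { '+': 1, '*': 1 }
```

The key design point is that the program's own computation is unchanged (the callback returns `result`), while the analysis observes every operation.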

Other Scripts

Run the likely type inference analysis on the SunSpider benchmarks located under tests/sunspider1/.

python scripts/testsp_likelytype.py

Run the tracker of origins of null and undefined on the SunSpider benchmarks located under tests/sunspider1/.

python scripts/testsp_tracknull.py

Run a simple heap profiler on the SunSpider benchmarks located under tests/sunspider1/.

python scripts/testsp_heapprofiling.py

Record an execution of tests/unit/qsort.js and create jalangi_trace.html, which, when loaded in a browser, replays the execution.

./scripts/browserReplay tests/unit/qsort; path-to-chrome-browser jalangi_trace.html

Concolic testing

To perform concolic testing of some JavaScript code present in a file, say testme.js, insert the following four lines at the top of the file:

if (typeof window === "undefined") {
    require('../../src/js/InputManager');
    require(process.cwd()+'/inputs');
}

In the code, use J$.readInput(arg) to indicate the inputs to the program. Then run the following command to perform concolic testing:

python scripts/jalangi.py concolic -i 100000 testme

The -i argument bounds the total number of test inputs. The command generates a set of input files in the directory jalangi_tmp; the input files start with the prefix jalangi_inputs. Once the inputs are generated, you can run testme.js on those inputs with the following command:

 python scripts/jalangi.py rerunall testme
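A minimal input-annotated file might look like the sketch below. The `J$` stub at the top exists only so the sketch runs standalone; under Jalangi, the four-line header shown earlier is used instead, and J$.readInput supplies the generated inputs. The function body and the default input are made up for illustration.

```javascript
// Minimal sketch of a file prepared for concolic testing.
// The stub stands in for Jalangi's runtime so this sketch is
// self-contained; under Jalangi, J$.readInput supplies inputs
// generated by the constraint solver.
var J$ = { readInput: function (dflt) { return dflt; } };

function testme(n) {
  // concolic testing should generate inputs driving both branches
  if (n > 10) {
    return "big";
  }
  return "small";
}

var n = J$.readInput(0); // mark n as a program input
console.log(testme(n));  // "small" with the stub's default input
```

With real concolic testing, the branch condition `n > 10` would be recorded symbolically, and the solver would produce an input (e.g. some n greater than 10) covering the other branch.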

For example, open the file tests/unit/qsort.js and check how inputs are specified. Then run

 python scripts/jalangi.py concolic tests/unit/qsort 100
 python scripts/jalangi.py rerunall tests/unit/qsort

Open the file tests/unit/regex8.js and check how string inputs are specified. Then run

 python scripts/jalangi.py concolic tests/unit/regex8 100
 python scripts/jalangi.py rerunall tests/unit/regex8

Dynamic analysis

The JavaScript code in src/js/analyses/objectalloc/ObjectAllocationTrackerEngine.js implements a simple analysis that reports the number of objects created during an execution along with some auxiliary information. The analysis can be performed on a file testme.js by invoking the following command:

python scripts/jalangi.py analyze -a src/js/analyses/objectalloc/ObjectAllocationTrackerEngine testme

For example, try running the analysis on a SunSpider benchmark by issuing the following command:

python scripts/jalangi.py analyze -a src/js/analyses/objectalloc/ObjectAllocationTrackerEngine tests/sunspider1/crypto-aes

Similarly, you can run the likely type inference analysis on another SunSpider benchmark by issuing the following command; you will notice some warnings.

python scripts/jalangi.py analyze -a src/js/analyses/likelytype/LikelyTypeInferEngine tests/sunspider1/crypto-sha1

Run the following to perform a simple form of taint analysis.

python scripts/jalangi.py analyze -a src/js/analyses/simpletaint/SimpleTaintEngine tests/sunspider1/crypto-sha1

You can run the origin-of-null-and-undefined tracker on a toy example by issuing the following command:

python scripts/jalangi.py analyze -a src/js/analyses/trackundefinednull/UndefinedNullTrackingEngine tests/unit/track_undef_null

Chaining Dynamic Analyses

python scripts/jalangi.py direct --analysis src/js/analyses/ChainedAnalyses.js --analysis src/js/analyses/dlint/UndefinedOffset.js --analysis src/js/analyses/dlint/ShadowProtoProperty.js tests/unit/dlint1

Record and replay a web application.


Jalangi provides a script for instrumenting a locally stored web application, by instrumenting all scripts it discovers on disk. Here is how to instrument the annex app using this script. First, run the instrument.js script to instrument the app:

node src/js/commands/instrument.js --outputDir /tmp tests/tizen/annex

This creates an instrumented copy of annex in /tmp/annex. To see other options for instrument.js, run it with the -h option.

Then, launch the Jalangi server and the HTML page by running

killall node
python scripts/jalangi.py rrserver file:///tmp/annex/index.html

You can now play the game for some time. Try two moves. This will generate a jalangi_trace1 file in the current directory. To ensure the trace is completely flushed, press Alt+Shift+T in the browser, and then close the browser window. You can run a dynamic analysis on the trace file by issuing the following commands (note that this differs slightly from above, due to the need to copy the trace):

cp jalangi_trace1 /tmp/annex
node src/js/commands/replay.js --tracefile /tmp/annex/jalangi_trace1 --analysis src/js/analyses/objectalloc/ObjectAllocationTrackerEngine

Record and replay using the proxy server.


Jalangi also provides a proxy server to instrument code from live web sites. Here is how to instrument the annex app from above using the proxy server.

First, start an HTTP server by running the following command, which starts a simple Python-based HTTP server:

python scripts/jalangi.py server &

Then, launch the combined proxy and Jalangi record-replay server.

node src/js/commands/jalangi_proxy.js

You will see output like the following:

writing output to /tmp/instScripts/site0
listening on port 8501
Fri Dec 13 2013 16:02:34 GMT-0800 (PST) Server is listening on port 8080

The proxy server is listening on port 8501, and the record-replay server on port 8080. Instrumented scripts and the trace file will be written to /tmp/instScripts/site0.

Now, configure your browser to use the proxy server. The procedure varies by operating system and browser. For browsers on Mac OS X, you can set the proxy server for the Wi-Fi network adapter with the following command:

sudo networksetup -setwebproxy Wi-Fi 127.0.0.1 8501 off

To stop using the proxy, run sudo networksetup -setwebproxystate Wi-Fi off.

Now, open Chrome and navigate to http://127.0.0.1:8181/tests/tizen/annex/index.html (not index_jalangi_.html). You can now play the game for some time. Try two moves. This will generate a jalangi_trace1 file in the output directory /tmp/instScripts/site0. To ensure the trace is completely flushed, press Alt+Shift+T in the browser, and then close the browser window. Once you are done playing, kill the proxy server process to complete the dumping of certain metadata.

Now, you can run a dynamic analysis on the trace file by issuing the following commands.

node src/js/commands/replay.js --tracefile /tmp/instScripts/site0/jalangi_trace1 --analysis src/js/analyses/objectalloc/ObjectAllocationTrackerEngine

Further examples of record and replay


node src/js/commands/instrument.js --outputDir /tmp tests/tizen/calculator

killall node
python scripts/jalangi.py rrserver file:///tmp/calculator/index.html

cp jalangi_trace1 /tmp/calculator
node src/js/commands/replay.js --tracefile /tmp/calculator/jalangi_trace1 --analysis src/js/analyses/likelytype/LikelyTypeInferEngine

node src/js/commands/instrument.js --outputDir /tmp tests/tizen/go

killall node
python scripts/jalangi.py rrserver file:///tmp/go/index.html

cp jalangi_trace1 /tmp/go
node src/js/commands/replay.js --tracefile /tmp/go/jalangi_trace1 --analysis src/js/analyses/likelytype/LikelyTypeInferEngine

In-browser analysis of a web application.


Jalangi allows running an analysis that does not use ConcolicValue in a browser. Here is how to instrument the annex app with an in-browser analysis. First, run the instrument.js script to instrument the app:

node src/js/commands/instrument.js --inbrowser --smemory --analysis src/js/analyses/logNaN/LogNaN.js --outputDir /tmp tests/tizen/annex

This creates an instrumented copy of annex in /tmp/annex. To see other options for instrument.js, run it with the -h option.

Then, open the HTML page in a browser (tested on Chrome) by running

open file:///tmp/annex/index.html

You can now play the game for some time. Try two moves and see the console output after pressing Shift-Alt-T. In the in-browser mode, one must not use ConcolicValue to wrap a program value; however, one can use shadow execution to collect statistics. Shadow memory is supported in the in-browser mode. The shadow memory library can be accessed in an analysis via J$.Globals.smemory. smemory.getShadowObject(obj) returns the shadow object associated with obj if the type of obj is "object" or "function". smemory.getFrame(varName) returns the frame that contains the variable named varName.
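The idea behind getShadowObject can be illustrated with a small standalone sketch: associate an analysis-owned shadow object with every program object without touching the program object itself. This is a conceptual sketch of what such an API provides, not Jalangi's implementation; the allocation-site metadata is a made-up example.

```javascript
// Conceptual sketch of shadow memory for objects. A WeakMap keys
// shadow objects by program object, so shadow state is garbage
// collected together with the objects it describes.
const shadows = new WeakMap();

function getShadowObject(obj) {
  if ((typeof obj === 'object' && obj !== null) || typeof obj === 'function') {
    if (!shadows.has(obj)) {
      shadows.set(obj, {}); // fresh shadow object per program object
    }
    return shadows.get(obj);
  }
  return undefined; // primitives have no shadow object
}

const o = {};
getShadowObject(o).allocSite = 'game.js:42'; // attach analysis metadata
console.log(getShadowObject(o).allocSite);   // 'game.js:42'
console.log(getShadowObject(5));             // undefined
```

A heap profiler like the object allocation tracker can use exactly this pattern to remember, per object, where and when it was allocated.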

The following shows how to run the object allocation tracker analysis on the annex game. After playing the game for some time, press Shift-Alt-T to print the analysis results on the console.

node src/js/commands/instrument.js --inbrowser --smemory --analysis src/js/analyses/objectalloc/ObjectAllocationTrackerEngineIB.js --outputDir /tmp tests/tizen/annex
open file:///tmp/annex/index.html

The following shows how to run the likely type inference analysis on the annex game. After playing the game for some time, press Shift-Alt-T to print the analysis results on the console.

node src/js/commands/instrument.js --inbrowser --smemory --analysis analyses/likelytype/LikelyTypeInferEngineIB.js --outputDir /tmp tests/tizen/annex
open file:///tmp/annex/index.html

Contributors

jacksongl, ksen007, michaelpradel, msridhar, swaroopks

jalangi's Issues

incorrect replay of type coercion of Date object

We currently get replay failures for the following test:

var x = new Date();
setTimeout(function () {
  var z = new Date() - x;
  console.log(String(z));
}, 1000);  

The failure depends on whether z happens to have the same value during record and replay. I think z may differ since we are not capturing the semantics of the - operator on Date objects. The test is checked in as tests/unit/date-conversion.js.

Determining when a statement is done executing

Some analyses needs to know when execution of one statement ends and another begins, but the current set of analysis callbacks does not provide this information.

Would it be possible to enhance Jalangi to also provide a statementEnd callback which gets triggered as a marker at statement boundaries? This added information could be optional, controlled by a switch to the instrumenter.

Properties move down the prototype chain on replay

function G(x) {}
G.prototype.p = function f() {}
var y = new G()
y.p()
console.log(y)

Run normally this snippet prints "{}".

If a trace is recorded and replayed with the NOP analysis, "{ p: [Function: f] }" is printed at replay.

I haven't had time to investigate very deep, but it appears that the p property is somehow moved from the prototype to the object itself.

add a "sanity mode" with extra checks for client analysis?

It would be nice to have a mode in which Jalangi does various sanity checks on the result of a client analysis. For example, for the getField() callback, we could check that the analysis only returns undefined if the actual value in the field is undefined. A failed sanity check doesn't necessarily indicate an analysis bug, but it would be a strong indicator. We could implement sanity mode as a wrapper analysis that delegates to the client analysis and adds the checks.

call and apply

function foo() {}
foo.call()

When replayed, this triggers one invokeFun callback for the invocation of call. However, the invocation of foo does not trigger one, as it happens as an effect of call. An analysis that tracks function calls would miss that invocation (but would still see a functionExit callback when leaving foo).

The analysis can of course handle this manually by checking whether the function being invoked is call or apply and acting accordingly. However, it seems cleaner for this to happen at the Jalangi level instead of in every analysis that deals with function calls.

Improve performance of esnstrument

Right now, instrumenting code is a bit slow. I've put the v8 profiling output for instrumenting pdf.js here. One concerning thing is that ~20% of the time is spent in GC; I'd have to dig in further to see why.

Otherwise, I don't see any quick fixes. The major change I can think of would be to not transform the AST and use escodegen, but instead generate the instrumented code string directly during the AST pass. Not sure how nightmare-ish this would be, though.

Other thoughts welcome :-)

handle ++ on strings in normalization

Consider the following code:

var j = "0";
var k = j++;
console.log(j);
console.log(k);
var l = "1";
var m = ++l;
console.log(l);
console.log(m);

When run under node, the output is:

1
0
2
2

But after normalization, we get:

01
0
11
11

Gotta love JavaScript.

path deviation loading jQuery 2.0.2 during replay

I'm getting a path deviation during replay for a simple file that just loads jQuery 2.0.2. The example is on the html-tests branch, under tests/html/jquery-2.0.2/justload. If you instrument tests/html/jquery-2.0.2/jquery-2.0.2.js, load tests/html/jquery-2.0.2/justload/index_jalangi_.html in Chrome, and then try to replay, you should see it. I tried to reproduce under node.js and jsdom but unfortunately I didn't see the error there. Not sure how to minimize for this one.

move symbolic analysis to separate repository

We should move the symbolic analysis code in Jalangi to a separate git repository. This will make it easier to install Jalangi for those not doing symbolic analysis, since auxiliary tools like cvc3 won't need to be installed. Plus, it will force us to clean up the design a little bit.

No such file or directory: node_modules/escodegen/escodegen.browser.js

When I run python ./scripts/install.py, I get this error. I tried running it on both Ubuntu and Windows 7, but it still doesn't work. Also, in the node_modules/escodegen folder I only see the file escodegen.js; there is no escodegen.browser.js.
Please help me.

preservation of ReferenceErrors when reading undefined global variables

We should think about preserving the ReferenceError that is thrown when reading an undefined variable. Right now, under Jalangi instrumentation, the error is no longer thrown. See existing unit test tests/unit/reference_error.js. We should also handle such errors thrown by a call to eval, e.g.:

var str = "{x: y}";
var indirect = eval;
indirect(str);

(The above is checked in as tests/unit/eval_undefined_var.js.) Not preserving ReferenceErrors is causing a jQuery unit test to fail when jQuery is Jalangi-instrumented.

problem in analyses/puresymbolic

Running both Single2 and Multiple on the following test case generates two test cases:

function foo(input){
    if(input[2] === 'r') {
        1;
    } else {
        2;
    }
}
foo();

The first test case is correct (it takes the else branch):

J$.setCurrentSolutionIndex([]);
J$.setCurrentSolution({"x2":"","x2__length":0});
J$.setInput("x2","");

But the second test case is wrong (it is supposed to take the then branch):

J$.setCurrentSolutionIndex([]);
J$.setCurrentSolution({"x10":"r","x10__length":1,"x10__0":114});
J$.setInput("x10","r");

window.location not recorded

Record-replay fails for the following browser script:

console.log(window.location);

During replay under node, {} is printed instead of the location observed during record. The script is in tests/html/unit/window_location.js.

problematic assignment operation added by wrapReadWithUndefinedCheck

A website does not work after transformation. It turns out that this is because the wrapReadWithUndefinedCheck function adds a variable = variable operation during transformation. For example, the following statement:

postArgMessage;

will be transformed into:

J$.I(typeof postArgMessage === 'undefined' ? postArgMessage = J$.R(5, 'postArgMessage', undefined, true) : postArgMessage = J$.R(5, 'postArgMessage', postArgMessage, true)

which assigns the return value (its own value, if the analysis code does not modify it) to the variable. That can be problematic sometimes; for example, executing:

location = location;

in the frontend means reloading the webpage. (This bug has been fixed.)
But there are still other special global objects that should not be assigned to themselves. For example, in an HTML5 web worker (a frontend multithreading environment), executing:

self = self;

will cause a 'setting a property that has only a getter' exception.

Maybe this is not a serious bug, as web workers are not supposed to be supported by Jalangi. But there might be other special objects that trigger errors when assigned to themselves, which could potentially cause more bugs that are hard to diagnose.

JacksonGL

Failure due to monkey patching?

I've been looking into #6 , in particular being able to record and replay just loading MooTools. It doesn't work now, and I think it's due to monkey patching. I've checked in a test, tests/html/unit/mootools_reduced_1.js. If you run the following command you can see the replay failure:

python scripts/jalangi.py testrr_browser tests/html/unit/mootools_reduced_1

The code is a bit complex, but it looks to me like Array.prototype.slice is being monkey-patched. I tried quickly fixing this in analysis.js but failed.

issue with setters and mix of instrumented and uninstrumented code

There is a bug in replay involving setters and a mix of instrumented and uninstrumented code. To reproduce, consider the following three files:

a.js:

exports.foo = function (x,y) {
  x.g = y;
}

b.js:

var a = require('./a');

exports.baz = {
  set p(y) {
    a.foo(this,y);
  }
}

c.js:

var b = require('./b');

var baz = b.baz;
baz.p = 7;

console.log(baz.g);

Now, say that a.js and c.js are instrumented, but not b.js. In order to make this work, I instrument a.js and c.js using esnstrument.js, and then create the following file b_mod_.js:

var a = require('./a_jalangi_');

exports.baz = {
  set p(y) {
    a.foo(this,y);
  }
}

I then hack c_jalangi_.js to pass ./b_mod_ to the require call instead of ./b. Anyway, with this setup, during replay, I get a path deviation:

Error: Path deviation at record = [5,2,21,12,4] iid = 105 index = 11
    at checkPath (/Users/m.sridharan/git-repos/jalangi/src/js/RecordReplayEngine.js:305:27)
    at RR_L (/Users/m.sridharan/git-repos/jalangi/src/js/RecordReplayEngine.js:645:17)
    at RR_R (/Users/m.sridharan/git-repos/jalangi/src/js/RecordReplayEngine.js:509:53)
    at Object.R (/Users/m.sridharan/git-repos/jalangi/src/js/analysis.js:731:36)
    at Object.<anonymous> (/Users/m.sridharan/test/c_jalangi_.js:10:124)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)

I think somehow we're not considering the possibility that a putfield can invoke a native function via a setter, which in turn invokes instrumented code. Sorry for the tricky steps to reproduce; I don't have time to make it nicer right now.

Shadowing 'arguments' causes crash

function __func(arguments){
     arguments;
};

console.log(__func())

Expected output is simply "undefined" but the instrumented version of the above crashes with the following exception when run:

[TypeError: Cannot read property 'callee' of undefined]
TypeError: Cannot read property 'callee' of undefined
    at __func (/home/simonhj/src/jalangi/scratch_jalangi_.js:17:51)
    at invokeFun (/home/simonhj/src/jalangi/src/js/analysis.js:499:33)
    at /home/simonhj/src/jalangi/src/js/analysis.js:590:24
    at Object.<anonymous> (/home/simonhj/src/jalangi/scratch_jalangi_.js:34:224)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)

The culprit is the following line in the instrumented version:

J$.Fe(9, arguments.callee, this);

A quick Google search revealed no other robust way to get the currently executing function, so we can probably ignore this issue as a corner case for now.

Separate instrumentation of source files

It'd be good to be able to separately instrument source files, so that if one file changes, we don't need to re-instrument everything else. The key issue is maintaining unique IIDs for client analyses. There are a couple possible approaches:

  1. change the kind of IIDs generated by the instrumenter to include the current file name. E.g., it could be a concatenated string of the containing file name and the integer IDs we currently generate.
  2. leave the instrumenter as is, but somehow have the replay analysis modify the IIDs passed to client analyses so that they are unique across files.

The tricky thing with option 2 is figuring out what application script is currently executing. I think one could do it by creating an Error object and parsing the stack trace, but it could affect performance (I think we'd need to do it at least at every function entry, to detect when the invoked function is in a different script.) The downside of option 1 is that it could bloat the instrumented code size.

Our best thought so far is a hybrid approach:

  1. Change the instrumenter to invoke some setCurrentFile() callback (name appropriately shortened) at any point that the currently executing JS file may have changed. These points include script entry, function entry, and the top of a catch block (any others?).
  2. Change analysis.js to keep track of the current file based on the above callbacks, and then appropriately combine the current file with the original IID into a new unique IID that gets passed to the client analysis.

Of course, we'll have to change IIDInfo.js to parse multiple sourcemap files, know about the IIDs generated by analysis.js, etc.

Given that the above change would be a moderate amount of work, I will probably put it aside until it's more urgent, unless someone has an idea of a much simpler approach to the problem; suggestions welcome.
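The concatenation in option 1 above can be sketched in a few lines. The `file + ':' + iid` format and the function names here are assumptions for illustration, not a committed design:

```javascript
// Sketch of option 1: make IIDs globally unique by combining the
// containing file name with the per-file integer IID. The separator
// and function names are illustrative assumptions.
function makeGlobalIID(fileName, iid) {
  return fileName + ':' + iid;
}

// A client analysis (or IIDInfo.js) can recover both parts when needed.
function splitGlobalIID(globalIID) {
  const idx = globalIID.lastIndexOf(':');
  return {
    file: globalIID.slice(0, idx),
    iid: Number(globalIID.slice(idx + 1))
  };
}

const gid = makeGlobalIID('src/app.js', 105);
console.log(gid);                      // 'src/app.js:105'
console.log(splitGlobalIID(gid).iid);  // 105
```

This makes the code-size cost of option 1 concrete: every IID in the instrumented output grows by roughly the length of the file name, which is why the hybrid setCurrentFile() approach, paying that cost only at file boundaries, may be preferable.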

make install.py incremental

I don't see a need for install.py to blow away all the files you already have each time it runs; we should make it install only the things you don't have. This is helpful, e.g., in cases where the install fails half-way through. Currently, if you re-run, it starts from scratch.

only record trace once per regression test

Right now in our regression test suite, we do record-replay analysis of many benchmarks twice, once with the "none" replay analysis and also with the track-all-values analysis. For each analysis, we instrument and record from scratch. Instead, we should re-use the trace for both replay analyses.

non-unique IIDs in the presence of eval

When instrumenting eval'd code, Jalangi generates instrumented code with IIDs that can conflict with the surrounding instrumented script. As a simple example:

var f = function foo() {
  return eval("3+4+5+6+7");
}
console.log(f());

When the above code is instrumented, IIDs 13, 17, and 21 are used for certain AST nodes. If I print the result of instrumenting the eval'd code during the record phase, I get:

J$.B(18, '+', J$.B(14, '+', J$.B(10, '+', J$.B(6, '+', J$.T(5, 3, 22), J$.T(9, 4, 22)), J$.T(13, 5, 22)), J$.T(17, 6, 22)), J$.T(21, 7, 22));

As you can see, the IIDs appear again. This can cause problems for analyses that rely on IIDs to be unique.

It seems to me that a simple solution would be to read in the generated sourcemap from instrumentation at record / replay time, and start out the IIDs for any eval'd code at a higher number than the highest IID in the sourcemap. @ksen007 what do you think?

Change how IID map is written by esnstrument.js

Right now, the IID map file is written line-by-line; see here. Instead, we should probably separate the logic collecting IID map data from the writing of the file and the file format. This would be a temporary, simpler change, in lieu of fixing #9.

top-level expressions wrong for function nested in object literal

Consider the following code:

var x = {
  foo: function() {
    fizz();
  }
};

The top-level expressions here are both the object literal expression assigned to x and the call to fizz() inside the function assigned to the foo property. Currently Jalangi only reports the outer object literal expression as top-level, however. I think we need some extra logic for nested functions.

I've added a test in node_test/topLevelExprTests.js for this. To run, uncomment lines 69--71, and then run ./node_modules/.bin/mocha --reporter spec node_test/topLevelExprTests.js.

putField and putFieldPre are passed uncoerced object

In the property lookup below, the object used for the lookup is coerced to a string by calling toString internally in the interpreter; thus, the actual field name used is '[object Object]':

var o = {}
o[{}] = {}

The callbacks putFieldPre and putField are passed the original object {}, not the string.

problem in concolic testing

I get an exception after running the command: python scripts/jalangi.py concolic -i 100000 tests/fail_case

code of fail_case.js:

function foo(input){
    if(input[2] === 'r') {
        1;
    } else {
        2;
    }
}
foo();

exception I got:

---- Instrumenting ../tests/multiex/fail_case ----
Instrumenting ../tests/multiex/fail_case.js ...
==== Input 0 ====
---- Recording execution of tests/multiex/fail_case ----
fail_case_jalangi_.js
TypeError: Cannot read property '2' of undefined
    at Object.G (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/src/js/analysis.js:499:33)
    at foo (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/tests/multiex/fail_case_jalangi_.js:13:59)
    at invokeFun (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/src/js/analysis.js:451:37)
    at /Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/src/js/analysis.js:570:28
    at Object.<anonymous> (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/tests/multiex/fail_case_jalangi_.js:28:57)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)

---- Replaying tests/multiex/fail_case ----
./analyses/concolic/SymbolicEngine
TypeError: Cannot read property '2' of undefined
    at Object.G (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/src/js/analysis.js:499:33)
    at foo (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/tests/multiex/fail_case_jalangi_.js:13:59)
    at invokeFun (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/src/js/analysis.js:451:37)
    at /Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/src/js/analysis.js:570:28
    at Object.<anonymous> (/Users/jacksongl/macos-workspace/research/jalangi/github_multiex/repository/jalangi/tests/multiex/fail_case_jalangi_.js:28:57)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)
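For comparison, the uninstrumented program throws the same TypeError, since foo() is called with no arguments and input is therefore undefined. A minimal check:

```javascript
// Same fail_case.js code, run without instrumentation:
function foo(input) {
    if (input[2] === 'r') { 1; } else { 2; }
}

var threw = false;
try {
    foo();  // input is undefined
} catch (e) {
    threw = e instanceof TypeError;  // "Cannot read property '2' of undefined"
}
console.log(threw); // true
```

So the question for concolic testing is how to handle a seed input that already crashes, rather than a Jalangi-introduced error.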

can't handle typescript compiler

I added the TypeScript compiler benchmark as tests/octane/typescript.js, but Jalangi crashes during record. Unfortunately, the benchmark has several functions that shadow the arguments array, exposing known issue #21. I've added a unit test tests/unit/shadow-arguments.js reduced from the compiler. Opening this issue in case other problems arise.

replay error with defineProperty and track values analysis

I checked in a test tests/unit/defineProperty.js:

var f = function () { return this; }
var x = { get: f };
function Foo() {}
Object.defineProperty(Foo.prototype, 'fizz', x);

This test fails when replay is run with the TrackAllValues analysis:

$ python scripts/jalangi.py analyze -a ./analyses/trackallvalues/TrackValuesEngine tests/unit/define_property
---- Instrumenting /Users/m.sridharan/git-repos/jalangi/tests/unit/define_property ----
Instrumenting /Users/m.sridharan/git-repos/jalangi/tests/unit/define_property.js ...
---- Recording execution of /Users/m.sridharan/git-repos/jalangi/tests/unit/define_property ----
define_property_jalangi_.js
---- Replaying /Users/m.sridharan/git-repos/jalangi/tests/unit/define_property ----
./analyses/trackallvalues/TrackValuesEngine
TypeError: Getter must be a function: function () {
                    jalangiLabel0:
                        while (true) {
                            try {
                                J$.Fe(13, arguments.callee, this);
                                arguments = J$.N(17, 'arguments', arguments, true);
                                return J$.Rt(9, J$.R(5, 'this', this, false));
                            } catch (J$e) {
                                J$.Ex(93, J$e);
                            } finally {
                                if (J$.Fr(97))
                                    continue jalangiLabel0;
                                else
                                    return J$.Ra();
                            }
                        }
                }
    at Function.defineProperty (native)
    at Function.<anonymous> (/Users/m.sridharan/git-repos/jalangi/src/js/analysis.js:165:30)
    at invokeFun (/Users/m.sridharan/git-repos/jalangi/src/js/analysis.js:429:37)
    at /Users/m.sridharan/git-repos/jalangi/src/js/analysis.js:548:28
    at Object.<anonymous> (/Users/m.sridharan/git-repos/jalangi/tests/unit/define_property_jalangi_.js:42:174)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Module.require (module.js:364:17)

None

The issue is that the native method Object.defineProperty() gets executed during replay, since it is whitelisted, but we don't unwrap the function stored in the get property of the descriptor object.
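A sketch of the needed unwrapping, applied to every field of the descriptor before the native call. The getConcrete helper below is a stand-in for Jalangi's shadow-value unwrapper; its exact name and behavior in src/js/analysis.js are assumptions:

```javascript
// Stand-in for Jalangi's unwrapper: a shadow value is assumed to carry
// its underlying value in a .concrete field (assumption, for illustration).
function getConcrete(v) {
    return (v && v.concrete !== undefined) ? v.concrete : v;
}

// Unwrap every attribute of a property descriptor ('get', 'set', 'value',
// 'writable', ...) so the native Object.defineProperty sees real functions.
function unwrapDescriptor(desc) {
    var out = {};
    Object.getOwnPropertyNames(desc).forEach(function (k) {
        out[k] = getConcrete(desc[k]);
    });
    return out;
}
```

With this, the whitelisted call would become Object.defineProperty(obj, prop, unwrapDescriptor(desc)), so the getter is a plain function again.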

getting exception running raytrace.js when analysis code uses conditional API

something is wrong with the conditional API

I get an exception after running the instrumented tests/octane/raytrace.js with the following analysis code (Dummy.js):

  J$.analysis = {};
  ((function (sandbox){

    function Dummy() {
        // during a conditional expression evaluation
        // result_c is the evaluation result and should be returned
        this.conditional = function (iid, left, result_c) {
            return result_c;
        }
    }

    if (sandbox.Constants.isBrowser) {
        sandbox.analysis = new Dummy();
        window.addEventListener('keydown', function (e) {
            // keyboard shortcut is Alt-Shift-T for now
            if (e.altKey && e.shiftKey && e.keyCode === 84) {
                sandbox.analysis.endExecution();
            }
        });
    } else {
        module.exports = Dummy;
    }
  })(typeof J$ === 'undefined'? (J$={}):J$));

proposed fix in analysis.js (below !!! comment):

            function C(iid, left) {
                var left_c, ret;
                executionIndex.executionIndexInc(iid);
                if (sandbox.analysis && sandbox.analysis.conditionalPre) {
                    sandbox.analysis.conditionalPre(iid, left);
                }

                left_c = getConcrete(left);
                ret = !!left_c;

                if (sandbox.analysis && sandbox.analysis.conditional) {
                    // !!!!!!!!  in the following line change ret to left_c
                    lastVal = sandbox.analysis.conditional(iid, left, ret);  
                    if (rrEngine) {
                        rrEngine.RR_updateRecordedObject(lastVal);
                    }
                } else {
                    lastVal = left_c;
                }

                if (branchCoverageInfo) {
                    branchCoverageInfo.updateBranchInfo(iid, ret);
                }

                printValueForTesting("J$.C ", iid, left_c ? 1 : 0);
                return left_c;
            }
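The distinction matters because left_c is the original (possibly non-boolean) value, while ret is its coercion to a boolean. Passing the analysis ret loses the value that short-circuit expressions must propagate:

```javascript
var left_c = 'hello';
var ret = !!left_c;

console.log(ret);     // true     — the coerced boolean the callback currently receives
console.log(left_c);  // 'hello'  — the value the program actually uses

// Short-circuit operators yield the operand itself, not its coercion:
console.log('hello' || 'fallback');  // 'hello', not true
```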

Function call in finally block causes undefined return value

function f() {}
function testcase() {
    try {
        return true;        
    } finally {
        f()
    }
}
console.log(testcase());

The expected output of this snippet is true. However, when running the instrumented version (record mode, no analysis), undefined is printed instead.

This was derived from a test262 testcase.

return_ invoked after functionExit

As far as I can see (by judicious use of console.log), the functionExit callback is called before the return_ callback. Intuitively this seems backwards, as functionExit sounds like the last thing you should hear from a function. It might make more sense to have just one callback giving both the return value and the IID.

remove large video files from repository

Can we remove the large video files under the paper directory and keep them somewhere else? They probably don't belong in version control. Removing them won't speed up a git clone without more drastic action (since they'll still be in the history), but pulling the tarball of the latest jalangi version will be much faster. @ksen007 maybe we can make a separate repository for the files in the paper directory?

problem in RR

record/replay fails for the following code. Extracted from the typescript benchmark.

Date.prototype;
var start = +new Date();

Another issue with call in finally block

We don't seem to handle the following test correctly:

function f() { return undefined; }
function testcase() {
    try {
        return true;
    } finally {
        f()
    }
}
console.log(testcase());

Replay prints undefined instead of true. Test added as tests/unit/call_in_finally_2.js.
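One plausible root cause for both finally-block issues (an assumption, not confirmed against analysis.js) is a single shared return-value slot that the call in the finally block clobbers; a per-frame stack avoids that. The helper names below are illustrative, loosely modeled on Jalangi's Fe/Rt/Ra callbacks:

```javascript
// Per-frame return-value bookkeeping: each function entry gets its own
// slot, so an inner call in a finally block cannot clobber the outer
// function's recorded return value.
var returnStack = [];
function Fe()  { returnStack.push(undefined); }                        // function entry
function Rt(v) { returnStack[returnStack.length - 1] = v; return v; }  // record `return v`
function Ra()  { return returnStack.pop(); }                           // function exit

// Simulating: function testcase() { try { return true; } finally { f(); } }
Fe();              // enter testcase
Rt(true);          // `return true` recorded in testcase's own frame
Fe();              // enter f() from the finally block
var inner = Ra();  // f implicitly returns undefined — pops only f's frame
var outer = Ra();  // testcase's recorded value survives
console.log(outer); // true
```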

running instrumented code using PhantomJS

I'm trying to run the annex test under PhantomJS. If we could get this working, we could do things like run web tests in a regression suite or automate generation of certain types of traces. The first issue I ran into is that the current PhantomJS websocket support doesn't seem to be up to snuff. So, I hacked up a quick change to keep the trace in memory during record; see the in-memory-trace branch:

https://github.com/SRA-SiliconValley/jalangi/tree/in-memory-trace

I still get some errors on this branch when loading in PhantomJS, though. Steps to reproduce (first, save the following PhantomJS script as load.js):

var page = require('webpage').create(),
    system = require('system'),
    address;

if (system.args.length === 1) {
    console.log('Usage: loadspeed.js <some URL>');
    phantom.exit();
}

address = system.args[1];
page.onConsoleMessage = function (msg) {
    console.log('Got message ' + msg);
};
page.open(address, function (status) {
    console.log("loaded");
    page.evaluate(function () {
         console.log("hello");
         console.log(J$.trace_output);
      });
});
  • Check out the in-memory-trace jalangi branch. In analysis.js, line 119, set IN_MEMORY_BROWSER_LOG to inBrowser.
  • Start the local web server (python scripts/jalangi.py server), and instrument annex as shown in README.md. No need to start the websocket server, though.
  • Run the following command (assumes phantomjs is in your path):
phantomjs load.js http://127.0.0.1:8000/tests/tizen/annex/index_jalangi_.html

When I run, I get the following output:

TypeError: 'undefined' is not an object (evaluating 'screen.orientation.indexOf')

  http://127.0.0.1:8000/tests/tizen/annex/index_jalangi_.html:24
Got message TypeError: Attempting to change writable attribute of unconfigurable property.
Got message TypeError: Attempting to change writable attribute of unconfigurable property.
    at printableValue (http://127.0.0.1:8000/src/js/analysis.js:1028)
    at http://127.0.0.1:8000/src/js/analysis.js:1410
    at http://127.0.0.1:8000/src/js/analysis.js:1236
    at G (http://127.0.0.1:8000/src/js/analysis.js:541)
    at http://127.0.0.1:8000/tests/tizen/annex/lib/jquery-1.6.2.min_jalangi_.js:1222
    at invokeFun (http://127.0.0.1:8000/src/js/analysis.js:494)
    at http://127.0.0.1:8000/src/js/analysis.js:585
    at http://127.0.0.1:8000/tests/tizen/annex/lib/jquery-1.6.2.min_jalangi_.js:13591
Got message TypeError: 'undefined' is not an object (evaluating 'g.apply')
Got message TypeError: 'undefined' is not an object (evaluating 'g.apply')
    at invokeFun (http://127.0.0.1:8000/src/js/analysis.js:494)
    at http://127.0.0.1:8000/src/js/analysis.js:585
    at http://127.0.0.1:8000/tests/tizen/annex/js/annex_jalangi_.js:1462
loaded
Got message hello
Got message [3,"tests/tizen/annex/lib/jquery-1.6.2.min_jalangi_.js",88149,0,6]
,[4,1,88141,1,17]
,[4,3,17473,3,8]
,[4,5,17481,5,8]
,[3,"tests/tizen/annex/js/annex_jalangi_.js",11097,7,6]
,[4,3,9889,10,17]

The trace is printing, which is good, but there are some other JS errors I can't fully grok. @ksen007, any idea what could be going wrong? PhantomJS is based on Webkit, but not the absolute latest version. This is not super urgent, but if we could get this working sometime, that would be great.

type conversion with ++ and --

We need to model the implicit type conversion to Number performed by the ++ and -- operators. Here is a test that currently fails with Jalangi instrumentation:

var j = "0";
var k = j++;
console.log(j);
console.log(k);
var l = "1";
var m = ++l;
console.log(l);
console.log(m);

Without instrumentation, the above program prints:

1
0
2
2

With instrumentation and in record mode, the program prints:

01
0
11
11

I captured the failing test in tests/unit/type_conversion.js on master. This is the root cause of why some jQuery unit tests fail under instrumentation.
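The fix needs to mirror what the operators do natively. A sketch of the required desugaring for postfix increment (illustrative, not Jalangi's actual transform):

```javascript
// j++ on a string must coerce the OLD value with ToNumber, then add 1:
//   k = j++   ≡   (tmp = Number(j), j = tmp + 1, k = tmp)
var j = "0";
var tmp = Number(j);  // ToNumber coercion performed by ++
j = tmp + 1;          // j becomes 1, not the string concatenation "01"
var k = tmp;          // postfix yields the coerced old value: 0

console.log(j);  // 1
console.log(k);  // 0
```

Prefix ++l is the same except the expression yields the new value (Number(l) + 1).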

Taint analysis on live websites

Is there a way to run the taint analysis engine on live websites? Or should I download the website first and then run the taint analysis on it?
I know there is an example for the annex app, but what about other sites like www.yahoo.com?

Nonexistent iid passed to functionExit callback

function foo() {
    return {}
}
foo()

In the instrumented version of this snippet, J$.Fr is called with the iid 45 which does not appear in the jalangi_sourcemap.js file. This then gets passed into client analyses using the functionExit callback.

Array missing element

var x = new Array()
function f() {
    x.push({p: "foo"})
    console.log(x)
}
f()

Running normally under node this outputs:

[ { p: 'foo' } ]

However when replayed with the nop analysis, the following is printed:

---- Instrumenting /home/simonhj/src/jalangi/scratch ----
Instrumenting /home/simonhj/src/jalangi/scratch.js ...
---- Recording execution of /home/simonhj/src/jalangi/scratch ----
scratch_jalangi_.js
---- Replaying /home/simonhj/src/jalangi/scratch ----
[]
None

It appears that the element is not in the array.

handle getOwnPropertyNames

Right now, our instrumentation changes the behavior of getOwnPropertyNames, since we add a *J$* field to every object. This can change the behavior of code like:

var x = {};
console.log(Object.getOwnPropertyNames(x).length);

The above is checked in as unit test getownpropnames.js.
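One possible fix is to wrap getOwnPropertyNames so the shadow field is filtered out of its result. This is a sketch; the shadow key name below matches the *J$* field mentioned above, and the wrapping strategy is an assumption:

```javascript
// Hide the instrumentation's shadow field from property enumeration.
var SHADOW_KEY = '*J$*';  // per the report, the field Jalangi adds
var origGOPN = Object.getOwnPropertyNames;

Object.getOwnPropertyNames = function (obj) {
    return origGOPN(obj).filter(function (name) {
        return name !== SHADOW_KEY;
    });
};
```

Other reflective APIs (Object.keys, for-in via enumerability, JSON.stringify) would need similar treatment for full transparency.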

handle mootools-based apps

We need to handle sites that use Mootools. This doesn't work currently because Mootools overwrites many built-in JS functions. E.g., this game doesn't work:

http://www.lbnstudio.fr/labs/tetris/test/uTetris/

We need to grab a copy of various built-in functions and then invoke through those pointers. Here's what was done in a previous project:

https://github.com/ecspat/eavesdropper/blob/master/util.js

Not sure if this will be sufficient, but it's probably a good starting point.
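The idea of grabbing built-ins up front can be sketched as follows; the snapshot must run before any page code (e.g. MooTools) executes, and the names here are illustrative:

```javascript
// Snapshot built-ins at load time, before page code can overwrite them,
// and route all internal instrumentation calls through the snapshot.
var builtins = {
    apply: Function.prototype.apply,
    call: Function.prototype.call,
    push: Array.prototype.push,
    getOwnPropertyNames: Object.getOwnPropertyNames
};

// Example: append to an array even if Array.prototype.push was clobbered.
function safePush(arr, v) {
    return builtins.push.call(arr, v);
}
```

(A fully robust version must also protect Function.prototype.call itself, e.g. via Reflect.apply or a bound call, as in the eavesdropper util.js linked above.)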

robustify Jalangi to client exceptions

Right now, if the client analysis throws an exception, it's rather unpredictable how exactly the program will fail. In fact, it seems quite possible that an exception thrown by an analysis could be caught by the analyzed program, which seems strange. We should wrap calls into the analysis in try-catch blocks, so that we print a proper stack trace (at least when running under node.js) and exit relatively cleanly.
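A sketch of such a guard, used at every callback site instead of calling the analysis directly (the helper and its error handling are illustrative, not existing Jalangi API):

```javascript
// Invoke an analysis callback defensively: an exception thrown by the
// analysis is reported here rather than leaking into — and possibly being
// caught by — the analyzed program.
function callAnalysis(analysis, name, args) {
    var fn = analysis && analysis[name];
    if (!fn) {
        return undefined;  // callback not implemented by this analysis
    }
    try {
        return fn.apply(analysis, args);
    } catch (e) {
        console.error('analysis callback ' + name + ' threw:',
                      (e && e.stack) || e);
        return undefined;  // or exit cleanly under node.js
    }
}
```

Call sites like sandbox.analysis.conditional(iid, left, ret) would become callAnalysis(sandbox.analysis, 'conditional', [iid, left, ret]).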
