salento's People

Contributors

asingh-gt, cogumbreiro, vineethk, vm4422

salento's Issues

Port Salento to the TensorFlow 1.0 API

Salento is currently stuck on TensorFlow 0.12. One important maintenance milestone is to bring the code up to date with the latest API version, 1.4 as of now.

I am currently working on this.

How to interpret Salento's output?

Any help trying to interpret Salento's output? How do I know what's unlikely?

For instance, I've plotted the output of sequence_aggregator.py as a scatter graph:

[Image: plot of likelihood]

Any idea what to make of this?

Change the architecture of salento to one that is feasible to streaming

Salento expects a sequence of packages as input.
The problem is that the file format holding the sequence of packages is a single JSON object, which means that all packages must fit into memory to be read. We currently have use cases where the datasets do not fit in memory, so this architecture is a bottleneck for scalability.

We need to:

  1. change the file format to one that allows packages to be streamed
  2. change the internals (say, train.py) so that data is loaded lazily, using generators wherever possible (rather than creating lists upfront)
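
A minimal sketch of both steps, assuming a JSON Lines layout (one package object per line). That layout is only one possible streaming-friendly format and is not necessarily the one Salento should adopt; the helper names `iter_packages` and `iter_sequences` are hypothetical.

```python
import io
import json
from typing import Dict, Iterable, Iterator, List


def iter_packages(lines: Iterable[str]) -> Iterator[Dict]:
    """Lazily yield one package per non-empty line (JSON Lines, assumed format)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def iter_sequences(packages: Iterable[Dict]) -> Iterator[List[Dict]]:
    """Lazily yield each event sequence without building a list of packages."""
    for package in packages:
        for entry in package["data"]:
            yield entry["sequence"]


# Hypothetical two-package dataset; a real one would be an open file handle.
dataset = io.StringIO(
    '{"name": "a.c", "data": [{"sequence": [{"call": "f", "states": [], "location": "a.c:1"}]}]}\n'
    '{"name": "b.c", "data": [{"sequence": [{"call": "g", "states": [], "location": "b.c:7"}]}]}\n'
)
calls = [event["call"] for seq in iter_sequences(iter_packages(dataset)) for event in seq]
print(calls)
```

Because both helpers are generators, only one package is held in memory at a time, so dataset size is bounded by the largest single package rather than by the whole file.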

Moving the android driver to its own repository

As far as I understand, the same code extractors can be used for multiple tools (salento and bayou).

Maybe it makes more sense to move code extractors to their own repository?

This would simplify repository maintenance and packaging.

Salento can't handle unknown vocabs

The problem appears to be that Salento's internals do not expect unknown vocabulary entries. I am wondering if we should just filter out unknown entries when iterating over, say, Aggregator.events.

@vijay-murali, thoughts?

I'm getting this error when running the sequence aggregator:

Package 1----
Traceback (most recent call last):
  File "/home/tgc/salento/src/main/python/salento/aggregators/sequence_aggregator.py", line 52, in <module>
    aggregator.run()
  File "/home/tgc/salento/src/main/python/salento/aggregators/sequence_aggregator.py", line 38, in run
    llh += math.log(self.distribution_next_call(spec, events[:i], call=self.call(event)))
  File "/home/tgc/salento/src/main/python/salento/aggregators/base.py", line 73, in distribution_next_call
    return dist if call is None else dist[call]
KeyError: 'cogl_pipeline_set_layer_filters'
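
One way to avoid this KeyError would be to drop events whose call is outside the model's vocabulary before scoring. The sketch below only illustrates that idea: the `known_calls` set and the `filter_known` helper are hypothetical, and Salento's actual Aggregator API is not reproduced here.

```python
from typing import Dict, Iterable, Iterator, Set


def filter_known(events: Iterable[Dict], known_calls: Set[str]) -> Iterator[Dict]:
    """Yield only the events whose call appears in the vocabulary."""
    for event in events:
        if event["call"] in known_calls:
            yield event


# Hypothetical vocabulary and sequence; 'cogl_pipeline_set_layer_filters'
# stands in for a call the model never saw during training.
known_calls = {"cogl_pipeline_new", "cogl_object_unref"}
events = [
    {"call": "cogl_pipeline_new"},
    {"call": "cogl_pipeline_set_layer_filters"},
    {"call": "cogl_object_unref"},
]
kept = [e["call"] for e in filter_known(events, known_calls)]
print(kept)
```

Note that dropping an event also changes the prefix the model sees for every later event, so the scores of the remaining calls may shift.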

Salento crashing with `states` defined

Traceback (most recent call last):
  File "/home/tgc/salento/src/main/python/salento/aggregators/kld_aggregator.py", line 94, in <module>
    aggregator.run()
  File "/home/tgc/salento/src/main/python/salento/aggregators/kld_aggregator.py", line 81, in run
    kld_score = self.compute_kld(spec, seqs_l)
  File "/home/tgc/salento/src/main/python/salento/aggregators/kld_aggregator.py", line 59, in compute_kld
    log_q = self.log_likelihood(spec, sequence)
  File "/home/tgc/salento/src/main/python/salento/aggregators/kld_aggregator.py", line 44, in log_likelihood
    llh += math.log(self.distribution_next_state(spec, events[:i] + [partial_event], state=state))
  File "/home/tgc/salento/src/main/python/salento/aggregators/base.py", line 91, in distribution_next_state
    return dist[state]
KeyError: '4#5'
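
The traceback shows `dist[state]` raising because the state label '4#5' is absent from the distribution. Instead of filtering, unseen keys could be given a small floor probability so that `math.log` stays defined. This sketch assumes `dist` behaves like a mapping from label to probability; `safe_log_prob` and the floor value are hypothetical.

```python
import math
from typing import Mapping


def safe_log_prob(dist: Mapping[str, float], key: str, floor: float = 1e-9) -> float:
    """Return log P(key), using a small floor when key is missing or has zero mass."""
    return math.log(max(dist.get(key, 0.0), floor))


# Hypothetical distribution over state labels; '4#5' is not one of them.
dist = {"0#1": 0.75, "1#0": 0.25}
print(safe_log_prob(dist, "0#1"))
print(safe_log_prob(dist, "4#5"))
```

The floor makes an unknown state count as very unlikely rather than crashing the run; whether that is the desired semantics for anomaly scoring is a separate decision.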

Exception running `kld.py`

Hi, @vijay-murali,

I am trying to debug the error below and for that I was looking at the implementation of kld.py.

Error

Started at 2017-11-27 16:42:05.398390
2017-11-27 16:42:05.398588: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
### foo.c
Traceback (most recent call last):
  File "../salento/statistical/kld.py", line 150, in <module>
    main()
  File "../salento/statistical/kld.py", line 65, in main
    klds = [(l, kld.compute(l, pack)) for l in locations]
  File "../salento/statistical/kld.py", line 65, in <listcomp>
    klds = [(l, kld.compute(l, pack)) for l in locations]
  File "../salento/statistical/kld.py", line 131, in compute
    samples = [sample(seqs_l, nsamples=1) for i in range(self.args.num_iters)]
  File "../salento/statistical/kld.py", line 131, in <listcomp>
    samples = [sample(seqs_l, nsamples=1) for i in range(self.args.num_iters)]
  File "/home/tiago/Work/salento/statistical/utils.py", line 20, in sample
    samples = [random.choice(s) for i in range(nsamples)] if nsamples > 1 else random.choice(s)
  File "/usr/lib/python3.6/random.py", line 257, in choice
    raise IndexError('Cannot choose from an empty sequence') from None
IndexError: Cannot choose from an empty sequence

Input

{"packages": [
    {"data": [
        {"sequence": [
            {
              "call": "pthread_mutex_lock",
              "states": [],
              "location": "foo.c:2"
            },
            {
              "call": "pthread_mutex_unlock",
              "states": [],
              "location": "foo.c:1"
            }
         ]}
    ],
    "name": "foo.c"
    }
]}

Walk through

In function main() we find the following code:

        for pack in parser.packages:
            locations = parser.locations(pack)
            # ...
            klds = [(l, kld.compute(l, pack)) for l in locations]

This input contains a single package, for which locations = ['foo.c:1', 'foo.c:2'].

Next comes the call to compute(self, l, pack), whose first line is:

        seqs_l = self.parser.sequences(pack, l)

According to the documentation of sequences:

"If location is given, then get all sequences in package that end at location."

Hence, for foo.c:1 we get the only sequence in the input, and for foo.c:2 we get seqs_l = []. Passing that empty list to sample() makes random.choice raise the IndexError shown above.
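
The walk-through can be reproduced without TensorFlow. The sketch below assumes that "end at location" means the last event of a sequence is at that location, and it re-implements `locations` and `sequences_ending_at` as hypothetical stand-alone helpers rather than using the parser's actual code. It also shows one possible guard: skipping locations where no sequence ends.

```python
import random
from typing import Dict, List

package = {
    "name": "foo.c",
    "data": [
        {"sequence": [
            {"call": "pthread_mutex_lock", "states": [], "location": "foo.c:2"},
            {"call": "pthread_mutex_unlock", "states": [], "location": "foo.c:1"},
        ]}
    ],
}


def locations(pack: Dict) -> List[str]:
    """All distinct event locations in the package, sorted."""
    return sorted({e["location"] for d in pack["data"] for e in d["sequence"]})


def sequences_ending_at(pack: Dict, location: str) -> List[List[Dict]]:
    """Sequences whose last event is at the given location (assumed semantics)."""
    return [
        d["sequence"]
        for d in pack["data"]
        if d["sequence"] and d["sequence"][-1]["location"] == location
    ]


for loc in locations(package):
    seqs_l = sequences_ending_at(package, loc)
    if not seqs_l:
        # random.choice([]) would raise IndexError here, so skip the location.
        print(loc, "skipped: no sequence ends here")
        continue
    print(loc, "sampled", len(random.choice(seqs_l)), "events")
```

Skipping is only one option; if every location is supposed to have at least one sequence, the better fix may be in how locations or sequences are computed.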
