src-d / code2vec
MLonCode community effort to implement Learning Distributed Representations of Code (https://arxiv.org/pdf/1803.09473.pdf)
License: Apache License 2.0
After installing Babelfish and making some modifications to the source code (e.g., I changed Node to node in bblfsh.node in the /structures/extended_node.py file), I ran:
>>> import bblfsh
>>> from algorithms.path_contexts import get_paths
>>> from pprint import pprint
>>> code = bblfsh.BblfshClient("0.0.0.0:9432").parse(filename='/file.java')
>>> get_paths(code.root, 30, 15)
However, I am getting a key error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-20-ffd3713eb283> in <module>
2
3
----> 4 get_paths(code.root, 30, 15)
~/code2vec_2/src/algorithms/path_contexts.py in get_paths(uast, max_length, max_width, token_extractor, leaf_token)
135 u, v = leaves[i], leaves[j]
136 # TODO decide where to filter comments and maybe decouple from bblfsh
--> 137 if not is_noop_line(u) and not is_noop_line(v):
138 ancestor = lca(u, v)
139 d = distance(u, v, ancestor)
~/code2vec_2/src/algorithms/path_contexts.py in is_noop_line(node)
109 # dirty hardcoding to avoid getting paths with comments as start/end
110 def is_noop_line(node):
--> 111 return node.bn.internal_type == 'NoopLine' or node.bn.internal_type == 'SameLineNoops'
112
113
~/anaconda3/envs/sourced/lib/python3.6/site-packages/bblfsh-3.1.1-py3.6-linux-x86_64.egg/bblfsh/node.py in internal_type(self)
194 @property
195 def internal_type(self) -> str:
--> 196 return self.get_dict()["@type"]
197
198 @internal_type.setter
KeyError: '@type'
What is the correct way to use the get_paths function if my goal is to get the list of path contexts of the form (u, path, v)?
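For reference, the (u, path, v) triples asked about here can be sketched on a toy tree. This is only a minimal illustration of the leaf-pair and lowest-common-ancestor idea visible in the traceback above (leaves[i], leaves[j], lca, distance); it does not use bblfsh or the repository's get_paths, the Node class and the path_contexts function are hypothetical names, and the max_width limit is left out.

```python
from itertools import combinations


class Node:
    """Hypothetical stand-in for a UAST node; not the bblfsh or code2vec class."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def leaves(root):
    """Collect the leaves of the tree in left-to-right order."""
    if not root.children:
        return [root]
    return [leaf for child in root.children for leaf in leaves(child)]


def ancestors(node):
    """Return the chain from node up to the root, node included."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def path_contexts(root, max_length):
    """Yield (u, path, v) for each leaf pair joined by at most max_length edges."""
    for u, v in combinations(leaves(root), 2):
        up, down = ancestors(u), ancestors(v)
        down_ids = {id(n) for n in down}
        # Lowest common ancestor: first ancestor of u that also lies above v.
        lca_pos = next(i for i, n in enumerate(up) if id(n) in down_ids)
        lca = up[lca_pos]
        down_pos = next(i for i, n in enumerate(down) if n is lca)
        if lca_pos + down_pos > max_length:
            continue
        # Internal nodes only: up from u's parent to the LCA, then down to v's parent.
        path = [n.label for n in up[1:lca_pos + 1]]
        path += [n.label for n in reversed(down[1:down_pos])]
        yield u.label, tuple(path), v.label


# Toy tree for "x = y + 1"; the labels are made up for the example.
tree = Node("Assign", [
    Node("Name", [Node("x")]),
    Node("Add", [Node("Name", [Node("y")]), Node("Num", [Node("1")])]),
])
contexts = list(path_contexts(tree, max_length=5))
for context in contexts:
    print(context)
```

Returning the path as a tuple keeps it hashable, so it can later serve as a dictionary key.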
Issue
Trying to load a Code2Vec_Features modelforge model does not work (traceback of the error at the end).
Model data
Code2Vec_Features:
value2index: dict[string] = int
path2index: dict[tuple] = int
value2freq: dict[string] = int
path2freq: dict[tuple] = int
path_contexts: list[ (string1, list1), (string2, list2) ...]
The model seems to be saved correctly: I've checked that all parameters of the tree passed to the asdf write call, asdf.AsdfFile(final_tree).write_to(file), are immutable (the dictionaries use tuples as keys).
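To make the shape of these fields concrete, here is a small sketch that builds tuple-keyed index and frequency dictionaries from made-up path contexts. The meaning of each field is inferred from its name only (an assumption), and none of this is the repository's actual code.

```python
from collections import Counter

# Toy path contexts as (start token, path, end token); the values are made up.
contexts = [
    ("x", ("Name", "Assign", "Add", "Name"), "y"),
    ("y", ("Name", "Add", "Num"), "1"),
    ("x", ("Name", "Assign", "Add", "Num"), "1"),
]

value2freq = Counter()
path2freq = Counter()
for start, path, end in contexts:
    value2freq.update([start, end])   # count both end tokens
    path2freq[path] += 1              # paths are tuples, so they can be keys

# Assign indices in order of first appearance.
value2index = {value: i for i, value in enumerate(value2freq)}
path2index = {path: i for i, path in enumerate(path2freq)}

# Every path key is a tuple of strings, hence hashable while in memory.
paths_are_tuples = all(isinstance(path, tuple) for path in path2index)
```

In memory such keys are always hashable, since Python refuses unhashable dictionary keys at insertion time; any problem would have to arise when the data is written out and read back.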
What I've tried just in case
Error origin
The error is raised in the yaml/constructor.py file. The cause appears to be that the generator function construct_yaml_seq yields an empty list, which is then associated with a node and later used as a key. I don't know whether this is a yaml bug or expected behavior. I'm assuming it is not a bug, but finding the actual reason requires a more detailed inspection of the yaml setup, which I do not have time to do now.
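The behavior described above can be imitated in plain Python without calling yaml at all. The sketch below mimics a two-step sequence constructor (yield an empty list, fill it later) and shows why using the yielded object as a dictionary key fails. It also includes one hypothetical workaround, assuming the tuple keys end up written as YAML sequences (not verified here): store the dictionary as a list of [key, value] pairs and rebuild the tuples on load. All function names are made up for the illustration.

```python
def construct_seq(items):
    """Mimic a two-step sequence constructor: yield an empty list, fill it afterwards."""
    data = []
    yield data
    data.extend(items)


def build_mapping(pairs):
    """Build a dict whose keys come from two-step sequence constructors."""
    mapping = {}
    for key_items, value in pairs:
        generator = construct_seq(key_items)
        key = next(generator)   # the still-empty list
        mapping[key] = value    # raises TypeError: a list is unhashable
        for _ in generator:     # would fill the list in afterwards
            pass
    return mapping


def encode_tuple_keys(mapping):
    """Hypothetical workaround: store a tuple-keyed dict as [key, value] pairs."""
    return [[list(key), value] for key, value in mapping.items()]


def decode_tuple_keys(pairs):
    """Inverse of encode_tuple_keys: turn the key lists back into tuples."""
    return {tuple(key): value for key, value in pairs}


try:
    build_mapping([(["Name", "Add", "Num"], 0)])
except TypeError as error:
    print("cannot use a list as a key:", error)

original = {("Name", "Add", "Num"): 0, ("Name",): 1}
restored = decode_tuple_keys(encode_tuple_keys(original))
```

The pair-list encoding avoids sequence-valued mapping keys entirely, at the cost of an explicit conversion step on save and load.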
Traceback
Traceback (most recent call last):
File "/home/hydra/projects/code2vec/scripts/../src/__main__.py", line 68, in <module>
sys.exit(main())
File "/home/hydra/projects/code2vec/scripts/../src/__main__.py", line 64, in main
return handler(args)
File "/usr/local/lib/python3.6/dist-packages/sourced/ml/utils/engine.py", line 87, in wrapped_pause
return func(cmdline_args, *args, **kwargs)
File "/home/hydra/projects/code2vec/src/cmd/code2vec_train.py", line 18, in code2vec_train
model = Code2VecFeatures().load(args.input)
File "/usr/local/lib/python3.6/dist-packages/modelforge-0.6.1-py3.6.egg/modelforge/model.py", line 106, in load
with asdf.open(source) as model:
File "/usr/local/lib/python3.6/dist-packages/asdf/asdf.py", line 766, in open
ignore_missing_extensions=ignore_missing_extensions)
File "/usr/local/lib/python3.6/dist-packages/asdf/asdf.py", line 678, in _open_impl
ignore_missing_extensions=ignore_missing_extensions)
File "/usr/local/lib/python3.6/dist-packages/asdf/asdf.py", line 613, in _open_asdf
tree = yamlutil.load_tree(reader, self, self._ignore_version_mismatch)
File "/usr/local/lib/python3.6/dist-packages/asdf/yamlutil.py", line 295, in load_tree
return yaml.load(stream, Loader=AsdfLoaderTmp)
File "/usr/local/lib/python3.6/dist-packages/yaml/__init__.py", line 72, in load
return loader.get_single_data()
File "/usr/local/lib/python3.6/dist-packages/yaml/constructor.py", line 37, in get_single_data
return self.construct_document(node)
File "/usr/local/lib/python3.6/dist-packages/yaml/constructor.py", line 46, in construct_document
for dummy in generator:
File "/usr/local/lib/python3.6/dist-packages/yaml/constructor.py", line 398, in construct_yaml_map
value = self.construct_mapping(node)
File "/usr/local/lib/python3.6/dist-packages/yaml/constructor.py", line 204, in construct_mapping
return super().construct_mapping(node, deep=deep)
File "/usr/local/lib/python3.6/dist-packages/yaml/constructor.py", line 128, in construct_mapping
"found unhashable key", key_node.start_mark)
yaml.constructor.ConstructorError: while constructing a mapping
in "<file>", line 18, column 3
found unhashable key
in "<file>", line 18, column 5