Comments (6)
Sweet! I found that I dumped max_contexts rather than max_data_contexts into the dictionary. Now it is successfully reading the data.
Thank you very much.
from code2seq.
Just curious about what causes the error.
Hi @JiyangZhang ,
Thank you for your interest in code2seq!
I am aware that sometimes the reader tries to read a batch, and 10 examples are not even enough for a single batch.
But I think that in your case, the error is not related to the number of examples, but to the number of paths ("contexts") in each example. The reader expects 201 fields: 1 target sequence and 200 contexts. Did you preprocess your data with values other than the defaults for MAX_CONTEXTS and MAX_DATA_CONTEXTS?
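One way to check this is to count the fields on each line of the preprocessed file. The sketch below assumes that each line holds one target followed by contexts separated by single spaces, and that padding consists of extra spaces (so padded slots show up as empty fields). The helper name `field_counts` and the sample lines are hypothetical.

```python
from collections import Counter

def field_counts(lines):
    """Return a histogram of the number of space-separated fields per line."""
    return Counter(len(line.rstrip('\n').split(' ')) for line in lines)

# Hypothetical miniature data: 1 target + 3 context slots per line.
sample = [
    "get|name a,P1,b c,P2,d e,P3,f\n",
    "set|name a,P1,b  \n",  # one real context, two padded (empty) slots
]
print(field_counts(sample))
```

To inspect a real file, pass an open file handle instead of `sample`. Under the assumptions above, a file padded to 1000 contexts would show 1001 fields per line rather than the 201 the reader expects.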
Hi @urialon ,
Thanks for the reply, that makes sense. But I used the default values: MAX_CONTEXTS is 200 and MAX_DATA_CONTEXTS is 1000. I checked my 'data.train.c2s', and every context is padded to a length of 1000. I am not sure what the reason is.
I still think the cause is the dict, because it works whenever I use your dict. I create the dict by simply dumping the three histograms and the numbers (same as preprocess.py).
Thank you very much!
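For reference, a minimal sketch of writing such a dict file so that it matches the loading order posted in this thread (three histograms, then the context count, then the number of training examples). The function names `save_dict` and `load_dict` and the toy values are hypothetical, and this is not necessarily how preprocess.py does it. Per the resolution at the top of this thread, the fourth value is assumed to be max_data_contexts.

```python
import os
import pickle
import tempfile

def save_dict(path, subtoken_to_count, node_to_count, target_to_count,
              max_data_contexts, num_training_examples):
    """Dump the five objects one after another into a single pickle file."""
    with open(path, 'wb') as file:
        pickle.dump(subtoken_to_count, file)
        pickle.dump(node_to_count, file)
        pickle.dump(target_to_count, file)
        pickle.dump(max_data_contexts, file)  # not max_contexts
        pickle.dump(num_training_examples, file)

def load_dict(path):
    """Read the five objects back in the same order they were dumped."""
    with open(path, 'rb') as file:
        return tuple(pickle.load(file) for _ in range(5))

# Round trip with toy values (all hypothetical).
path = os.path.join(tempfile.mkdtemp(), 'toy.dict.c2s')
save_dict(path, {'get': 3}, {'Nm': 5}, {'name': 2}, 1000, 10)
loaded = load_dict(path)
print(loaded[3])
```

Because the objects are pickled sequentially, the loader relies purely on order: swapping two numbers while dumping goes unnoticed until the reader sees a field count it does not expect.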
Let's compare the dict files.
Please run the following code separately on the two dictionary files, changing DICT_FILE_PATH each time:
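import pickle

DICT_FILE_PATH = 'path/to/dict/file'  # placeholder: change for each dictionary file

# The objects must be loaded in the same order they were dumped.
with open(DICT_FILE_PATH, 'rb') as file:
    subtoken_to_count = pickle.load(file)
    node_to_count = pickle.load(file)
    target_to_count = pickle.load(file)
    max_contexts = pickle.load(file)
    num_training_examples = pickle.load(file)
print('Dictionaries loaded.')
print('max_contexts:', max_contexts)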
What is max_contexts in each case?
Also note that self.config.DATA_NUM_CONTEXTS needs to be 0 so that the value of max_contexts from the dict file is loaded:
https://github.com/tech-srl/code2seq/blob/master/model.py#L42
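The fallback described here can be sketched roughly as follows. This is an illustration of the behavior stated in this comment, not a copy of model.py: the helper name `resolve_num_contexts` is hypothetical, and the exact condition used in the repository may differ, so check the linked line.

```python
from types import SimpleNamespace

def resolve_num_contexts(config, max_contexts_from_dict):
    """Use the dict file's value only when the config value is left at 0."""
    if config.DATA_NUM_CONTEXTS == 0:  # assumed condition
        config.DATA_NUM_CONTEXTS = max_contexts_from_dict
    return config.DATA_NUM_CONTEXTS

# With 0, the value stored in the dict file wins.
unset = SimpleNamespace(DATA_NUM_CONTEXTS=0)
print(resolve_num_contexts(unset, 1000))

# With a non-zero value, the dict file's value is ignored.
preset = SimpleNamespace(DATA_NUM_CONTEXTS=200)
print(resolve_num_contexts(preset, 1000))
```

Under this assumption, a non-zero config value silently overrides whatever the dict file contains, which is worth ruling out when the reader's expected field count does not match the data.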