Comments (12)
It works! It seems I had lost the *.jar file, and I have now restored it. Thanks for helping!
from code2seq.
I slightly modified the java-large dataset, but I did not change anything in code2seq itself. Could you please help me with my issue? Thanks a lot!
from code2seq.
Hi @lizhuo-1994 ,
Thank you for your interest in code2seq!
What is your Java version? Please run "java -version" (the "--version" form is only accepted by Java 9 and later).
from code2seq.
Additionally, can you try running the extractor directly, without the Python wrapper:
java -cp JavaExtractor/JPredict/target/JavaExtractor-0.0.1-SNAPSHOT.jar JavaExtractor.App --max_path_length 8 --max_path_width 2 --dir JavaExtractor/JPredict/src/main
from code2seq.
$ java -version
openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~16.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)
$ java -cp JavaExtractor/JPredict/target/JavaExtractor-0.0.1-SNAPSHOT.jar JavaExtractor.App --max_path_length 8 --max_path_width 2 --dir JavaExtractor/JPredict/src/main
Error: Could not find or load main class JavaExtractor.App
Here is my result. Thanks for helping!
from code2seq.
Did you run this from the main code2seq directory? Does the jar file exist?
Can you please run:
ls -lt JavaExtractor/JPredict/target/JavaExtractor-0.0.1-SNAPSHOT.jar
?
If the file exists, then please run:
jar tvf JavaExtractor/JPredict/target/JavaExtractor-0.0.1-SNAPSHOT.jar | grep JavaExtractor
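The two checks above (does the jar exist, and does it contain the main class) can also be scripted. This is only a minimal sketch: the helper name `jar_has_class` is hypothetical and not part of code2seq, and it relies solely on the fact that a jar is a zip archive in which class `a.b.C` is stored as `a/b/C.class`.

```python
import os
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if jar_path exists and contains the fully qualified class.

    Hypothetical helper for illustration; not part of code2seq.
    """
    if not os.path.isfile(jar_path):
        return False
    # A jar is a zip archive; class a.b.C is stored as a/b/C.class
    entry = class_name.replace('.', '/') + '.class'
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == '__main__':
    jar = 'JavaExtractor/JPredict/target/JavaExtractor-0.0.1-SNAPSHOT.jar'
    print(jar_has_class(jar, 'JavaExtractor.App'))
```

If this prints False when run from the main code2seq directory, the jar is either missing or was built without the `JavaExtractor.App` class, which would be consistent with the "Could not find or load main class" error.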
from code2seq.
But now there is another problem:
Extracting paths from validation set...
Finished extracting paths from validation set
Extracting paths from test set...
Finished extracting paths from test set
Extracting paths from training set...
dir: data/train was not completed in time
Finished extracting paths from training set
Creating histograms from the training data
subtoken vocab size: 0
node vocab size: 0
target vocab size: 0
File: 1.test.raw.txt
Traceback (most recent call last):
  File "preprocess.py", line 115, in <module>
    max_contexts=int(args.max_contexts), max_data_contexts=int(args.max_data_contexts))
  File "preprocess.py", line 53, in process_file
    print('Average total contexts: ' + str(float(sum_total) / total))
ZeroDivisionError: float division by zero
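The traceback shows the average being computed with `total` equal to zero, which matches the empty vocabularies reported just before it. As a hedged illustration only (the function names below are hypothetical and this is not the actual code in preprocess.py), a guard of this shape would turn the crash into a readable message:

```python
def safe_average(sum_total, total):
    """Return sum_total / total, or None when no examples were counted."""
    if total == 0:
        return None
    return float(sum_total) / total


def report_average(sum_total, total):
    """Build the summary line, with a hint when the input was empty."""
    avg = safe_average(sum_total, total)
    if avg is None:
        # Assumption: zero examples means the extractor wrote no output
        return 'No examples found: the extractor produced no output'
    return 'Average total contexts: ' + str(avg)
```

Such a guard only makes the failure easier to read; the underlying cause here is still that extraction produced no data.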
from code2seq.
Maybe it is because of the timeout. I will try again, thanks!
from code2seq.
Yes, there are timeouts. We originally used a 64-core machine to preprocess the datasets,
so using a smaller machine might trigger them.
The exact time is defined here:
https://github.com/tech-srl/code2seq/blob/master/JavaExtractor/extract.py#L37
By default, 6 processes run in parallel (see: https://github.com/tech-srl/code2seq/blob/master/JavaExtractor/extract.py#L66), and each of them runs with 64 threads (see: https://github.com/tech-srl/code2seq/blob/master/preprocess.sh#L32).
To verify that preprocessing runs on a small dataset, you can try preprocessing the JavaExtractor itself. I.e., point the training+test+validation paths to JavaExtractor/JPredict/src/
and verify that it runs successfully within a few seconds or so.
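To make the timeout behaviour concrete, here is a minimal sketch of running a child process under a time limit. It is not the code in extract.py: the names `run_with_timeout` and `threads_per_process` are hypothetical, and dividing the available cores evenly among the parallel processes is only an assumed rule of thumb for smaller machines.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run command; return (completed, returncode).

    completed is False if the time limit was hit. Hypothetical helper.
    """
    try:
        result = subprocess.run(command, timeout=timeout_seconds,
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        return True, result.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return False, None


def threads_per_process(cpu_count, num_processes):
    """Assumed heuristic: split the cores evenly, at least one thread each."""
    return max(1, cpu_count // num_processes)
```

With the defaults mentioned above, 6 processes of 64 threads each means 384 extractor threads in total, so on a machine with far fewer cores each directory is more likely to hit the time limit.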
from code2seq.
Thanks for the explanation. I re-configured it, and now it seems to be working well.
BTW, it is really disk-consuming and time-consuming, so I think preprocessing will run for about 2-3 days.
from code2seq.
Unfortunately, that's right.
The preprocessing pipeline was designed to process millions of examples, and it is disk- and time-consuming.
I'm closing this issue for now; feel free to re-open it if you have any additional questions.
from code2seq.
Thanks for the explanation. I re-configured it, and now it seems to be working well.
BTW, it is really disk-consuming and time-consuming, so I think preprocessing will run for about 2-3 days.
Hello, may I ask about the specific configuration of your machine and the final parameters you used?
Thanks a lot!
from code2seq.