Hi, I am currently writing a bachelors project where my aim is to te

Generating code documentation with code2seq about code2seq HOT 8 CLOSED

balysMorkunas commented on June 24, 2024

Generating code documentation with code2seq

from code2seq.

Comments (8)

urialon commented on June 24, 2024

Hi @balysMorkunas ,
Thank you for your interest in code2seq!

I believe that the paper may give a better intuition than what I can describe in brief: https://openreview.net/pdf?id=H1gKYo09tX

Let me know if you have any specific questions!
Uri

balysMorkunas commented on June 24, 2024

Thank you, I'll get back in touch if any serious questions arise!

balysMorkunas commented on June 24, 2024

Hi again,

I started looking at how to retrain the model and preprocess the dataset for documentation generation. I followed your suggestion in issue #34, where you suggest changing JavaExtractor to output documentation instead of method names.

Could you please elaborate or give an example of how to do that? Do you by chance mean to use node.getJavaDoc() instead of node.getName()? What other changes should I be aware of?

Thank you very much for your time and effort, I really appreciate it,
Balys.

bacevicius commented on June 24, 2024

Hi @urialon, I am in a very similar situation to @balysMorkunas and would also like to hear your input about this question. Thank you for your time!

urialon commented on June 24, 2024

Hi @balysMorkunas and @bacevicius ,
Thank you for your interest in code2seq!

> Do you by chance mean to use node.getJavaDoc() instead of node.getName()

Basically yes!

Another option, if you wish to train on an existing dataset, is to set the target label to a unique ID, and then replace the unique ID with the documentation later. See also issue #45 for additional scripts and hyperparameters.

Best,
Uri
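
The unique-ID option above could be sketched roughly as follows. This is a minimal illustration and not part of the code2seq repository: it assumes each extracted example is one line whose first space-separated field is the target label, followed by the path contexts, and that the target is a `|`-separated sequence of subtokens. The helper names `doc_to_target` and `replace_ids`, the `id_to_doc` mapping, and the 30-token cap are all hypothetical.

```python
import re


def doc_to_target(doc: str, max_tokens: int = 30) -> str:
    """Turn the first sentence of a docstring into a '|'-joined token sequence.

    Splitting on the first '.' is a crude sentence boundary (assumption);
    a real pipeline may need something more careful.
    """
    first_sentence = doc.strip().split(".")[0]
    tokens = re.findall(r"[a-z0-9]+", first_sentence.lower())
    return "|".join(tokens[:max_tokens])


def replace_ids(lines, id_to_doc):
    """Yield extracted lines with the unique-ID target replaced by the doc target.

    Assumes the first space-separated field of each line is the unique ID
    that was emitted in place of the method name.
    """
    for line in lines:
        target, sep, contexts = line.rstrip("\n").partition(" ")
        doc = id_to_doc.get(target)
        if doc is None:
            continue  # no documentation for this method: drop the example
        new_target = doc_to_target(doc)
        if new_target:
            yield new_target + sep + contexts
```

For example, a line starting with `id42` and an `id_to_doc` entry of `"Returns the user name."` would come out with the target `returns|the|user|name`, with the contexts left unchanged. Methods without documentation are dropped here; whether that is the right choice depends on the dataset.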

balysMorkunas commented on June 24, 2024

Thanks for your answer, @urialon!

Do you think that the hyperparameters config.SUBTOKENS_VOCAB_MAX_SIZE = 190000 and config.TARGET_VOCAB_MAX_SIZE = 27000 are enough for documentation generation, or should they be increased?
Is there anything else to watch out for regarding the hyperparameters, for example max_code_len and min_code_len in JavaExtractor?

Thank you very much for your time,
Balys.

urialon commented on June 24, 2024

Hi @balysMorkunas ,
Sorry for the delayed response.

These hyperparameters look OK to me, but they depend on the exact dataset and can never really be known in advance.
max_code_len and min_code_len refer to the size of the functions that you consider, so it is up to the dataset you are working with.

Best,
Uri
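
Since the right vocabulary size depends on the dataset, one rough way to check a candidate value such as 27000 is to measure how many target token occurrences the most frequent tokens would cover. The sketch below assumes targets are `|`-separated token sequences; the function name `coverage_at` is hypothetical and not part of code2seq.

```python
from collections import Counter


def coverage_at(targets, vocab_size):
    """Return the fraction of target token occurrences covered by the
    `vocab_size` most frequent tokens (1.0 means nothing falls out of vocabulary)."""
    counts = Counter(tok for target in targets for tok in target.split("|"))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(vocab_size))
    return covered / total
```

Running this over the training targets with `vocab_size=27000` shows how much of the documentation text would be mapped to out-of-vocabulary tokens. If coverage is low, a larger value may be worth trying. The same check can be applied to the subtoken vocabulary.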

balysMorkunas commented on June 24, 2024

Thanks for your help!
