In this work, we aim to close the circle on supporting developers during logging activities. Starting from LANCE, the state-of-the-art model we presented at ICSE-22, we further improve the support offered to developers by tackling all the questions left unanswered in our previous work. To this end, we present LEONID, an extension of LANCE, which can (i) decide whether a given Java method would benefit from the injection of log statements (or whether they are not needed); and (ii) support the injection of multiple log statements when needed, deciding how many statements to add, where to place them, and what they should log. Finally, we also tried to boost the generation of meaningful log messages by combining information retrieval (IR) and deep learning (DL).
-
- Before you begin, you need to create a new GCS bucket. To do so, follow the official guide provided by Google at this link: https://cloud.google.com/storage/docs/quickstart-console
-
- Pre-Training: The code we used to pre-train a new T5 model from scratch on a larger dataset is available at this path: Code/Pre-Training
- Fine-Tuning: The code we used to fine-tune all the approaches is available at this path: Code/Fine-Tuning
- Miscellaneous: This folder contains the additional scripts we used to (i) train the sentencepiece tokenizer, (ii) select the best-performing models via early stopping, and (iii) run the analysis described in the paper.
It is available at this path: Code/Misc
Our trained sentencepiece tokenizer is available here: https://drive.google.com/drive/folders/1uZk5fTcsErpsRPoH8Gz-5akhqM-0eklU?usp=sharing
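
To illustrate what selecting the best-performing model via early stopping can look like, here is a minimal sketch. It is not the script shipped in Code/Misc: the function name `select_best_checkpoint`, the `patience` value, and the assumption that a higher validation score is better are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def select_best_checkpoint(scores: Sequence[float], patience: int = 5) -> Optional[int]:
    """Return the index of the best checkpoint under a simple early-stopping rule.

    `scores` holds one validation score per saved checkpoint, in training order.
    Assumption: higher is better. The scan stops once `patience` consecutive
    checkpoints fail to improve on the best score seen so far.
    """
    best_index: Optional[int] = None
    best_score = float("-inf")
    stale = 0  # checkpoints seen since the last improvement

    for index, score in enumerate(scores):
        if score > best_score:
            best_index, best_score, stale = index, score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # no improvement for `patience` checkpoints: stop

    return best_index


if __name__ == "__main__":
    # Hypothetical validation scores, one per checkpoint.
    validation_scores = [0.10, 0.30, 0.20, 0.25, 0.40]
    print(select_best_checkpoint(validation_scores, patience=2))
```

With a small `patience` the scan may stop before a later, better checkpoint is reached, so the value is a trade-off between training cost and the risk of stopping too early.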
-
- All the datasets we built and used are available at this link: https://drive.google.com/drive/folders/1bKaQlcz3W_v8VJqIePPJjqT17pwdJCJ1?usp=sharing
Please note that, for each training, test, and eval set, we share both the TSV file needed to train the T5-based models and the CSV files containing additional information (e.g., the className from which each method was retrieved and the log statements per method).
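
As a rough illustration of how one of the shared TSV files could be loaded, the sketch below reads tab-separated input/target pairs using only the Python standard library. The two-column layout (model input, then expected output) is an assumption based on the usual T5 setup and should be checked against the actual files; the helper name `read_pairs` and the sample row are hypothetical.

```python
import csv
import io
from typing import Iterable, List, Tuple


def read_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse tab-separated (input, target) pairs.

    Assumption: each row has exactly two columns. Quoting is disabled so that
    double quotes inside Java code are kept verbatim. Rows that do not have
    exactly two columns are skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs: List[Tuple[str, str]] = []
    for row in reader:
        if len(row) != 2:
            continue
        pairs.append((row[0], row[1]))
    return pairs


if __name__ == "__main__":
    # Hypothetical sample row; the real files come from the shared folder.
    sample = io.StringIO(
        'public void run() { start(); }\t'
        'public void run() { LOG.info("starting"); start(); }\n'
    )
    for source, target in read_pairs(sample):
        print(source, "->", target)
```

To read a file from disk, pass an open file handle instead of the in-memory sample, for example `open(path, encoding="utf-8", newline="")`, where `path` points to a downloaded TSV file.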
-
- Predictions:
- Manual Analysis: