Log Statements Generation via Deep Learning: Widening the Support Provided to Developers

In this work, we aim to close the circle on supporting developers during logging activities. Starting from LANCE, the state-of-the-art model we presented at ICSE-22, we further improve the support offered to developers by tackling the questions left unanswered in the previous work. To this end, we present LEONID, an extension of LANCE that can (i) decide whether a given Java method would benefit from the injection of log statements (or whether none are needed); and (ii) support the injection of multiple log statements when needed, deciding how many statements to add, where to place them, and what they should log. Finally, we also tried to boost the generation of meaningful log messages by combining information retrieval (IR) and deep learning (DL).

Repository Structure

Set up a GCS Bucket
- Before you begin, create a new GCS bucket by following Google's official guide: https://cloud.google.com/storage/docs/quickstart-console
Code
- Pre-Training: The code we used to pre-train a new T5 model from scratch on a larger dataset is available at Code/Pre-Training
- Fine-Tuning: The code we used to fine-tune all the approaches is available at Code/Fine-Tuning
- Miscellaneous: This folder contains the additional scripts we used to (i) train the sentencepiece tokenizer, (ii) select the best-performing models via early stopping, and (iii) run the analysis described in the paper.
  Available at: Code/Misc
  As for the sentencepiece model, our trained tokenizer is available here: https://drive.google.com/drive/folders/1uZk5fTcsErpsRPoH8Gz-5akhqM-0eklU?usp=sharing
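To illustrate the idea behind early-stopping-based model selection, here is a minimal sketch. It is not the script shipped in Code/Misc: the function name `select_best_checkpoint`, the `patience` parameter, and the assumption that each checkpoint has a single validation score where higher is better are all hypothetical choices made for this example.

```python
from typing import Optional, Sequence, Tuple


def select_best_checkpoint(
    scores: Sequence[Tuple[int, float]], patience: int = 3
) -> Optional[Tuple[int, float]]:
    """Return the (step, score) of the best checkpoint seen before stopping.

    scores:   (training_step, validation_score) pairs in chronological
              order; a higher score is assumed to be better (hypothetical).
    patience: number of consecutive evaluations without improvement that
              are tolerated before the scan stops.
    """
    best: Optional[Tuple[int, float]] = None
    stale = 0  # evaluations since the last improvement
    for step, score in scores:
        if best is None or score > best[1]:
            best = (step, score)
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # stop early: no improvement for `patience` evaluations
    return best


if __name__ == "__main__":
    # Made-up scores, only to show the call.
    history = [(10000, 0.20), (20000, 0.25), (30000, 0.24), (40000, 0.23)]
    print(select_best_checkpoint(history, patience=2))
```

With a small `patience`, the scan stops soon after the score plateaus; a larger value tolerates temporary dips at the cost of evaluating more checkpoints.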
Datasets
- All the datasets we built and used are available at this link: https://drive.google.com/drive/folders/1bKaQlcz3W_v8VJqIePPJjqT17pwdJCJ1?usp=sharing
  Please note that, for each training, test, and eval set, we share both the TSV file needed to train the T5-based models and the CSV files containing additional information (e.g., the className from which the method was retrieved and the log statements per method).
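As a rough sketch of how such a TSV file could be loaded, the snippet below reads tab-separated input/target pairs. The two-column layout (model input, then expected output) is an assumption made for illustration, not a documented property of the shared files, and `read_pairs` is a hypothetical helper name; check the actual files before relying on it.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple


def read_pairs(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (input, target) pairs from a tab-separated stream.

    Assumes exactly two columns per row (hypothetical layout). Quoting is
    disabled so that quote characters inside Java code are kept verbatim.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip rows that do not match the assumed layout
        yield row[0], row[1]


if __name__ == "__main__":
    # Made-up in-memory sample standing in for a real TSV file.
    sample = io.StringIO(
        'void run() { start(); }\tvoid run() { LOG.info("start"); start(); }\n'
    )
    for source, target in read_pairs(sample):
        print(source, "->", target)
```

To read a file from disk, open it with `open(path, newline="", encoding="utf-8")` and pass the handle to `read_pairs`.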
Models (All the models we trained are publicly available)
Results: 📂
- Predictions:
- Manual Analysis:
  - LANCE-2.0
  - LEONID Single Log
IMG: 📂
- Examples of semantically equivalent predictions made by LEONID when injecting single-log statements
- Examples of correct predictions made by LEONID when injecting several log statements
