This project forked from rajdeep345/ectsum

Dataset and Codes for our EMNLP 2022 Main Conference Long Paper titled "ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts"

ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts

Long Paper Accepted at the EMNLP 2022 Main Conference!

  • Paper: https://aclanthology.org/2022.emnlp-main.748/
  • Poster: https://rajdeep345.github.io/files/pdf/research/ECTSum_EMNLP2022_Poster.pdf
  • Pre-recorded Video: https://drive.google.com/file/d/1DW2i2ApgiE6V7ViiayX5zdJSRXdAEbsy/view

    Dataset

    The ECTSum dataset can be found under the data folder.

    Code

    Code and instructions for our proposed model, ECT-BPS, are under codes/ECT-BPS.
    Code and instructions for our baseline models are under codes/baselines.

    Data Preparation for ECT-BPS

    Preparing the data for training the Extractive Module

    Install dependencies

    pip install sentence-transformers
    pip install num2words
    pip install word2number
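
Before running the preparation script, it can help to confirm that the three packages are importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the import names (`sentence_transformers`, `num2words`, `word2number`) are assumed to be the module names these pip packages install under.

```python
import importlib.util

# Import names assumed for the three pip packages listed above.
REQUIRED = ["sentence_transformers", "num2words", "word2number"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

An empty list means every package was found; any name in the returned list still needs a `pip install`.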

    Prepare the data

    python prepare_data_ectbps_ext.py

    Data Location

    The data is saved at codes/ECT-BPS/ectbps_ext/data/.
    Processed data is already uploaded at this location.
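
To check what the preparation step produced (or what is already uploaded), the output folder can be summarised by file type. This is only a sketch: the function name `summarize_dir` is hypothetical, and no assumption is made about which file names or formats the script actually writes.

```python
from collections import Counter
from pathlib import Path


def summarize_dir(root):
    """Count files under `root` (recursively), grouped by file extension."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist or is not a directory")
    # Files without an extension are grouped under the label "<none>".
    return Counter(p.suffix or "<none>" for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Path taken from the text above; run this from the repository root.
    data_dir = "codes/ECT-BPS/ectbps_ext/data"
    try:
        for suffix, count in sorted(summarize_dir(data_dir).items()):
            print(f"{suffix}: {count}")
    except FileNotFoundError as err:
        print(err)
```

The same helper can be pointed at any other data folder in the repository by changing `data_dir`.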

    Preparing the data for training the Paraphrasing Module

    Install dependencies

    pip install sentence-transformers
    pip install num2words
    pip install word2number

    Prepare the data

    python prepare_data_ectbps_para.py

    Data Location

    The data is saved at codes/ECT-BPS/ectbps_para/data/para/.
    Processed data is already uploaded at this location.

    Prepare the data with numerical values masked

    python prepare_data_ectbps_para_mask.py

    Data Location

    The data is saved at codes/ECT-BPS/ectbps_para/data/para_mask/.
    Processed data is already uploaded at this location.
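
This README does not describe how the numerical values are masked; prepare_data_ectbps_para_mask.py is the authoritative source. Purely to illustrate the general idea of number masking, the sketch below replaces each numeric token with a placeholder and keeps the original values so they can be restored later. The `[NUM]` token, the regular expression, and both function names are assumptions made for this example, not the repository's implementation.

```python
import re

# Matches integers and decimals with optional thousands separators,
# e.g. 3, 1,250, 4.75. This pattern is an assumption for illustration.
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?")


def mask_numbers(text, token="[NUM]"):
    """Replace each number in `text` with `token`; return (masked_text, values)."""
    values = []

    def _replace(match):
        values.append(match.group(0))
        return token

    return NUMBER_RE.sub(_replace, text), values


def unmask_numbers(masked, values, token="[NUM]"):
    """Put the saved `values` back into `masked`, left to right."""
    parts = masked.split(token)
    if len(parts) - 1 != len(values):
        raise ValueError("number of placeholders does not match number of values")
    out = [parts[0]]
    for value, part in zip(values, parts[1:]):
        out.append(value)
        out.append(part)
    return "".join(out)


if __name__ == "__main__":
    sentence = "Revenue rose 12.5% to $1,250 million."
    masked, values = mask_numbers(sentence)
    print(masked)
    print(values)
    print(unmask_numbers(masked, values) == sentence)
```

Keeping the extracted values alongside the masked text makes the masking reversible, which is one reason a placeholder scheme like this is convenient for a sketch.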

    Updates

  • 1st November 2022 - ECTSum dataset released
  • 30th November 2022 - Code and instructions released for training the Extractive Module of ECT-BPS
  • 3rd March 2023 - Prediction pipeline added for the Extractive Module
  • 5th March 2023 - Code released to prepare the data for training the Paraphrasing Module
  • 7th March 2023 - Code released to train the Paraphrasing Module of ECT-BPS
  • 8th March 2023 - Google Colab notebook released for training and testing the Paraphrasing Module