text-summarizer's Introduction

End-to-End Text Summarization Project

Introduction

The "End-to-End Text Summarization Project" builds an NLP-based text summarizer. It implements a pipeline that covers data ingestion, validation, transformation, model training, and evaluation. An app built on this pipeline is then deployed on AWS through a CI/CD workflow. The goal is to automatically generate concise, accurate summaries of lengthy text, giving users a dependable and efficient summarization tool.
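
As a rough illustration of how such a pipeline might be chained, the sketch below runs five placeholder stages in order and logs each one. The stage functions, the shared `ctx` dictionary, and `run_pipeline` are hypothetical names chosen for this example; they are not the project's actual code, and each stage body is a stand-in for the real work.

```python
import logging
from typing import Callable, Dict, List, Tuple

logging.basicConfig(level=logging.INFO, format="[%(asctime)s] %(levelname)s: %(message)s")
logger = logging.getLogger("text_summarizer")

# Hypothetical stage signature: receive the shared context, return it updated.
Context = Dict[str, object]
Stage = Callable[[Context], Context]


def ingest(ctx: Context) -> Context:
    # Placeholder: a real stage would download or read the raw dataset.
    ctx["raw"] = ["A long document about text summarization pipelines."]
    return ctx


def validate(ctx: Context) -> Context:
    # Placeholder check: every record must be a non-empty string.
    ctx["valid"] = all(isinstance(t, str) and bool(t.strip()) for t in ctx["raw"])
    return ctx


def transform(ctx: Context) -> Context:
    # Placeholder: a real stage would tokenize the text for the model.
    ctx["features"] = [t.lower().split() for t in ctx["raw"]]
    return ctx


def train(ctx: Context) -> Context:
    # Placeholder: a real stage would fine-tune a summarization model.
    ctx["model"] = "model-placeholder"
    return ctx


def evaluate(ctx: Context) -> Context:
    # Placeholder: a real stage would score the model on held-out data.
    ctx["evaluated"] = "model" in ctx
    return ctx


STAGES: List[Tuple[str, Stage]] = [
    ("data_ingestion", ingest),
    ("data_validation", validate),
    ("data_transformation", transform),
    ("model_training", train),
    ("model_evaluation", evaluate),
]


def run_pipeline(stages: List[Tuple[str, Stage]], ctx: Context) -> Context:
    """Run each stage in order, logging progress and stopping at the first failure."""
    completed: List[str] = []
    for name, stage in stages:
        logger.info("stage %s started", name)
        try:
            ctx = stage(ctx)
        except Exception:
            logger.exception("stage %s failed", name)
            raise
        completed.append(name)
        logger.info("stage %s completed", name)
    ctx["completed"] = completed
    return ctx
```

Calling `run_pipeline(STAGES, {})` returns the context after all five stages. Keeping every stage behind the same signature is one way to let a `main.py` run them uniformly.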

Workflow of the project

  • Create all the needed files and folders using template.py

  • Create a new virtual environment

  • Install packages using requirements.txt

  • Set up project using setup.py (automatically configured)

  • Update src/constants/__init__.py

  • Update src/utils/logger.py

  • Update src/utils/exception.py

  • Update src/utils/utils.py

  • Test project code using notebook

  • For each component in components:

    1. Test component code using notebook
    2. Update config.yaml
    3. Update params.yaml
    4. Update entity/__init__.py
    5. Update src/config/config.py
    6. Update src/components/component.py
    7. Update src/pipeline/stage_component.py
    8. Update main.py
  • Update src/pipeline/prediction.py

  • Update app.py

  • Update Dockerfile

  • Update .github/workflows/main.yaml

  • Create App

  • Deploy App
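
The first workflow step, creating files and folders with template.py, could look roughly like the sketch below. This is a minimal version under stated assumptions, not the repository's actual script: the file list is assembled from the paths named in the workflow above (with the package files assumed to be `__init__.py`), and `create_scaffold` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Iterable, List

# Assumed layout, assembled from the paths named in the workflow list.
PROJECT_FILES = [
    ".github/workflows/main.yaml",
    "src/constants/__init__.py",
    "src/utils/logger.py",
    "src/utils/exception.py",
    "src/utils/utils.py",
    "src/config/config.py",
    "src/pipeline/prediction.py",
    "entity/__init__.py",
    "config.yaml",
    "params.yaml",
    "main.py",
    "app.py",
    "Dockerfile",
    "requirements.txt",
    "setup.py",
]


def create_scaffold(root: Path, files: Iterable[str] = PROJECT_FILES) -> List[Path]:
    """Create missing files (and their parent folders) under root.

    Existing files are left untouched, so the script is safe to re-run.
    Returns the list of files that were newly created.
    """
    created: List[Path] = []
    for relative in files:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created
```

Running `create_scaffold(Path("."))` from the project root would create the empty skeleton. Skipping files that already exist means a second run never overwrites work in progress.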

AWS CI/CD deployment with GitHub

1. Log in to the AWS console

2. Create an IAM user for deployment

  • Create a new user with the following policies:

    1. AmazonEC2ContainerRegistryFullAccess
    2. AmazonEC2FullAccess
  • Create and save the security credentials

3. Create an ECR repository to store the Docker image

  • Save the URI of the ECR repository

4. Create an EC2 virtual machine (Ubuntu)

5. Open EC2 and install Docker on the EC2 virtual machine:

  • Optional

    • sudo apt-get update -y
    • sudo apt-get upgrade
  • Required

    • curl -fsSL https://get.docker.com -o get-docker.sh
    • sudo sh get-docker.sh
    • sudo usermod -aG docker ubuntu
    • newgrp docker

6. Configure EC2 as a self-hosted runner:

  • GitHub > Settings > Actions > Runners > New self-hosted runner > choose the OS > run the commands one by one on EC2

  • Check the runner status: it should show "Idle" once the runner is connected

7. Set up GitHub secrets

  • GitHub > Settings > Secrets and variables > Actions > New repository secret > create the following secrets
    • AWS_ACCESS_KEY_ID
    • AWS_SECRET_ACCESS_KEY
    • AWS_REGION
    • AWS_ECR_LOGIN_URI
    • ECR_REPOSITORY_NAME

8. Open the app's port on EC2 by adding an inbound rule for it to the instance's security group
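
To make the role of the step 7 secrets more concrete, the sketch below checks that all five values are present and combines two of them into a Docker image reference. It rests on assumptions not stated in this README: that `AWS_ECR_LOGIN_URI` holds the registry host without the repository path, and that the image is tagged `latest`. The helper names and the sample values in the usage note are hypothetical.

```python
import os
from typing import Dict, Mapping

# The five secret names listed in step 7.
REQUIRED_SECRETS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_ECR_LOGIN_URI",
    "ECR_REPOSITORY_NAME",
)


def load_deploy_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise ValueError naming the missing ones."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
    if missing:
        raise ValueError("Missing deployment settings: " + ", ".join(missing))
    return {name: env[name].strip() for name in REQUIRED_SECRETS}


def image_uri(settings: Mapping[str, str], tag: str = "latest") -> str:
    """Build '<registry>/<repository>:<tag>'.

    Assumption: AWS_ECR_LOGIN_URI is the registry host only, so the
    repository name is appended to it.
    """
    registry = settings["AWS_ECR_LOGIN_URI"].rstrip("/")
    return f"{registry}/{settings['ECR_REPOSITORY_NAME']}:{tag}"
```

For example, with a placeholder registry host `123456789012.dkr.ecr.us-east-1.amazonaws.com` and repository name `text-summarizer`, `image_uri` would produce `123456789012.dkr.ecr.us-east-1.amazonaws.com/text-summarizer:latest`. Failing early on a missing secret tends to give a clearer error than a failed `docker push` later in the workflow.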

text-summarizer's People

Contributors

jjjjjooooo