lprtk / nlp-amazon-customer-reviews

Sentiment analysis, text mining, topic modeling & sentiment prediction

Jupyter Notebook 100.00%
data-visualization machine-learning nlp python sentiment-analysis sentiment-classification sentiment-scores text-cleaning text-mining topic-modeling web-scraping

nlp-amazon-customer-reviews's Introduction

NLP: sentiment analysis, text mining, topic modeling & sentiment prediction

Content

This project focuses on why and how we can extract information and value from large volumes of textual data using Natural Language Processing (NLP). Unstructured data is an interesting challenge for any data scientist: the usual rules and conventions of data preparation do not apply, so thorough text mining is needed to bring out the information hidden behind each word of a text.

The interesting thing about NLP and textual data is that text is everywhere, so NLP methods can be used in many different application areas. Here, we use NLP in a marketing case to better understand our customers and improve their user experience (UX). To do this, we pretend to be a seller using Amazon as a platform and work on consumer reviews that we have scraped. For example, we parse each review to retrieve the reviewer's nickname, the title and content of the review, the rating given, whether the purchase is verified, the date and even the place where the review was posted.
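
As an illustration of this parsing step, here is a minimal sketch that uses only the Python standard library. It is not the code from the scraping notebook. The sample HTML, the `data-hook` and `class` names, and the helper `parse_review` are all assumptions made for the example; the real page markup may differ and should be checked before reuse.

```python
import re
from html.parser import HTMLParser

# Hypothetical review markup: attribute names and values are assumptions.
SAMPLE_REVIEW_HTML = """
<div data-hook="review">
  <span class="a-profile-name">Jane D.</span>
  <i data-hook="review-star-rating"><span>4.0 out of 5 stars</span></i>
  <a data-hook="review-title"><span>Good value</span></a>
  <span data-hook="review-date">Reviewed in the United States on March 3, 2022</span>
  <span data-hook="avp-badge">Verified Purchase</span>
  <span data-hook="review-body"><span>Works well, but shipping was slow.</span></span>
</div>
"""

# Maps (assumed) attribute values to the field names we want to extract.
FIELD_HOOKS = {
    "review-star-rating": "rating",
    "review-title": "title",
    "review-date": "date",
    "avp-badge": "verified",
    "review-body": "body",
}
FIELD_CLASSES = {"a-profile-name": "nickname"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # tags without an end tag


class ReviewParser(HTMLParser):
    """Collects the text found inside the elements listed above."""

    def __init__(self):
        super().__init__()
        self.chunks = {}      # field name -> list of text fragments
        self._current = None  # field currently being read, if any
        self._depth = 0       # nesting depth inside the current field

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        attr_map = dict(attrs)
        key = FIELD_HOOKS.get(attr_map.get("data-hook")) or FIELD_CLASSES.get(
            attr_map.get("class")
        )
        if key:
            self._current, self._depth = key, 1

    def handle_endtag(self, tag):
        if self._current is None or tag in VOID_TAGS:
            return
        self._depth -= 1
        if self._depth == 0:
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.chunks.setdefault(self._current, []).append(data)


def parse_review(html):
    """Return one review as a dict of cleaned fields."""
    parser = ReviewParser()
    parser.feed(html)
    parser.close()
    # Join fragments and collapse runs of whitespace.
    raw = {k: " ".join("".join(v).split()) for k, v in parser.chunks.items()}
    match = re.search(r"\d+(?:\.\d+)?", raw.get("rating", ""))
    return {
        "nickname": raw.get("nickname"),
        "title": raw.get("title"),
        "body": raw.get("body"),
        "rating": float(match.group()) if match else None,
        "date_and_place": raw.get("date"),
        "verified": "verified" in raw,
    }


if __name__ == "__main__":
    print(parse_review(SAMPLE_REVIEW_HTML))
```

Keeping the date and place in a single raw string is a deliberate choice for this sketch: splitting them depends on the wording and language of the page, which is not documented here.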

The objective is to put ourselves in the shoes of a brand that markets its products on Amazon and use NLP to improve the overall customer experience, evaluate our service and image on Amazon, assess customer satisfaction differently, improve products based on customer reviews, and be more available and accessible to customers. To do this, we use many Data Science concepts and methods applied to textual data. Our approach is organized in five main steps:

  • Step 1: Web Scraping
    • Collect the data and create the data schema.
    • Parse customer reviews to enrich the database: extract the title, description, date, time, nickname and rating.
  • Step 2: Sentiment Analysis and Scoring
    • Understand and probe the satisfaction of each customer.
    • Score the intensity and polarity of the sentiment expressed in the review description.
  • Step 3: Text mining and data cleaning
    • Clean the text in a way adapted to the sales domain and to the general content of reviews.
  • Step 4: Topic Modeling (unsupervised learning)
    • To improve availability and speed up response time, separate and prioritize reviews according to the topic they address.
  • Step 5: Machine Learning (supervised learning)
    • Design a robust model that identifies the overall sentiment expressed by the customer in future reviews, without having to read them.
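
To make Steps 2 and 3 more concrete, here is a small sketch of lexicon-based scoring followed by stopword removal, written with the standard library only. It is not the method used in the notebooks, which this README does not specify. The word lists `POLARITY`, `NEGATIONS` and `STOPWORDS` and all function names are hypothetical and far too small for real use; a full sentiment lexicon or a dedicated library would replace them in practice.

```python
import re

# Hypothetical mini-lexicon: word -> polarity weight (assumption for this sketch).
POLARITY = {
    "good": 1.0, "great": 2.0, "love": 2.0, "excellent": 2.0,
    "bad": -1.0, "slow": -1.0, "terrible": -2.0, "broken": -2.0,
}
NEGATIONS = {"not", "no", "never"}
STOPWORDS = {"the", "a", "an", "is", "was", "it", "and", "but",
             "this", "i", "to", "of", "at", "all"}


def tokenize(text):
    """Lowercase, drop URLs and non-letters, then split on whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # keep ASCII letters only
    return text.split()


def sentiment_score(tokens):
    """Average polarity of lexicon words; a preceding negation flips the sign."""
    total, hits = 0.0, 0
    for i, token in enumerate(tokens):
        if token in POLARITY:
            value = POLARITY[token]
            if i > 0 and tokens[i - 1] in NEGATIONS:
                value = -value
            total += value
            hits += 1
    return total / hits if hits else 0.0


def polarity_label(score):
    """Map a numeric score to a coarse polarity class."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def remove_stopwords(tokens):
    """Cleaning step: drop very common words before topic modeling or training."""
    return [t for t in tokens if t not in STOPWORDS]


if __name__ == "__main__":
    review = "Great product, NOT bad at all! http://example.com"
    tokens = tokenize(review)
    score = sentiment_score(tokens)
    print(score, polarity_label(score), remove_stopwords(tokens))
```

Scoring runs before stopword removal, in the same order as the steps above, so that negations such as "not" are still present when the score is computed. The sign of the score gives the polarity and its magnitude gives the intensity.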

Requirements

  • Python version 3.9.7

File details

  • data
    • This folder contains the data.
  • powerpoint
    • This folder contains a PowerPoint presentation of the project, in French.
  • scraping
    • This folder contains a .ipynb file with the code for scraping data from the Amazon website.
  • preprocessing
    • This folder contains a .ipynb file which contains the code for text cleaning, sentiment analysis and text mining.
  • visualization
    • This folder contains a .ipynb file which contains the code for data visualization.
  • model
    • This folder contains two .ipynb files with the code for the machine learning models (unsupervised and supervised learning).

Here is the project structure:

- project 
    > nlp-amazon-customer-reviews
        > image 
            - topic_modeling.png
        > scraping 
            - web_scraping.ipynb
        > preprocessing
            - text_mining.ipynb
        > visualization 
            - data_visualization.ipynb
        > modeling 
            - topic_modeling.ipynb
            - machine_learning.ipynb
        > data 
            - amzn_customer_reviews.csv
        > powerpoint 
            - ppt_project_fr.pdf

Features

My profile • My GitHub
