bhagyasree2895 / wm-final-project-bhagyasree2895

This project forked from 44520-s19/wm-final-project-bhagyasree2895

wm-final-project-bhagyasree2895 created by GitHub Classroom

Web Mining Final Project

Name: Bhagya Sree Chanda

Final project on coverage of the WhatsApp spyware attack in Times of India and The Hindu, by Bhagya Sree Chanda

Question

How differently do two newspapers cover the same news story, and which newspaper is better to read?

Approach:

The main objective of this project is to analyze how two newspapers, Times of India and The Hindu, reported on a major incident involving WhatsApp, and to compare the content of their coverage. I compare the two newspapers on lexical diversity, word occurrences, and sentiment, and generate word clouds from their text.

Data Collection

I scraped the data from the online editions of Times of India and The Hindu. Here are the links to those:

Libraries Required

  1. BeautifulSoup
  2. pandas
  3. requests
  4. matplotlib
  5. nltk
  6. numpy
  7. wordcloud
  8. PIL
  9. urllib
  10. random

Procedure

  • Scraping the newspaper websites.
  • Extracting the article data.
  • Performing the analysis: lexical diversity, word occurrences, sentiment analysis, common words, and word cloud generation.
  • Visualizing the results using Matplotlib.
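
The lexical diversity and word occurrence steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code: it assumes lexical diversity means the ratio of unique words to total words, and the function names (`tokenize`, `lexical_diversity`, `top_words`) and the sample sentence are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_diversity(tokens):
    """Ratio of unique tokens to total tokens (0.0 for empty input)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def top_words(tokens, n=10, stopwords=frozenset()):
    """Return the n most frequent tokens, skipping any given stopwords."""
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)


if __name__ == "__main__":
    # Placeholder sentence, not text from either newspaper.
    sample = "The app was attacked and the app was patched"
    tokens = tokenize(sample)
    print(lexical_diversity(tokens))
    print(top_words(tokens, n=2, stopwords={"the", "was"}))
```

In practice the same functions would be applied to the scraped text of each newspaper, and the two ratios plotted side by side with Matplotlib. Note that this ratio tends to fall as a text gets longer, so it is most meaningful when the two texts are of similar length.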

Conclusion

  • The lexical diversity graph shows that "The Hindu" has a higher lexical diversity (LD) than "Times of India".
  • "The Hindu" also uses more unique words than "Times of India".
  • The sentiment analysis shows that although "The Hindu" has more unique words and higher lexical diversity, it also uses more negative words and aggressive terms than "Times of India".
  • Accordingly, "The Hindu" has a more negative compound score than "Times of India".
  • Because "The Hindu" uses a wider range of words than "Times of India", reading it can help people learn more new words.
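
The "compound" figure mentioned above is presumably the compound score produced by NLTK's VADER sentiment analyzer; that is an assumption, since this README does not name the tool. Under that assumption, a comparison of the two newspapers might look like the sketch below. The helper names (`build_analyzer`, `compound_score`, `most_negative`) are hypothetical.

```python
def build_analyzer():
    """Create a VADER analyzer; needs nltk and its 'vader_lexicon' resource."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    # Download the lexicon if it is not already present (requires network access).
    nltk.download("vader_lexicon", quiet=True)
    return SentimentIntensityAnalyzer()


def compound_score(text, analyzer):
    """Return VADER's compound score for the text, in the range [-1, 1]."""
    return analyzer.polarity_scores(text)["compound"]


def most_negative(scores):
    """Given {source name: compound score}, return the name with the lowest score."""
    return min(scores, key=scores.get)
```

To use it, build one analyzer with `build_analyzer()`, call `compound_score` on the scraped text of each newspaper, collect the results in a dictionary keyed by newspaper name, and pass that dictionary to `most_negative`. A compound score closer to -1 indicates more negative wording overall.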
