
spark-sentimental-analysis-test's Introduction

Exploring key features of Apache Spark using Sentiment Analysis (Naive Bayes Classification)

Apache Spark is an open-source cluster-computing framework, and we consider it a strong choice among the options currently available. This project is an attempt to test Apache Spark's key features against other setups. The features of Apache Spark that we are interested in are in-memory data abstraction, partial DAG, lineage-based fault recovery, data co-partitioning, unification of streaming, batch, and interactive processing, and hybrid storage architecture.

Getting Started

You will need Apache Spark installed on your system to run the Scala file. You can download it from here. For a more detailed guide to installing it on Ubuntu, refer here. You will need R and RStudio installed to run the R code. To install RStudio, refer here.

checknb.scala is the Scala code for Naive Bayes Classification.

finalnb.r is the R code for Naive Bayes Classification.

checktrain.txt is the training dataset.

checktest.txt is the testing dataset.
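The contents of checknb.scala are not reproduced in this README. As a rough illustration of the technique it names, the following is a minimal sketch of multinomial Naive Bayes with add-one (Laplace) smoothing in plain Scala, with no Spark dependency. The object name, function names, labels, and toy sentences are all hypothetical and are not taken from the repository or from checktrain.txt.

```scala
// Hypothetical sketch: multinomial Naive Bayes with add-one (Laplace) smoothing.
// Not the contents of checknb.scala; names and toy data are illustrative only.
object NaiveBayesSketch {

  final case class Model(
      logPrior: Map[String, Double],              // label -> log P(label)
      wordCounts: Map[String, Map[String, Int]],  // label -> word -> count
      totalWords: Map[String, Int],               // label -> total word count
      vocabSize: Int                              // distinct words over all labels
  )

  // Lowercase the text and split on anything that is not a letter.
  def tokenize(text: String): Seq[String] =
    text.toLowerCase.split("[^a-z]+").filter(_.nonEmpty).toSeq

  // Each training document is a (label, text) pair.
  def train(docs: Seq[(String, String)]): Model = {
    val byLabel = docs.groupBy(_._1)
    val logPrior = byLabel.map { case (label, ds) =>
      label -> math.log(ds.size.toDouble / docs.size)
    }
    val wordCounts = byLabel.map { case (label, ds) =>
      val words = ds.flatMap { case (_, text) => tokenize(text) }
      label -> words.groupBy(identity).map { case (w, ws) => w -> ws.size }
    }
    val totalWords = wordCounts.map { case (label, counts) =>
      label -> counts.values.sum
    }
    val vocabSize = wordCounts.values.flatMap(_.keys).toSet.size
    Model(logPrior, wordCounts, totalWords, vocabSize)
  }

  // Pick the label with the highest log posterior (up to a shared constant).
  def predict(model: Model, text: String): String = {
    val tokens = tokenize(text)
    model.logPrior.keys.maxBy { label =>
      val counts = model.wordCounts(label)
      val denom = model.totalWords(label) + model.vocabSize
      model.logPrior(label) + tokens.map { w =>
        math.log((counts.getOrElse(w, 0) + 1).toDouble / denom)
      }.sum
    }
  }

  def main(args: Array[String]): Unit = {
    val training = Seq(
      "pos" -> "great movie loved it",
      "pos" -> "wonderful acting great story",
      "neg" -> "terrible movie hated it",
      "neg" -> "boring story awful acting"
    )
    val model = train(training)
    println(predict(model, "loved the great story")) // prints "pos"
    println(predict(model, "awful boring movie"))    // prints "neg"
  }
}
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the add-one smoothing keeps a word unseen for a label from driving that label's probability to zero.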

Prerequisites

Things you need to install to run the code on Spark:

Java
Scala

Things you need to install to run the R code:

  1. readr
install.packages("readr", INSTALL_opts = c('--no-lock'))
  2. stringr
install.packages("stringr", INSTALL_opts = c('--no-lock'))
  3. tokenizers
install.packages("tokenizers", INSTALL_opts = c('--no-lock'))

Run the three commands above in the RStudio console.

Authors

syed-afsahul (contributor)
