rspark-tutorial

rspark-tutorial provides content illustrating the use of rspark, a Docker-based computing environment. rspark runs R, RStudio, PostgreSQL, Hadoop, Hive, and Spark in containers on your local computer (or optionally on AWS). The content covers a range of topics, including many aspects of the tidyverse and machine learning using Spark.

This tutorial is meant to run in conjunction with rspark-docker, which contains images of the rspark components. The steps for installing and launching the rspark-docker containers are given here:

https://github.com/jharner/rspark-docker

Downloading the rspark-tutorial content

To get access to the tutorial content within rspark-docker, do the following:

  1. Make sure git is installed and up to date. This step should have been done when you cloned rspark-docker, but from the command line you can run:

git version

Compare the returned version with the current version. See the directions in the rspark-docker README for installing or updating, if necessary.

  2. Issue the following command in a terminal to clone rspark-tutorial:

git clone https://github.com/jharner/rspark-tutorial.git

This creates a directory (folder) named rspark-tutorial containing the local git repo. By default it is placed in the current working directory, which is your home directory in a newly opened terminal. If you prefer installing rspark-tutorial in another directory, cd there first.
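
If you would rather keep the clone somewhere other than your home directory, the cd-then-clone step can be wrapped in a small shell function. This is only a minimal sketch: the function name clone_tutorial and the example path used in the note that follows are hypothetical, chosen for illustration, and are not part of rspark-docker.

```shell
# Sketch only: clone_tutorial is a hypothetical helper, not part of rspark-docker.
# Usage: clone_tutorial TARGET_DIR [REPO_URL]
clone_tutorial() {
  target_dir="$1"
  # Default to the tutorial repo; a different URL can be passed as the second argument.
  repo_url="${2:-https://github.com/jharner/rspark-tutorial.git}"
  (
    # Work in a subshell so the caller's current directory is left unchanged.
    mkdir -p "$target_dir" || exit 1
    cd "$target_dir" || exit 1
    # git clone creates rspark-tutorial/ inside the current directory.
    git clone "$repo_url"
  )
}
```

For example, calling clone_tutorial "$HOME/projects" would place the repo in a projects folder under your home directory, creating that folder first if it does not exist.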

Uploading the rspark-tutorial content to rspark-docker

Assuming that rspark-docker has been started and that you have logged into the RStudio container:

  1. Zip the rspark-tutorial folder on your local computer.

  2. Click on the Files tab in the RStudio container and then click the Upload menu item. Navigate to the zipped version of rspark-tutorial and upload.

  3. Execute the .Rmd files in the various modules/sections to generate R notebooks or R markdown output files.
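
Step 1 in the list above (zipping the folder) can also be done from a terminal on your local computer rather than through a file manager. The following is a minimal sketch, assuming the zip command-line tool is installed and that the clone sits in your home directory unless you pass another location; zip_tutorial is a hypothetical helper name.

```shell
# Sketch only: zip_tutorial is a hypothetical helper, not part of rspark-docker.
# Usage: zip_tutorial [PARENT_DIR]   (PARENT_DIR is the folder that contains rspark-tutorial/)
zip_tutorial() {
  parent_dir="${1:-$HOME}"
  (
    # Work in a subshell so the caller's current directory is left unchanged.
    cd "$parent_dir" || exit 1
    # -r recurses into subdirectories; -q suppresses the per-file listing.
    zip -r -q rspark-tutorial.zip rspark-tutorial
  )
}
```

Running zip_tutorial with no argument writes rspark-tutorial.zip next to the rspark-tutorial folder in your home directory; that file is the one to select in the Upload dialog.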

Once you have cloned rspark-tutorial, you will be able to pull updates by issuing the following command from within the rspark-tutorial directory (folder):

cd rspark-tutorial
git pull origin master

The git pulls will keep your tutorial up to date.

If you want an excellent Git GUI, download and install Sourcetree.
