jzfsdev / learning-spark-with-java

This project forked from spirom/learning-spark-with-java

Self-contained examples using Apache Spark with the functional features of Java 8

License: MIT License

Java 100.00%

Learning Spark with Java

This project contains snippets of Java code that illustrate various Apache Spark concepts. It is intended to help you get started with learning Apache Spark (as a Java programmer) by providing a super easy on-ramp that doesn't involve cluster configuration, building from source, or installing Spark or Hadoop. Many of these activities will be necessary later in your learning experience, after you've used these examples to achieve basic familiarity.

The project is intended to accompany a number of posts on the blog A River of Bytes.

The basic approach used in this project is to create multiple small, free-standing example programs that each illustrate an aspect of Spark usage, and to use code comments to explain as many details as seem useful to beginning Spark programmers.

Dependencies

The project is based on Apache Spark 2.2.0 and Java 8.

Warning: In Spark 2.2, support for Java 7 is finally gone. This is documented in the Spark 2.2.0 release notes, but alas not in the corresponding JIRA ticket, SPARK-19493.

Related projects

This project is derived from the LearningSpark project, which had the same goals but for Scala programmers. In that project you can also find the early Java 7 examples that gave rise to this project: much Spark programming is far less painful in Java 8 than in Java 7.
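To give a feel for the Java 7 versus Java 8 difference, here is a minimal sketch that needs no Spark on the classpath. It is not taken from this project: the class `LambdaComparison` and the helper `mapAll` are hypothetical, and `java.util.function.Function` stands in for the function interfaces that Spark's Java API accepts. The "Java 7 style" method only imitates the anonymous-inner-class syntax that Java 7 required; the interface it implements is itself a Java 8 one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class LambdaComparison {

    // Hypothetical stand-in for a map-style transformation:
    // applies fn to every element and collects the results in order.
    public static <T, R> List<R> mapAll(List<T> input, Function<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T item : input) {
            out.add(fn.apply(item));
        }
        return out;
    }

    // Java 7 style: the function is passed as an anonymous inner class.
    public static List<Integer> lengthsJava7Style(List<String> words) {
        return mapAll(words, new Function<String, Integer>() {
            @Override
            public Integer apply(String word) {
                return word.length();
            }
        });
    }

    // Java 8 style: the same function written as a lambda.
    public static List<Integer> lengthsJava8Style(List<String> words) {
        return mapAll(words, word -> word.length());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("spark", "java", "rdd");
        System.out.println(lengthsJava7Style(words)); // prints [5, 4, 3]
        System.out.println(lengthsJava8Style(words)); // prints [5, 4, 3]
    }
}
```

Both methods compute the same result; the lambda version simply drops the boilerplate around the single line of real logic, which is the kind of saving that adds up across a chain of transformations.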

The spark-streaming-with-kafka project is based on Spark's Scala APIs and illustrates the use of Spark with Apache Kafka, using a similar approach: small free-standing example programs.

The spark-data-sources project is focused on the new experimental APIs introduced in Spark 2.3.0 for developing adapters for external data sources of various kinds. This API is essentially a Java API (developed in Java) to avoid forcing developers to adopt Scala for their data source adapters. Consequently, the example data sources in this project are written in Java, but both Java and Scala usage examples are provided.

Contents

| Package | What's Illustrated |
| --- | --- |
| rdd | The JavaRDD: core Spark data structure -- see the local README.md in that directory for details. |
| pairs | A special RDD for the common case of pairs of values -- see the local README.md in that directory for details. |
| dataset | A range of Dataset examples (queryable collection that is statically typed) -- see the local README.md in that directory for details. |
| dataframe | A range of DataFrame/Dataset examples (queryable collection that is dynamically typed) -- see the local README.md in that directory for details. |
| streaming | A range of streaming examples -- see the local README.md in that directory for details. |
