openshift-spark's Introduction

DISCONTINUATION OF PROJECT.

This project will no longer be maintained by Intel.

This project has been identified as having known security escapes.

Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.

Intel no longer accepts patches to this project.

Apache Spark images for OpenShift

This repository contains files for building Apache Spark container images intended for use on OpenShift Origin.

By default, it will build the following images into your local Docker registry:

  • openshift-spark, Apache Spark, Python 2.7
  • openshift-spark-py36, Apache Spark, Python 3.6

For Spark versions, please see the image.yaml file.

Instructions

Build

Prerequisites

Procedure

Create all images and save them in the local Docker registry.

make

Push

Tag and push the images to the designated reference.

make push SPARK_IMAGE=[REGISTRY_HOST[:REGISTRY_PORT]/]NAME[:TAG]
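The bracketed parts of the SPARK_IMAGE reference are optional. As an illustration only, the sketch below assembles a reference from its pieces. The function name `spark_image_ref` and the registry name in the usage note are hypothetical and not part of this repository.

```shell
# Compose an image reference of the form [HOST[:PORT]/]NAME[:TAG].
# Empty arguments are skipped, mirroring the optional bracketed parts.
# spark_image_ref is a hypothetical helper, not part of the Makefile.
spark_image_ref() {
    host=$1
    port=$2
    name=$3
    tag=$4
    ref=$name
    if [ -n "$tag" ]; then
        ref="$ref:$tag"
    fi
    if [ -n "$host" ]; then
        if [ -n "$port" ]; then
            host="$host:$port"
        fi
        ref="$host/$ref"
    fi
    printf '%s\n' "$ref"
}
```

For example, `make push SPARK_IMAGE="$(spark_image_ref registry.example.com 5000 myorg/openshift-spark latest)"` would target the made-up registry `registry.example.com:5000`.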

Customization

There are several ways to customize the construction and build process. This project uses GNU Make for the build workflow; see the Makefile for more information. For container specification and construction, it uses the container image creation tool concreate; see the image.yaml file for more information.

Partial images without an Apache Spark distribution installed

This repository also supports building 'incomplete' versions of the images, which contain tooling for OpenShift but lack an actual Spark distribution. An s2i workflow can be used with these partial images to install a Spark distribution of the user's choosing. This gives users an alternative to checking out the repository and modifying build files if they want to run a custom Spark distribution. By default, the following partial images are built:

  • openshift-spark-inc, Apache Spark, Python 2.7
  • openshift-spark-inc-py36, Apache Spark, Python 3.6

Build

To build the partial images, use make with Makefile.inc:

make -f Makefile.inc

Push

Tag and push the images to the designated reference.

make -f Makefile.inc push SPARK_IMAGE=[REGISTRY_HOST[:REGISTRY_PORT]/]NAME[:TAG]

Image Completion

To produce a final image, perform a source-to-image build that takes a Spark distribution as input. This can be done in OpenShift, or locally with the s2i tool if it is installed. The final images can be used just like the openshift-spark and openshift-spark-py36 images described above.

Build inputs

The OpenShift method can take either local files or a URL as build input. For the s2i method, local files are required. Here is an example which downloads an Apache Spark distribution to a local 'build-input' directory (including the sha512 file is optional).

$ mkdir build-input
$ wget https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz -O build-input/spark-2.4.0-bin-hadoop2.7.tgz
$ wget https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz.sha512 -O build-input/spark-2.4.0-bin-hadoop2.7.tgz.sha512
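If you want to check the download yourself before starting a build, something like the following sketch may help. It assumes GNU coreutils `sha512sum` is available and that the checksum file holds the digest as hex, possibly split into space-separated groups across lines and prefixed with the file name and a colon. The function name `verify_sha512` is hypothetical.

```shell
# Compare a file's SHA-512 digest with the digest recorded in a checksum file.
# Assumption: the checksum file contains the hex digest, optionally grouped
# with spaces over several lines and prefixed with "<filename>:".
verify_sha512() {
    file=$1
    sumfile=${2:-$1.sha512}
    # Digest of the downloaded file as lowercase hex.
    actual=$(sha512sum "$file" | cut -d' ' -f1)
    # A SHA-512 digest is 128 hex characters; anything else means hashing failed.
    if [ "${#actual}" -ne 128 ]; then
        echo "could not hash: $file" >&2
        return 1
    fi
    # Drop an optional "<filename>:" prefix, strip whitespace, lowercase the hex.
    expected=$(sed 's/^[^:]*://' "$sumfile" | tr -d ' \t\r\n' | tr 'A-F' 'a-f')
    # Substring match, so "<digest>  <filename>" style files are accepted too.
    case $expected in
        *"$actual"*) echo "OK: $file" ;;
        *) echo "MISMATCH: $file" >&2; return 1 ;;
    esac
}
```

With the files from the example above, `verify_sha512 build-input/spark-2.4.0-bin-hadoop2.7.tgz` would look for the checksum in `build-input/spark-2.4.0-bin-hadoop2.7.tgz.sha512`.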

Optionally, your build-input directory may contain a modify-spark directory. Its structure should mirror the structure of the top-level directory in the Spark distribution tarball. During installation, the contents of this directory are copied into the Spark installation using rsync, allowing you to add or overwrite files. To add my.jar to Spark, for example, put it in build-input/modify-spark/jars/my.jar.
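The modify-spark layout can be staged by hand with mkdir and cp. As a minimal sketch, the hypothetical helper below (the name `stage_spark_file` is not part of this repository) copies one file into the matching subdirectory.

```shell
# Copy a file into <input_dir>/modify-spark/<subdir>/ so that the install step
# places it in the same subdirectory of the Spark installation.
# stage_spark_file is a hypothetical helper for illustration.
stage_spark_file() {
    src=$1
    subdir=$2                      # e.g. "jars"
    input_dir=${3:-build-input}    # defaults to the directory used above
    dest="$input_dir/modify-spark/$subdir"
    mkdir -p "$dest" && cp "$src" "$dest/"
}
```

Running `stage_spark_file my.jar jars` places the file at build-input/modify-spark/jars/my.jar.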

Running the image completion

To complete the Python 2.7 image using the s2i tool:

$ s2i build build-input radanalyticsio/openshift-spark-inc openshift-spark
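The Python 3.6 image can presumably be completed the same way. The sketch below only prints the s2i command so it can be reviewed before running. It assumes the py36 partial image is published under the same `radanalyticsio/` prefix as the Python 2.7 one, which is not confirmed here. The function name `s2i_complete_cmd` is hypothetical.

```shell
# Print (do not run) the s2i command that completes a partial image.
# Assumption: the py36 partial image lives under the same "radanalyticsio/"
# prefix as the Python 2.7 partial image.
s2i_complete_cmd() {
    variant=$1                     # "" for Python 2.7, "py36" for Python 3.6
    input_dir=${2:-build-input}
    suffix=${variant:+-$variant}   # "-py36" when a variant is given, else empty
    printf 's2i build %s radanalyticsio/openshift-spark-inc%s openshift-spark%s\n' \
        "$input_dir" "$suffix" "$suffix"
}
```

Calling `s2i_complete_cmd py36` prints the command for the Python 3.6 variant. With an empty first argument it prints the Python 2.7 command shown above.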

To complete the Python 2.7 image using OpenShift, for example:

$ oc new-build --name=openshift-spark --docker-image=radanalyticsio/openshift-spark-inc --binary
$ oc start-build openshift-spark --from-file=https://archive.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz

Note that the value of `--from-file` could also be the `build-input` directory from the s2i example above.

This will write the completed image to an imagestream called openshift-spark in the current project.

A 'usage' command for all images

All of the images described here respond to a 'usage' command for reference. For example:

$ docker run --rm openshift-spark:latest usage
