Giter Site home page Giter Site logo

reuben3ym / spark-timeseries Goto Github PK

View Code? Open in Web Editor NEW

This project forked from sryza/spark-timeseries

0.0 1.0 0.0 1.95 MB

A library for time series analysis on Apache Spark

License: Apache License 2.0

Scala 75.98% Python 14.82% Makefile 1.00% Java 8.20%

spark-timeseries's Introduction

Build Status

Time Series for Spark (The spark-ts Package)

A Scala / Java / Python library for interacting with time series data on Apache Spark.

Post questions and comments to the Google group, or email them directly to mailto:[email protected].

Note: The spark-ts library is no longer under active development by me (Sandy). I unfortunately no longer have bandwidth to develop features, answer all questions on the mailing list, or fix all bugs that are filed.

That said, I remain happy to review pull requests and do whatever I can to aid others in advancing the library.

Docs are available at http://sryza.github.io/spark-timeseries.

Or check out the Scaladoc, Javadoc, or Python doc.

The aim here is to provide

  • A set of abstractions for manipulating large time series data sets, similar to what's provided for smaller data sets in Pandas, Matlab, and R's zoo and xts packages.
  • Models, tests, and functions that enable dealing with time series from a statistical perspective, similar to what's provided in StatsModels and a variety of Matlab and R packages.

The library sits on a few other excellent Java and Scala libraries.

Using this Repo

Building

We use Maven for building Java / Scala. To compile, run tests, and build jars:

mvn package

To run Python tests (requires nose):

cd python
export SPARK_HOME=<location of local Spark installation>
nosetests

Running

To run a spark-shell with spark-ts and its dependencies on the classpath:

spark-shell --jars target/sparkts-$VERSION-SNAPSHOT-jar-with-dependencies.jar

Releasing

To publish docs, easiest is to clone a separate version of this repo in some location we'll refer to as DOCS_REPO. Then:

# Build main doc
mvn site -Ddependency.locations.enabled=false

# Build scaladoc
mvn scala:doc

# Build javadoc
mvn javadoc:javadoc

# Build Python doc
cd python
export SPARK_HOME=<location of local Spark installation>
export PYTHONPATH=$PYTHONPATH::$SPARK_HOME/python:$SPARK_HOME/python/lib/*
make html
cd ..

cp -r target/site/* $DOCS_REPO
cp -r python/build/html/ $DOCS_REPO/pydoc
cd $DOCS_REPO
git checkout gh-pages
git add -A
git commit -m "Some message that includes the hash of the relevant commit in master"
git push origin gh-pages

To build a Python source distribution, first build with Maven, then:

cp target/sparkts-$VERSION-SNAPSHOT-jar-with-dependencies.jar python/sparkts/
cd python
python setup.py sdist

To release Java/Scala packages (based on http://oryxproject.github.io/oryx/docs/how-to-release.html):

mvn -Darguments="-DskipTests" -DreleaseVersion=$VERSION \
    -DdevelopmentVersion=$VERSION-SNAPSHOT release:prepare

mvn -s private-settings.xml -Darguments="-DskipTests" release:perform

To release Python packages (based on http://peterdowns.com/posts/first-time-with-pypi.html):

python setup.py register -r pypi
python setup.py sdist upload -r pypi

spark-timeseries's People

Contributors

sryza avatar ahmed-mahran avatar josepablocam avatar simonouellette35 avatar cdalzell avatar mbaddar1 avatar pegli avatar rchanda avatar dmsuehir avatar srowen avatar souellette-faimdata avatar polymorphic avatar mbaddar2 avatar gliptak avatar sunheehnus avatar ekote avatar jrachiele avatar lnicalo avatar rcastbergw avatar tiagodinisfonseca avatar wgorman avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.