This project forked from ddf-project/ddf

Distributed DataFrame: Productivity = Power x Simplicity For Big Data Scientists & Engineers

Home Page: http://ddf.io

License: Apache License 2.0

R 5.59% Shell 10.05% Java 56.75% Scala 26.21% Python 1.22% CSS 0.18%

DDF - Distributed DataFrame

DDF aims to make Big Data easy yet powerful, by bringing together the best ideas from R Data Science, RDBMS/SQL, and Big Data distributed processing.

It exposes high-level abstractions such as RDBMS tables, SQL queries, data cleansing and transformations, machine-learning algorithms, and even collaboration and authentication, while hiding the complexities of parallel distributed processing and data handling.

DDF is a general abstraction that can be implemented on multiple execution and data engines. We provide a native implementation on Apache Spark because, at the time of this release, it offers the most expressive DAG parallelization and the most powerful in-memory distributed dataset abstraction (RDD). With this release, DDF provides native Spark support for R, Python, Java, and Scala.

One aim of the DDF project is to focus Big Data conversations on top-down, user-focused simplicity and power, where "users" include business analysts, data scientists, and high-level Big Data engineers.


Directory Structure

| Directory | Description                                |
|-----------|--------------------------------------------|
| bin       | useful helper scripts                      |
| exe       | DDF execution/launch scripts and executables |
| conf      | DDF configuration files                    |
| R         | DDF in R                                   |
| python    | DDF in Python                              |
| core      | DDF core API                               |
| spark     | DDF Spark implementation                   |
| examples  | DDF example API-user code                  |
| project   | Scala build config files                   |
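
To confirm that a checkout contains the directories listed above, a small helper along these lines may be useful. This is a minimal sketch, not part of DDF: the function name `check_ddf_layout` is hypothetical, and the directory list is copied from the table, so it may not match every revision of the repository.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with DDF): report which of the
# top-level directories from the table are missing from a checkout.
# Usage: check_ddf_layout [path-to-checkout]   (defaults to ".")
check_ddf_layout() {
    root="${1:-.}"
    missing=0
    for dir in bin exe conf R python core spark examples project; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing: $dir"
            missing=1
        fi
    done
    # Exit status 0 means every directory was found, 1 means at least one was not.
    return "$missing"
}
```

Calling `check_ddf_layout DDF` right after cloning prints one line per missing directory and returns a non-zero status if any are absent.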

Getting Started

First clone or fork a copy of DDF, e.g.:

$ git clone https://github.com/ddf-project/DDF 

Next, prepare the build. This step prepares the libraries, creates pom.xml in each sub-project directory, and generates the Eclipse .project and .classpath files.

$ cd DDF
$ bin/run-once.sh

If you ever need to regenerate the pom.xml files:

$ bin/make-poms.sh

The following regenerates Eclipse .project and .classpath files:

$ bin/make-eclipse-projects.sh

Building DDF_core or DDF_spark

$ (cd core ; mvn clean package)
$ (cd spark ; mvn clean package)

Running tests

$ bin/sbt test

or

$ (cd core ; mvn test)
$ (cd spark ; mvn test)
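
The per-module Maven commands above can be wrapped in a loop so that a failure in `core` stops the run before `spark` is attempted. The following is a sketch under assumptions: the function name `run_in_modules` and the `MVN` override variable are hypothetical and not part of DDF, and the module list is taken from the commands shown above.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with DDF): run the same Maven goals
# in each module in turn, stopping at the first failure.
# MVN can be set to an alternative command; it defaults to "mvn".
# Run this from the top of the checkout.
run_in_modules() {
    for module in core spark; do
        echo "==> $module: ${MVN:-mvn} $*"
        # Unlike "cd core ; mvn ...", "&&" skips Maven if the cd fails.
        (cd "$module" && "${MVN:-mvn}" "$@") || return 1
    done
}

# Examples:
#   run_in_modules clean package
#   run_in_modules test
```

Each module runs in its own subshell, so the caller's working directory is unchanged afterwards.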

Contributors

huandao0812, khangich, ljzzju, nhanitvn, qinxinwei, binhmop, piccolbo, pzzs, ctn
