This project forked from jpmml/jpmml-hive

JPMML-Hive

PMML evaluator library for the Apache Hive data warehouse software (http://hive.apache.org).

Features

Prerequisites

  • Apache Hive version 0.12.0 or newer.

Overview

A working JPMML-Hive setup consists of a library JAR file and one or more model JAR files. The library JAR is centered around the utility class org.jpmml.hive.PMMLUtil, which provides Hive-compliant utility methods for handling the most common PMML evaluation scenarios. A model JAR file contains one or more model launcher classes and a PMML resource.

The main responsibility of a model launcher class is to formalize the "public interface" of a PMML resource. A model launcher class must extend the abstract Hive user-defined function (UDF) class org.apache.hadoop.hive.ql.udf.generic.GenericUDF and provide concrete implementations of the following methods:

  • #initialize(ObjectInspector[]). The initialization of argument types is handled by the method PMMLUtil#initializeArguments(Class, ObjectInspector[]). The initialization of the result type is handled either by the method PMMLUtil#initializeSimpleResult(Class) or PMMLUtil#handleComplexResult(Class).
  • #evaluate(GenericUDF.DeferredObject[]). Handled either by the method PMMLUtil#evaluateSimple(Class, ObjectInspector[], GenericUDF.DeferredObject[]) or PMMLUtil#evaluateComplex(Class, ObjectInspector[], GenericUDF.DeferredObject[]).
  • #getDisplayString(String[]). Handled by the method PMMLUtil#getDisplayString(String, String[]).

All in all, a typical model launcher class can be implemented in 15 to 20 lines of mostly boilerplate Java source code.
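
The delegation pattern described above can be sketched as follows. This is a minimal illustration, not the project's actual source. So that it compiles without Hive or JPMML-Hive on the classpath, it declares simplified stand-ins for ObjectInspector, DeferredObject, GenericUDF and PMMLUtil. The stand-in method bodies are placeholders, the return types are assumptions, and the launcher name IrisSpeciesLauncher is hypothetical. In a real model JAR the stand-ins are replaced by the Hive and JPMML-Hive classes of the same names, and the overridden methods also declare the exceptions required by GenericUDF.

```java
// Simplified stand-ins (assumptions) wrapped in one class so the sketch is self-contained.
final class LauncherSketch {

    // Stand-in for Hive's ObjectInspector type.
    public interface ObjectInspector {}

    // Stand-in for GenericUDF.DeferredObject.
    public interface DeferredObject {
        Object get();
    }

    // Stand-in for org.apache.hadoop.hive.ql.udf.generic.GenericUDF.
    public abstract static class GenericUDF {
        public abstract ObjectInspector initialize(ObjectInspector[] arguments);
        public abstract Object evaluate(DeferredObject[] arguments);
        public abstract String getDisplayString(String[] children);
    }

    // Stand-in for org.jpmml.hive.PMMLUtil. Method names and parameter lists follow
    // the README; return types and bodies are placeholders, not the real behavior.
    public static final class PMMLUtil {
        public static void initializeArguments(Class<?> clazz, ObjectInspector[] inspectors) {
            // The real method validates the argument types; nothing to do here.
        }

        public static ObjectInspector initializeSimpleResult(Class<?> clazz) {
            return new ObjectInspector() {};
        }

        public static Object evaluateSimple(Class<?> clazz, ObjectInspector[] inspectors,
                DeferredObject[] parameters) {
            return "placeholder result";
        }

        public static String getDisplayString(String name, String[] parameters) {
            return name + "(" + String.join(", ", parameters) + ")";
        }
    }

    // Hypothetical model launcher: every method only forwards to PMMLUtil.
    public static final class IrisSpeciesLauncher extends GenericUDF {

        // Kept because evaluateSimple takes the inspectors as its second argument.
        private ObjectInspector[] inspectors;

        @Override
        public ObjectInspector initialize(ObjectInspector[] inspectors) {
            this.inspectors = inspectors;
            PMMLUtil.initializeArguments(IrisSpeciesLauncher.class, inspectors);
            return PMMLUtil.initializeSimpleResult(IrisSpeciesLauncher.class);
        }

        @Override
        public Object evaluate(DeferredObject[] parameters) {
            return PMMLUtil.evaluateSimple(IrisSpeciesLauncher.class, this.inspectors, parameters);
        }

        @Override
        public String getDisplayString(String[] parameters) {
            return PMMLUtil.getDisplayString("IrisSpeciesLauncher", parameters);
        }
    }
}
```

A launcher that returns a struct would presumably follow the same shape, using the complex-result counterparts named in the list above.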

The example model JAR file contains a DecisionTree model for the "iris" dataset. This model is exposed in two ways. First, the model launcher class org.jpmml.hive.DecisionTreeIris defines a custom function that returns the PMML target field ("Species") together with four output fields ("Predicted_Species", "Probability_setosa", "Probability_versicolor", "Probability_virginica") as a struct. Second, the model launcher class org.jpmml.hive.DecisionTreeIris_Species defines a custom function that returns the PMML target field ("Species") as a string.

Installation

Enter the project root directory and build using Apache Maven (http://maven.apache.org/):

mvn clean install

The build produces two JAR files:

  • pmml-hive/target/pmml-hive-runtime-1.0-SNAPSHOT.jar - Library uber-JAR file. It contains the classes of the library JAR file pmml-hive/target/pmml-hive-1.0-SNAPSHOT.jar, plus all the classes of its transitive dependencies.
  • pmml-hive-example/target/pmml-hive-example-1.0-SNAPSHOT.jar - Example model JAR file.

Usage

Library

Installation

Add the library uber-JAR file to the Hive classpath:

ADD JAR /tmp/pmml-hive-runtime-1.0-SNAPSHOT.jar;

Example model

Installation

Add the example model JAR file to the Hive classpath:

ADD JAR /tmp/pmml-hive-example-1.0-SNAPSHOT.jar;

Declare custom functions based on UDF implementation classes:

CREATE TEMPORARY FUNCTION DecisionTreeIris AS 'org.jpmml.hive.DecisionTreeIris';
CREATE TEMPORARY FUNCTION DecisionTreeIris_Species AS 'org.jpmml.hive.DecisionTreeIris_Species';

Usage

Execute a custom function using a list of scalar arguments:

SELECT DecisionTreeIris(5.1, 3.5, 1.4, 0.2);

Execute a custom function using a struct argument:

SELECT DecisionTreeIris(named_struct('Sepal_Length', 5.1, 'Sepal_Width', 3.5, 'Petal_Length', 1.4, 'Petal_Width', 0.2));
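
When such queries are assembled programmatically, for example before being submitted from a Java client, both call forms can be generated from the input values. The helper below is a hypothetical sketch and not part of JPMML-Hive. The class name HiveCallBuilder and its methods are made up for illustration, and it assumes plain numeric inputs like those in the examples above.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of JPMML-Hive) that renders the two call forms as HiveQL text.
final class HiveCallBuilder {

    private HiveCallBuilder() {
    }

    // Scalar form: SELECT function(v1, v2, ...);
    static String scalarCall(String function, double... values) {
        StringJoiner args = new StringJoiner(", ");
        for (double value : values) {
            args.add(Double.toString(value));
        }
        return "SELECT " + function + "(" + args + ");";
    }

    // Struct form: SELECT function(named_struct('name1', v1, 'name2', v2, ...));
    // Fields are emitted in the iteration order of the map.
    static String structCall(String function, Map<String, Double> fields) {
        StringJoiner args = new StringJoiner(", ");
        for (Map.Entry<String, Double> field : fields.entrySet()) {
            String name = field.getKey();
            // Reject characters that would need escaping inside a quoted literal.
            if (name.indexOf('\'') >= 0 || name.indexOf('\\') >= 0) {
                throw new IllegalArgumentException("Unsupported field name: " + name);
            }
            args.add("'" + name + "'").add(field.getValue().toString());
        }
        return "SELECT " + function + "(named_struct(" + args + "));";
    }
}
```

Passing a LinkedHashMap to structCall keeps the fields in insertion order, so the generated statement lists them in the same order as the example above.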

License

JPMML-Hive is dual-licensed under the GNU Affero General Public License (AGPL) version 3.0 (http://www.gnu.org/licenses/agpl-3.0.html) and a commercial license.

Additional information

Please contact [email protected]