mltk-algo-contrib

This repo contains custom algorithms for use with the Splunk Machine Learning Toolkit. The repo itself is also a Splunk app. Custom algorithms can be added to the Splunk Machine Learning Toolkit by adhering to the ML-SPL API. The API is a thin wrapper around machine learning estimators provided by libraries such as:

and custom algorithms.

Note that this repo is a collection of custom algorithms only, and not any libraries. Any libraries required should only be added to live environments manually and not to this repo.

A comprehensive guide to using the ML-SPL API can be found here.

A very simple example:

from base import BaseAlgo


class CustomAlgorithm(BaseAlgo):
    def __init__(self, options):
        # Option checking & initializations here
        pass

    def fit(self, df, options):
        # Fit an estimator to df, a pandas DataFrame of the search results
        pass

    def partial_fit(self, df, options):
        # Incrementally fit a model
        pass

    def apply(self, df, options):
        # Apply a saved model
        # Modify df, a pandas DataFrame of the search results
        return df

    @staticmethod
    def register_codecs():
        # Add codecs to the codec manager
        pass
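
To make the skeleton above more concrete, here is a minimal sketch of a hypothetical algorithm, MeanCenter, that learns the mean of one field in fit and subtracts it in apply. The class name, the output column name, and the assumption that options is a dict carrying the field list under 'feature_variables' are illustrative and not taken from this repo. The stand-in BaseAlgo is only there so the sketch can be tried outside Splunk, and the code sticks to column indexing and assignment so that a plain dict of lists can stand in for the pandas DataFrame.

```python
try:
    from base import BaseAlgo  # available when running inside the MLTK
except ImportError:
    class BaseAlgo(object):
        """Stand-in so this sketch can be tried outside Splunk."""


class MeanCenter(BaseAlgo):
    """Hypothetical algorithm: subtract the fitted mean from one field."""

    def __init__(self, options):
        # Assumption: options is a dict listing the fields to use
        # under 'feature_variables'.
        fields = options.get('feature_variables', [])
        if len(fields) != 1:
            raise ValueError('MeanCenter expects exactly one field')
        self.field = fields[0]
        self.mean = None

    def fit(self, df, options):
        # Learn the mean of the chosen field from the search results
        values = [float(v) for v in df[self.field]]
        self.mean = sum(values) / len(values)

    def apply(self, df, options):
        # Add a new column holding the mean-centered values
        if self.mean is None:
            raise RuntimeError('fit must be called before apply')
        column = 'centered(%s)' % self.field
        df[column] = [float(v) - self.mean for v in df[self.field]]
        return df
```

Outside Splunk, the class can be exercised by calling fit and then apply on something like {'x': [1, 2, 3]}. A real contribution would typically also implement register_codecs if the model needs to be saved.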

Dependencies

To use the custom algorithms contained in this app, you must also have installed:

Usage

This repository contains public contributions, and Splunk is not responsible for guaranteeing the correctness or validity of the algorithms. Splunk is in no way responsible for vetting the contents of contributed algorithms.

Deploying

To use the custom algorithms in this repository, you must deploy them as a Splunk app.

There are two ways to do this.

Manual copying

You can simply copy the following directories under src:

  • bin
  • default
  • metadata

to:

  • ${SPLUNK_HOME}/etc/apps/SA_mltk_contrib_app (you will need to create the directory first)
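
The manual copy described above could also be scripted. The following is a minimal sketch using only the Python standard library (Python 3.8 or later for dirs_exist_ok). The function name copy_app and its parameters are hypothetical and not part of this repo.

```python
import os
import shutil

# Directories under src that make up the app, as listed above
APP_DIRS = ('bin', 'default', 'metadata')


def copy_app(src_root, splunk_home, app_name='SA_mltk_contrib_app'):
    """Copy the app directories from src_root into the Splunk apps folder."""
    app_dir = os.path.join(splunk_home, 'etc', 'apps', app_name)
    os.makedirs(app_dir, exist_ok=True)  # create the app directory first
    for name in APP_DIRS:
        shutil.copytree(
            os.path.join(src_root, name),
            os.path.join(app_dir, name),
            dirs_exist_ok=True,  # allow re-running over an earlier copy
        )
    return app_dir
```

For example, copy_app('src', os.environ['SPLUNK_HOME']) would be run from the repository root.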

OR

Build and install

1. Build the app:

You will need to install tox. See Test Prerequisites

tox -e package-macos        # if on Mac
tox -e package-linux        # if on Linux
  • The resulting gzipped tarball will be in the target directory (e.g. target/SA_mltk_contrib_app.tgz).

    • The location of the gzipped tarball can be overridden by the BUILD_DIR environment variable.
  • The default app name will be SA_mltk_contrib_app, but this can be overridden by the APP_NAME environment variable.

  • NOTE: You can run tox -e clean to remove the target directory.

2. Install the tarball:

Contributing

This repository was specifically made for your contributions! See Contributing for more details.

Developing

To start developing, you will need to have Splunk installed. If you don't, read more here.

  1. Clone the repo and cd into the directory:
git clone https://github.com/splunk/mltk-algo-contrib.git
cd mltk-algo-contrib
  2. Symlink the src directory to the apps folder in Splunk and restart splunkd:
ln -s "$(pwd)/src" $SPLUNK_HOME/etc/apps/SA_mltk_contrib_app
$SPLUNK_HOME/bin/splunk restart
  • This will eliminate the need to deploy the app to test changes.
  3. Add your new algorithm(s) to src/bin/algos_contrib. (See SVR.py for an example.)

  4. Add a new stanza to src/default/algos.conf:

[<your_algo>]
package=algos_contrib
  • NOTE: Due to the way configuration file layering works in Splunk, the package name must be overridden in each section, and not in the default section.
  5. Add your tests to src/bin/algos_contrib/tests/test_<your_algo>.py. (See test_svr.py for an example.)
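
The algos.conf stanza described above can also be added programmatically. This is a minimal sketch that treats the file as INI-style text and uses Python's configparser. The helper name add_algo_stanza is hypothetical, and configparser is assumed to be close enough to Splunk's .conf syntax for a simple file like this one. Note that it writes package into the algorithm's own stanza rather than into the default section, in line with the note above.

```python
import configparser
import io


def add_algo_stanza(conf_text, algo_name, package='algos_contrib'):
    """Return conf_text with a [algo_name] stanza that sets package."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section(algo_name):
        parser.add_section(algo_name)
    # Set the package in this stanza, not in the default section
    parser.set(algo_name, 'package', package)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

Because configparser rewrites the whole file, comments in the original text are not preserved. For a file with comments, appending the two stanza lines by hand is the safer choice.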

Running Tests

Prerequisites

  1. Install tox:

    pip install tox
  2. Install tox-pip-extensions:

    pip install tox-pip-extensions
    • NOTE: You only need this if you do not want to recreate the virtualenv(s) manually with tox -r every time you update a requirements*.txt file, but it is recommended for convenience.
  3. You must also have the following environment variable set to your Splunk installation directory (e.g. /opt/splunk):

    • SPLUNK_HOME
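
Before running the tests, the SPLUNK_HOME prerequisite can be checked with a few lines of Python. This is a minimal sketch, and the helper name resolve_splunk_home is hypothetical. It only verifies that the variable is set and points to an existing directory, not that the directory is a working Splunk installation.

```python
import os


def resolve_splunk_home(environ=None):
    """Return SPLUNK_HOME, or raise if it is unset or not a directory."""
    env = os.environ if environ is None else environ
    home = env.get('SPLUNK_HOME')
    if not home:
        raise RuntimeError('SPLUNK_HOME is not set')
    if not os.path.isdir(home):
        raise RuntimeError('SPLUNK_HOME is not a directory: %s' % home)
    return home
```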

Using tox

To run all tests, run the following command in the root source directory:

tox

To run a single test, you can provide the directory or a file as a parameter:

tox src/bin/algos_contrib/tests/
tox src/bin/algos_contrib/tests/test_example_algo.py
...

Any arguments passed to tox are passed on as arguments to the pytest command. To pass in options, use double dashes (--):

tox -- -k "example"     # Run tests that match the keyword 'example'
tox -- -x               # Stop after the first failure
tox -- -s               # Show stdout/stderr (i.e. disable capturing)
...

Using Python REPL (Interactive Interpreter)

$ python   # from src/bin directory
>>> # Add the MLTK to our sys.path
>>> from link_mltk import add_mltk
>>> add_mltk()
>>>
>>> # Import our algorithm class
>>> from algos_contrib.ExampleAlgo import ExampleAlgo
... (some warning from Splunk may show up)
>>>
>>> # Use utilities to catch common mistakes
>>> from test.contrib_util import AlgoTestUtils
>>> AlgoTestUtils.assert_algo_basic(ExampleAlgo, serializable=False)

Package/File Naming

Files and packages under the test directory should avoid having names that conflict with files or directories directly under:

$SPLUNK_HOME/etc/apps/Splunk_ML_Toolkit/bin
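
A quick way to spot such conflicts is to compare the two directory listings. The sketch below is a hypothetical helper, not part of this repo. It assumes that a conflict means two entries would import under the same module name (for example util.py and a util directory), and it skips entries starting with a double underscore, such as __init__.py.

```python
import os


def find_name_conflicts(test_dir, mltk_bin_dir):
    """Return entries in test_dir whose module name exists in mltk_bin_dir."""

    def module_name(entry):
        # 'util.py' and a 'util' directory both import as 'util'
        return os.path.splitext(entry)[0]

    taken = {module_name(e) for e in os.listdir(mltk_bin_dir)}
    return sorted(
        e for e in os.listdir(test_dir)
        if not e.startswith('__') and module_name(e) in taken
    )
```

Here test_dir would be the test directory in this repo and mltk_bin_dir the Splunk_ML_Toolkit/bin path shown above. An empty list means no conflicts were found under this assumption.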

Pull requests

Once you've finished what you're adding, make a pull request.

Bugs? Issues?

Please file issues with any information that might be needed to:

  • reproduce what you're experiencing
  • understand the problem fully

License

The algorithms hosted, as well as the app itself, are licensed under the permissive Apache 2.0 license.

Any additions to this repository must be under one of these licenses:

  • MIT
  • BSD
  • Apache 2.0

