jostheim / loom

This project forked from posterior/loom

A streaming cross-cat inference engine

License: BSD 3-Clause "New" or "Revised" License

CMake 0.53% Makefile 0.25% Python 43.77% Shell 0.12% C++ 53.75% Protocol Buffer 1.58%

Loom

Loom is a streaming inference and query engine for the Cross-Categorization model (mansinghka2009cross, shafto2011probabilistic).

Data Types

Loom learns models of sparse heterogeneous tabular data, with hundreds of features and millions of rows. Loom currently supports the following feature types and models:

  • boolean fields as Beta-Bernoulli
  • categorical fields with up to 256 values as Dirichlet-Discrete
  • unbounded categorical fields as Dirichlet-Process-Discrete
  • count fields as Gamma-Poisson
  • real fields as Normal-Inverse-Chi-Squared-Normal
  • sparse real fields as a mixture of degenerate and dense real models
  • text and keyword fields as booleans for word absence/presence
  • date fields as a combination of absolute, relative, and cyclic parts
  • optional fields as a boolean plus one of the above feature models
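The first few feature models listed above each pair a likelihood with a conjugate prior, so prediction reduces to arithmetic on counts and sums. The following is a minimal sketch of three of them in plain Python. It is not Loom's API or that of the distributions library; the function names and the (shape, rate) parametrization of the Gamma prior are assumptions made for illustration.

```python
def beta_bernoulli_predictive(alpha, beta, observations):
    """P(next value is True) under a Beta(alpha, beta) prior (hypothetical helper)."""
    trues = sum(1 for x in observations if x)
    return (alpha + trues) / (alpha + beta + len(observations))


def dirichlet_discrete_predictive(alphas, observations):
    """Predictive probability of each category under a Dirichlet(alphas) prior.

    Observations are integer category indices in range(len(alphas)).
    """
    counts = [0] * len(alphas)
    for x in observations:
        counts[x] += 1
    total = sum(alphas) + len(observations)
    return [(a + c) / total for a, c in zip(alphas, counts)]


def gamma_poisson_predictive_mean(shape, rate, observations):
    """Expected next count under a Gamma(shape, rate) prior on the Poisson rate.

    The (shape, rate) parametrization is an assumption of this sketch.
    """
    return (shape + sum(observations)) / (rate + len(observations))


print(beta_bernoulli_predictive(1.0, 1.0, [True, True, False]))
print(dirichlet_discrete_predictive([1.0, 1.0, 1.0], [0, 0, 2]))
print(gamma_poisson_predictive_mean(1.0, 1.0, [2, 4]))
```

With a uniform Beta(1, 1) prior and two True values out of three, the first call gives (1 + 2) / (2 + 3), i.e. the observed counts smoothed by the prior.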

See input format docs for details.
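To make the text and date entries in the list above more concrete, here is a rough sketch of how such fields could be expanded into simpler features before modeling. It is an illustration only and does not reproduce Loom's input format: the function names, the whitespace tokenization, the 1970-01-01 epoch, and the choice of month and weekday as the cyclic parts are all assumptions.

```python
from datetime import date


def keyword_presence(text, vocabulary):
    """Map a text field to one boolean per vocabulary word (presence/absence)."""
    words = set(text.lower().split())
    return {word: word in words for word in vocabulary}


def date_parts(value, reference=None, epoch=date(1970, 1, 1)):
    """Split a date into absolute, cyclic, and (optionally) relative parts.

    The epoch and the particular cyclic parts are assumptions of this sketch.
    """
    parts = {
        "absolute_days": (value - epoch).days,  # position on a global timeline
        "month": value.month,                   # cyclic: 1..12
        "weekday": value.weekday(),             # cyclic: 0 (Monday) .. 6 (Sunday)
    }
    if reference is not None:
        # Relative part: offset from another date field in the same row.
        parts["relative_days"] = (value - reference).days
    return parts


print(keyword_presence("Fast streaming inference", ["streaming", "batch"]))
print(date_parts(date(2014, 7, 4), reference=date(2014, 7, 1)))
```

Each resulting boolean, count, or real value could then be handled by one of the feature models listed above.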

Data Scale

Loom targets tabular datasets with 100-1000 columns and 10^3-10^9 rows. To handle large datasets, Loom implements subsample annealing (obermeyer2014scaling) with an accelerating annealing schedule, and adaptively turns off ineffective inference strategies. Loom's annealing schedule is tuned to learn 10^8-cell datasets in under an hour and 10^10-cell datasets in under a day (depending on feature type and sparsity).
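As a rough check on these figures, the stated time budgets imply a minimum average throughput. The sketch below only does that arithmetic; the function name is hypothetical, and real throughput will vary with feature type and sparsity.

```python
def min_cells_per_second(cells, seconds):
    """Lowest average throughput that finishes `cells` within `seconds`."""
    return cells / seconds


HOUR = 60 * 60
DAY = 24 * HOUR

# 10^8 cells in under an hour, 10^10 cells in under a day.
print(round(min_cells_per_second(10**8, HOUR)))
print(round(min_cells_per_second(10**10, DAY)))
```

The larger dataset requires a higher average rate per cell, which is consistent with a schedule that accelerates as more data is seen.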

   Full Inference:     Partial Inference:  Greedy Inference:
   structure
   hyperparameters     hyperparameters
   mixtures            mixtures            mixtures
 |-------------------> ------------------> ------------------>
 1   many-passes   ~10^4   accelerate   10^9   single-pass  10^4
row                 rows                rows               row/sec
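One reading of the diagram above is that the set of active inference steps shrinks as the subsample grows: all three kinds of update for small subsamples, no structure updates in the middle range, and only mixture assignment at the largest scale. The sketch below encodes that reading with fixed cutoffs. Loom is described as switching strategies adaptively, so the hard thresholds, the function name, and the constant names are simplifying assumptions rather than Loom's actual schedule.

```python
# Thresholds read off the diagram; treating them as hard cutoffs is an
# assumption of this sketch.
FULL_INFERENCE_MAX_ROWS = 10**4
PARTIAL_INFERENCE_MAX_ROWS = 10**9


def active_kernels(rows_in_subsample):
    """Return the inference steps that run at a given subsample size."""
    if rows_in_subsample < FULL_INFERENCE_MAX_ROWS:
        # Full inference: many passes over a small subsample.
        return ["structure", "hyperparameters", "mixtures"]
    if rows_in_subsample < PARTIAL_INFERENCE_MAX_ROWS:
        # Partial inference: structure learning is switched off.
        return ["hyperparameters", "mixtures"]
    # Greedy inference: a single pass that only assigns rows to mixtures.
    return ["mixtures"]


for rows in (1, 10**6, 10**9):
    print(rows, active_kernels(rows))
```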

Documentation

Authors

Loom is a streaming rewrite of the TARDIS engine developed by Eric Jonas (https://twitter.com/stochastician) at Prior Knowledge, Inc.

Loom relies heavily on the distributions library.

License

Copyright (c) 2014 Salesforce.com, Inc. All rights reserved.
Copyright (c) 2015, Google, Inc.

Licensed under the Revised BSD License. See LICENSE.txt for details.

The PreQL query interface is covered by US patents pending:

  • Application No. 14/014,204
  • Application No. 14/014,221
  • Application No. 14/014,225
  • Application No. 14/014,236
  • Application No. 14/014,241
  • Application No. 14/014,250
  • Application No. 14/014,258

Dependencies

Contributors

fritzo, jglidden-salesforce, jglidden, beaucronin, cap, jostheim
