Giter Site home page Giter Site logo

chisel's Introduction

# chisel

This code provides a Clojure wrapper to do Latent Dirichlet Allocation
(LDA) [1] topic modeling via the MALLET [2] Java API.  For example, we
may wish to run LDA directly against text records in a database
without touching the local filesystem, programmatically extract the
learned topics, and write these to a different database.  The goal of
this wrapper code is to make these kind of tasks easier.

References 

[1] Blei, David M., Ng, Andrew Y., and Jordan, Michael
I. (2003). Latent Dirichlet Allocation.  Journal of Machine Learning
Research (JMLR) 3 (Mar. 2003), 993-1022.

[2] McCallum, Andrew K.  "MALLET: A Machine Learning for Language
Toolkit."  http://mallet.cs.umass.edu. 2002.


## Usage

This wrapper requires you to have already built and installed the
MALLET jar to your local maven repo (see the above link [2]).

You should then be able to build the wrapper with leiningen and use
the chisel namespaces in your own code.  For example:

(ns my-example
 (:use (chisel instances lda)))

 (let [docs {"doc1" "cat cat bat bat"
             "doc2" "cat cat bat bat dog dog"
             "doc3" "dog dog dog"}
       inst (chisel.instances/get-instance-list docs)
       tm (chisel.lda/run-lda inst :T 2 :numiter 50 :topwords 3 :alpha 0.5)]
  (chisel.lda/write-topics tm "example.topics" :topwords 3))


## License

Copyright (c) 2011, Lawrence Livermore National Security, LLC. Produced at
the Lawrence Livermore National Laboratory. Written by David Andrzejewski,
[email protected] OCEC-10-073 All rights reserved.

This file is part of the C-Cat package and is covered under the terms
and conditions therein.  See https://github.com/fozziethebeat/C-Cat
for details.

This code is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation and distributed hereunder to
you.

THIS SOFTWARE IS PROVIDED "AS IS" AND NO REPRESENTATIONS OR
WARRANTIES, EXPRESS OR IMPLIED ARE MADE. BY WAY OF EXAMPLE, BUT NOT
LIMITATION, WE MAKE NO REPRESENTATIONS OR WARRANTIES OF MERCHANT-
ABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE
LICENSED SOFTWARE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY
PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.