davidandrzej / chisel
Clojure wrapper for LDA topic modeling in MALLET
License: GNU General Public License v2.0
# chisel

This code provides a Clojure wrapper for Latent Dirichlet Allocation (LDA) [1] topic modeling via the MALLET [2] Java API.

For example, we may wish to run LDA directly against text records in a database without touching the local filesystem, programmatically extract the learned topics, and write them to a different database. The goal of this wrapper code is to make these kinds of tasks easier.

References

[1] Blei, David M., Ng, Andrew Y., and Jordan, Michael I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research (JMLR) 3 (Mar. 2003), 993-1022.

[2] McCallum, Andrew K. "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu. 2002.

## Usage

This wrapper requires that you have already built the MALLET jar and installed it in your local Maven repository (see reference [2] above). You should then be able to build the wrapper with Leiningen and use the chisel namespaces in your own code. For example:

    (ns my-example
      (:use (chisel instances lda)))

    (let [docs {"doc1" "cat cat bat bat"
                "doc2" "cat cat bat bat dog dog"
                "doc3" "dog dog dog"}
          inst (chisel.instances/get-instance-list docs)
          tm (chisel.lda/run-lda inst :T 2 :numiter 50 :topwords 3 :alpha 0.5)]
      (chisel.lda/write-topics tm "example.topics" :topwords 3))

## License

Copyright (c) 2011, Lawrence Livermore National Security, LLC. Produced at the Lawrence Livermore National Laboratory. Written by David Andrzejewski, [email protected] OCEC-10-073 All rights reserved.

This file is part of the C-Cat package and is covered under the terms and conditions therein. See https://github.com/fozziethebeat/C-Cat for details.

This code is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation and distributed hereunder to you.

THIS SOFTWARE IS PROVIDED "AS IS" AND NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED ARE MADE. BY WAY OF EXAMPLE, BUT NOT LIMITATION, WE MAKE NO REPRESENTATIONS OR WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE LICENSED SOFTWARE OR DOCUMENTATION WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS OR OTHER RIGHTS.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.