jacobeisenstein / dpmm
Dirichlet process mixture model code in Matlab. Sampling and variational.
This is a matlab library for Gaussian Dirichlet Process Mixture Models (DPMMs). It includes both variational and Monte Carlo inference.

To test / see how this program works, run demodpmm.m in matlab.

This code was mostly written in 2007. When I found out it was referenced in a paper in 2012, I made a few cosmetic changes and put it on Github. It's not guaranteed to work perfectly. You should check it before using it for anything really important.

Some of the sampling code is built on top of software by Michael Mandel, which was also released under GPL.

=====================================================
COPYRIGHT / LICENSE
=====================================================

Unless otherwise indicated in the specific file, code was written by Jacob Eisenstein, and is copyrighted under the GPL:

Copyright (C) 2007-2012 Jacob Eisenstein

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
Hi all, in line 108, deriv_up is defined as 2 / (n - k + 3/2) while deriv_down is defined as k * n / (n - k + 1). Has anyone actually tried to derive these two expressions? The author mentions that they are guaranteed to produce a positive and a negative derivative, respectively.
I tried deriving them myself, assuming they are linked to the equation for hprime. However, if I set hprime = 0, I get stuck with the harmonic series. Perhaps my approach is wrong and those two expressions are not related to hprime at all.
Update (Here is what I tried):
For HPrime to be greater than 0, I have to approximate the harmonic series as na, which eventually gives the condition that a must be bigger than 1/(2*(n-k-3/2)). From this I can take deriv_up = 1 / (n - k + 3/2) or deriv_up = 2 / (n - k + 3/2) (as written in the code), both of which are bigger than 1/(2*(n-k-3/2)).
For HPrime to be less than 0, I need a*(k-3/2) + 1 + a^2*digamma(a) - a^2*digamma(a+n) < 0. Firstly, a must be a positive value for MATLAB psi to work, and digamma(a) must produce a positive number, so no matter how big k is, as long as a is bigger than k we would get a negative derivative. So letting deriv_down = k + 1 would produce a negative derivative as well. However, I am not sure if my assumptions are correct.
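One way to probe the question numerically, without a closed-form derivation, is to evaluate the expression quoted in the post at the two starting points from the code. The sketch below is in Python rather than Matlab so that it runs with the standard library alone. It assumes that the quoted expression, a*(k-3/2) + 1 + a^2*digamma(a) - a^2*digamma(a+n), has the same sign as hprime; I have not checked that against the source file. The function names are made up for illustration. It relies on the exact identity digamma(a+n) - digamma(a) = sum over i = 0..n-1 of 1/(a+i) for integer n, which is also where the harmonic-like series comes from.

```python
# Python sketch (not the library's Matlab code): sign check of the expression
# quoted in the post, a*(k-3/2) + 1 + a^2*digamma(a) - a^2*digamma(a+n).


def scaled_hprime(a, n, k):
    """Evaluate the poster's expression for a > 0 and integer n >= 1.

    Uses digamma(a+n) - digamma(a) = sum_{i=0}^{n-1} 1/(a+i), so no
    special-function library is needed.
    """
    harmonic_like = sum(1.0 / (a + i) for i in range(n))
    return a * (k - 1.5) + 1.0 - a * a * harmonic_like


def bracket_signs(n, k):
    """Return the expression's value at the two starting points from the post."""
    deriv_up = 2.0 / (n - k + 1.5)      # as quoted from line 108
    deriv_down = k * n / (n - k + 1.0)  # as quoted from line 108
    return scaled_hprime(deriv_up, n, k), scaled_hprime(deriv_down, n, k)


if __name__ == "__main__":
    # Hypothetical (n, k) pairs chosen only for illustration.
    for n, k in [(100, 5), (100, 1), (10, 10)]:
        up, down = bracket_signs(n, k)
        print(f"n={n:3d} k={k:2d}  at deriv_up: {up:+.4f}  at deriv_down: {down:+.4f}")
```

For these few (n, k) pairs the expression is positive at deriv_up and negative at deriv_down, which is consistent with the two values bracketing a root. This is a spot check at chosen values, not a proof that the bracket holds for every n and k.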
In the code, p_prior is written as p_prior = params(it).counts + params(it).alpha * (params(it).counts == 0);
which makes the number of observations in cluster k the probability of joining cluster k.
But I thought it should be divided by (a+N-1), which gives
p_prior = (params(it).counts + params(it).alpha * (params(it).counts == 0))./(params(it).alpha+sum(params(it).counts)-1);
Or is the division redundant?
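A quick way to see whether the division matters: the denominator (a+N-1) is the same for every cluster, so it cancels if the weights are renormalized later. The Python sketch below illustrates this under the assumption that the sampler multiplies p_prior by a per-cluster likelihood and normalizes before drawing an assignment; I have not confirmed that this is what the Matlab code does. The counts, alpha and likelihood values are made up for illustration.

```python
# Python sketch (not the library's Matlab code): a constant denominator in the
# prior cancels once the product prior * likelihood is normalized.


def crp_prior_weights(counts, alpha):
    """Mirror the quoted Matlab line: counts + alpha * (counts == 0)."""
    return [c if c > 0 else alpha for c in counts]


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior(prior, likelihood):
    """Normalized elementwise product of prior weights and likelihoods."""
    return normalize([p * l for p, l in zip(prior, likelihood)])


# Hypothetical values: two occupied clusters and one empty slot.
counts = [4, 2, 0]
alpha = 1.5
likelihood = [0.10, 0.30, 0.05]

unscaled = crp_prior_weights(counts, alpha)
denom = alpha + sum(counts) - 1          # the (a+N-1) term from the question
scaled = [w / denom for w in unscaled]

post_unscaled = posterior(unscaled, likelihood)
post_scaled = posterior(scaled, likelihood)

if __name__ == "__main__":
    print(post_unscaled)
    print(post_scaled)
```

Both versions give the same assignment probabilities, so under that assumption the division would be redundant. It would matter only if p_prior were used somewhere without a later normalization, for example when reporting the prior probability itself.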