DirBN

Demo code for PFA+DirBN from the paper "Dirichlet belief networks for topic structure learning" (NeurIPS 2018; also on arXiv).

Key features:

  1. DirBN discovers topic hierarchies on topic-word distributions.
  2. DirBN flexibly combines with many other topic models.
  3. DirBN enjoys better perplexity and topic coherence, especially for short texts.

Run PFA+DirBN

  1. The code is a mix of Matlab and C++ and has been tested on macOS and Linux (Ubuntu). To run it on Windows, you need to re-compile all the .c files with MEX and a C++ compiler.

  2. Requirements: Matlab 2016b (or later).

  3. We provide the TMN dataset used in the paper, stored in MAT format with the following contents:

  • x: a V-by-N sparse count matrix for N documents over a vocabulary of V words
  • voc: the words in the vocabulary
  • train_idx: the indexes of documents for training
  • test_idx: the indexes of documents for testing

Please prepare your own documents in the above format. If you want to use this dataset, please cite the original papers, which are cited in our paper.

  4. Run PFA_DirBN_demo.m
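
To prepare your own documents in the MAT format described above, something like the following minimal sketch may help. It builds x, voc, train_idx, and test_idx from a tiny made-up corpus. The corpus, the 75/25 split, and the file name my_dataset.mat are all assumptions for illustration, and so are the cell-array type of voc and the row orientation of the index vectors; compare against the provided TMN file before relying on them.

```matlab
% Toy corpus (hypothetical): each document is a cell array of tokens.
docs = { {'stock', 'market', 'falls'}, ...
         {'market', 'rally', 'stock', 'stock'}, ...
         {'team', 'wins', 'match'}, ...
         {'team', 'match', 'rally'} };
N = numel(docs);

% Vocabulary: sorted unique words over all documents.
all_words = [docs{:}];                     % 1-by-(total tokens) cell array
[voc, ~, word_idx] = unique(all_words);    % word_idx maps each token to voc
voc = voc(:);                              % V-by-1 cell array (assumed layout)
V = numel(voc);

% Document index of each token, aligned with all_words.
doc_len = cellfun(@numel, docs);
doc_idx = repelem(1:N, doc_len);

% V-by-N sparse count matrix; sparse() sums repeated (word, doc) pairs.
x = sparse(word_idx(:), doc_idx(:), 1, V, N);

% Hypothetical random 75/25 train/test split over documents.
rng(0);
perm = randperm(N);
n_train = round(0.75 * N);
train_idx = sort(perm(1:n_train));
test_idx = sort(perm(n_train+1:end));

% Save under the variable names listed above (file name is an assumption).
save('my_dataset.mat', 'x', 'voc', 'train_idx', 'test_idx');
```

A quick sanity check is that full(sum(x, 1)) matches the document lengths and that train_idx and test_idx together cover 1:N without overlap.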

Use DirBN with other models

DirBN is a hierarchical construction on top of topic-word distributions and leaves the construction on doc-word distributions untouched. init_DirBN.m, sample_DirBN.m, sample_DirBN_beta.m, and sample_DirBN_counts.m can be viewed as an independent package of DirBN.

To combine DirBN with topic models other than PFA, call init_DirBN.m before inference begins, then call sample_DirBN.m in each iteration after the topic assignments are sampled.

Notes

  1. CRT_sum_mex_matrix_v1.c, CRT_sum_mex_v1.c, Mult_Sparse.c, Multrnd_Matrix_mex_fast_v1.c, PartitionX_v1.m, and Sample_rk.m are borrowed from Mingyuan Zhou's GBN code. If you use these files, please cite the related papers. collapsed_gibbs_topic_assignment_mex.c is modified from the GBN code.

  2. If you find any bugs, please contact me by email ([email protected]).
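
For the Windows re-compilation mentioned under "Run PFA+DirBN", a minimal sketch is to loop over the .c sources and pass each one to mex. This assumes the .c files sit in the current Matlab folder, that each one is a standalone MEX source, and that a supported compiler has already been selected (for example by running mex -setup once); none of this is confirmed by the repository itself.

```matlab
% Run from the folder that holds the .c sources (assumed: current folder).
% If no compiler is configured yet, run "mex -setup" once beforehand.
c_files = dir('*.c');
for k = 1:numel(c_files)
    fprintf('Compiling %s\n', c_files(k).name);
    mex(c_files(k).name);   % builds a MEX binary next to the source file
end
```

If a file fails to compile, the loop stops with the mex error message, which usually names the offending source.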
