nuivall / naive-bayes Goto Github PK
View Code? Open in Web Editor NEWThis project forked from timnugent/naive-bayes
Naïve Bayes classifier with gaussian, multinomial and bernoulli decision rules [machine learning][statistics]
This project forked from timnugent/naive-bayes
Naïve Bayes classifier with gaussian, multinomial and bernoulli decision rules [machine learning][statistics]
Naive Bayes Classifier ---------------------- (c) Tim Nugent Compile by running 'make'. Uses std=c++11 - on older compilers you may need to change this to 'std=c++0x' in the Makefile. Run all tests with 'make test' Run the train/classify tool as follows: ./nb_classify -d 1 training.data test.data Training and test data should be in svm-light/libsvm format, e.g. +1 1:4.12069 2:11.3896 3:18.5742 4:2.85764 5:53.4406 -1 1:3.14565 2:17.4338 3:19.3353 4:2.63431 5:56.4233 Sparse data is not (yet) supported. The -d flag is the decision rule, option 1 = Gaussian (default), 2 = Multinomial and 3 = Bernoulli. -v flag turns verbose mode on - use this to see the classification results. The -a flag passes the smoothing parameter used in the multinomial decision rule (default 1.0, Laplace smoothing). Gaussian example ---------------- Generate Gaussian training and test data as follows: ./generate_gaussian_data 5000 +1 5.3 2.0 12.8 3.1 18.2 1.0 1.2 3.4 56.2 2.3 > training.data ./generate_gaussian_data 5000 -1 7.3 1.0 10.8 1.1 24.1 2.3 0.8 1.2 53.1 0.8 >> training.data ./generate_gaussian_data 1000 +1 5.3 5.0 12.8 5.1 18.2 4.0 1.2 5.4 56.2 4.3 > test.data ./generate_gaussian_data 1000 -1 7.3 4.0 10.8 4.1 24.1 5.3 0.8 4.2 53.1 2.8 >> test.data Arguments are the number of examples, the class label, then pairs of means and standard deviations. Note that I bumped up the standard deviations in the test data to make classification harder. Then train and classify as follows: ./nb_classify -d 1 training.data test.data Multinomial example ------------------- Example based on table 13.1 from here: http://nlp.stanford.edu/IR-book/html/htmledition/naive-bayes-text-classification-1.html echo "+1 1:2 2:1 3:0 4:0 5:0 6:0" > training.data echo "+1 1:2 2:0 3:1 4:0 5:0 6:0" >> training.data echo "+1 1:1 2:0 3:0 4:1 5:0 6:0" >> training.data echo "-1 1:1 2:0 3:0 4:0 5:1 6:1" >> training.data echo "+1 1:3 2:0 3:0 4:0 5:1 6:1" > test.data Then train and classify as follows: ./nb_classify -d 2 training.data test.data Bernoulli example ----------------- Example taken from here: http://www.inf.ed.ac.uk/teaching/courses/inf2b/learnnotes/inf2b-learn-note07-2up.pdf echo "+1 1:1 2:0 3:0 4:0 5:1 6:1 7:1 8:1" > training.data echo "+1 1:0 2:0 3:1 4:0 5:1 6:1 7:0 8:0" >> training.data echo "+1 1:0 2:1 3:0 4:1 5:0 6:1 7:1 8:0" >> training.data echo "+1 1:1 2:0 3:0 4:1 5:0 6:1 7:0 8:1" >> training.data echo "+1 1:1 2:0 3:0 4:0 5:1 6:0 7:1 8:1" >> training.data echo "+1 1:0 2:0 3:1 4:1 5:0 6:0 7:1 8:1" >> training.data echo "-1 1:0 2:1 3:1 4:0 5:0 6:0 7:1 8:0" >> training.data echo "-1 1:1 2:1 3:0 4:1 5:0 6:0 7:1 8:1" >> training.data echo "-1 1:0 2:1 3:1 4:0 5:0 6:1 7:0 8:0" >> training.data echo "-1 1:0 2:0 3:0 4:0 5:0 6:0 7:0 8:0" >> training.data echo "-1 1:0 2:0 3:1 4:0 5:1 6:0 7:1 8:0" >> training.data echo "+1 1:1 2:0 3:0 4:1 5:1 6:1 7:0 8:1" > test.data echo "-1 1:0 2:1 3:1 4:0 5:1 6:0 7:1 8:0" >> test.data ./nb_classify -d 3 training.data test.data Generating dummy data --------------------- See above for Gaussian data. For multinomial data: ./generate_multinomial_data 5000 +1 2 3 4 3 5 > training.data ./generate_multinomial_data 5000 -1 1 4 3 2 4 >> training.data The five weights are used to generate integers with probability weight/sum of weights. For Bernoulli data: ./generate_bernoulli_data 5000 +1 0.7 0.1 0.5 > training.data ./generate_bernoulli_data 5000 -1 0.4 0.2 0.6 >> training.data The three probabilities are used to generate binary variables. Note that the prior class probabilites are calculated based on their ratios in the training data.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.