Removing File-IO

SRLBoost

A package for learning Statistical Relational Models with Gradient Boosting, forked for use as srlearn's core.

It's basically BoostSRL but half the size and significantly faster.

Graphs at commit cb952a4, measured with cloc-1.84.

How much faster?

(Smaller numbers are better.)

This box plot compares the learning time (in seconds) for three data sets and three implementations of learning relational dependency networks. BoostSRL-Lite was built from the repository on GitHub, and BoostSRL_v1.1.1 is the latest official release.

Each data set included 4-5 cross validation folds, and these results were averaged over 10 runs. This appears to suggest that SRLBoost is at least twice as fast as other implementations.

With some parameter tuning we have sped this up even further.

The tiny bar on the left shows the average SRLBoost time for Cora is around 17 seconds, compared to around 4.5 minutes for BoostSRL-Lite and BoostSRL (that's more like 15x faster).

However, on Cora this does lead to slightly degraded performance in AUC ROC, AUC PR, and conditional log likelihood (CLL); shown in the table below.

Implementation	mean AUC ROC	mean AUC PR	mean CLL	mean F1
SRLBoost	0.61	0.93	-0.27	0.96
BoostSRL-Lite	0.65	0.94	-0.29	0.96
BoostSRLv1.1.1	0.65	0.94	-0.29	0.78

[Measurements used to produce this table are available online (three_jar_comparison.csv)]

A main aim for this project is to have a faster library. We have made the faster parameters the defaults, and intend to expose them as things that users can tune in instances where slower, more effective learning is critical.

Getting Started

SRLBoost project structure still closely mirrors other implementations.

We're using Gradle to help with building and testing, targeting Java 8.

Windows Quickstart

Open Windows Terminal in Administrator mode, and use Chocolatey (or your preferred package manager) to install a Java Development Kit.

choco install openjdk

Clone and build the package.

git clone https://github.com/srlearn/SRLBoost.git
cd .\SRLBoost\
.\gradlew build

Learn with a basic data set (switching the X.Y.Z):

java -jar .\build\libs\srlboost-X.Y.Z.jar -l -train .\data\Toy-Cancer\train\ -target cancer

Query the model on the test set (again, swtiching the X.Y.Z)

java -jar .\build\libs\srlboost-X.Y.Z.jar -i -model .\data\Toy-Cancer\train\models\ -test .\data\Toy-Cancer\test\ -target cancer

MacOS / Linux

Open your terminal (MacOS: ⌘ + spacebar + "Terminal"), and use Homebrew to install a Java Development Kit. (On Linux: apt, dnf, or yum depending on your Linux flavor).

brew install openjdk

Clone and build the package.

git clone https://github.com/srlearn/SRLBoost.git
cd SRLBoost/
./gradlew build

Run a basic example (switching the X.Y.Z):

java -jar build/libs/srlboost-X.Y.Z.jar -l -train data/Toy-Cancer/train/ -target cancer

Query the model on the test set (again, swtiching the X.Y.Z)

java -jar build/libs/srlboost-X.Y.Z.jar -i -model data/Toy-Cancer/train/models/ -test data/Toy-Cancer/test/ -target cancer

	Utils.println("\n% Running command: " + command); // See http://mark.goadrich.com/programs/AUC/
	Process p = Runtime.getRuntime().exec(command);
	InputStream is = p.getInputStream();

	* TODO Write our own code OR get the source code for the JAR to compute AUC
	* @author Tushar Khot

srlearn / srlboost Goto Github PK

srlboost's Introduction

SRLBoost

It's basically BoostSRL but half the size and significantly faster.

How much faster?

Getting Started

Windows Quickstart

MacOS / Linux

srlboost's People

Contributors

Stargazers

Watchers

Forkers

srlboost's Issues

Recommend Projects

Recommend Topics

Recommend Org