acdmammoths / datasetgenerator

A port of the IBM Quest dataset generator to modern g++ and C++11

License: Apache License 2.0

C 78.73% C++ 16.87% Shell 2.63% Makefile 1.77%
datamining datamining-algorithms pattern-mining

datasetgenerator's Introduction

Dataset Generator

A port of the IBM Quest dataset generator to C++11 and modern g++.

Copyright (c) 2014-22 Matteo Riondato [email protected]

History

Once upon a time (possibly on July 22 1997), the IBM Almaden group published a synthetic transactional dataset generator. The workings of the generator were described in 'R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB '94, pages 487–499, 1994'. The original version could be downloaded from http://www.almaden.ibm.com/cs/quest/syndata.html (that page no longer exists).

After a while (around 2002-2003), the IBM version could no longer be compiled with "modern" g++ compilers. Paolo Palmerini ([email protected]) published a modified, g++-compilable version of the generator on his website (which also no longer exists). Paolo's modifications can be found in the source code by looking for comments marked with "// g++".

As of June 2014, Paolo's version does not compile with current g++ (e.g., g++-4.9). Matteo Riondato ([email protected]) took up the task of porting the code to C++11 and making it compile with current g++.

License

The code published on Paolo Palmerini's website comes without any licensing information and Matteo Riondato is not aware of the licensing of the original IBM code. If you have information about this, please let Matteo know by writing an email to [email protected]. Assuming the code was in the public domain, Matteo is licensing this code under the Apache License, Version 2.0. See the LICENSE file and the NOTICE file.

Install

Just run "make" to compile the code. The build uses whatever "g++" is on your system. You can select a different version by modifying the Makefile.

Running

TODO: add information about running

datasetgenerator's People

Contributors

kim135797531, rionda

datasetgenerator's Issues

Segfault in choose.h

Choose::num is never initialized to a given size, so subsequent accesses to its elements segfault.

Proposed solution: change line 14 from Choose(LINT n, LINT k) { to Choose(LINT n, LINT k) : num(n) {
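
The proposed fix can be illustrated with a minimal sketch. It is not the actual contents of choose.h: it assumes that num is a std::vector<LINT> and that LINT is a signed integer typedef. The constructor body and the size() and pos() helpers are hypothetical, added only to show that indexing becomes valid once num is sized in the member initializer list.

```cpp
#include <cassert>
#include <vector>

// Assumption: LINT is a signed integer type; the real typedef may differ.
typedef long LINT;

// Hypothetical, simplified stand-in for the class in choose.h.
class Choose {
public:
    // Sizing num in the member initializer list allocates n zero-initialized
    // elements before the constructor body runs. Without ": num(n)" the
    // vector would be empty and num[i] below would be out of bounds.
    Choose(LINT n, LINT /* k is unused in this sketch */) : num(n) {
        for (LINT i = 0; i < n; ++i) {
            num[i] = i;  // valid: num already holds n elements
        }
    }

    // Number of elements held in num.
    LINT size() const { return static_cast<LINT>(num.size()); }

    // Bounds-checked read access (illustrative helper).
    LINT pos(LINT i) const {
        assert(i >= 0 && i < size());
        return num[i];
    }

private:
    std::vector<LINT> num;  // assumed type of Choose::num
};
```

With this change, constructing Choose c(5, 2) gives a num with five elements, so c.pos(3) reads a valid element rather than unallocated memory.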
