Giter Site home page Giter Site logo

columnstore-innodb-monetdb's Introduction

columnstore-innodb-monetdb

Context

This project was part of a university assignment with the following goals:

  • Compare MariaDB ColumnStore, InnoDB and MonetDB in terms of performance
  • Check ease of installation
  • Check ease of use (how easy are queries to execute?)
  • Powerfulness of the language (which applications are particularly well supported?)
  • How can certain queries be executed particularly fast?
  • Extended summary of the above questions in word document (incl. source code)

Of the above, only the performance timings are included in this README - for the other topics please refer to the Word document (available in English and German).

The flight data from the U.S. Bureau of Transportation Statistics was used as the original data basis.

For this project it had to be manually downloaded for each month as well as cleaned and merged using custom python code (see ./Python code).

All necessary SQL commands to connect to the DBMS, create a table, import the data and run the queries can be found in the folder ./SQL code.

The cleaned data has been uploaded on Kaggle. Its size is 2.7GB uncompressed / 377MB compressed.

Also take a look at the sources used throughout this project in ./sources.md.

Performance

The following timings were taken on Windows 10 running a virtual machine via VMWare Workstation 16 running CentOS 8 with 4 CPU cores (Intel i7 9700K), 8GB DDR4 RAM as well as 50GB fixed storage (Samsung 970 EVO NVMe M.2).

CSV Import timings

Engine 36M records 5.6M records 500K records Method
Columnstore 00m 17s 4.2s 1.1s Columnstore cpimport
MonetDB 01m 07s 15s 1s MonetDB CSV Bulk Loads
InnoDB 16m 40s 2m 37s 13s LOAD DATA INFILE

Query timings (warm)

All queries were performed multiple times in a row (except for InnoDB) to make most use of the runtime optimization strategies that the DBMS employ.

Query statement Description MonetDB Columnstore InnoDB
select * from <table> where <column> = <value> limit 10; select where 0.06s 0.1s 0.02s
select count(*) from <table> where <column> = <value>; count where 0.04s 0.6s 0.02s
select count(*) from <table>; simple count 0.003s 0.3s 5m 8s
select avg(DISTANCE) as AVERAGE from <table> aggregation 0.03s 0.6s 5m 38s
select * from <table> where CRS_DEP_TIME > '10:00' and CRS_ARR_TIME < '15:00' and AIR_TIME > 300 order by AIR_TIME desc limit 10; complex where 0.1s 0.8s 6m 6s
select <columns>, FLOOR(DEP_DELAY_NEW/60) as DELAY_IN_HOURS from <table> order by DELAY_IN_HOURS desc limit 10; TOP10 delayed 0.3s 8s 6m 33s

columnstore-innodb-monetdb's People

Contributors

bingecode avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.