orientdb-pokec-benchmark

The OrientDB benchmarks run on the Pokec dataset provided by SNAP (https://snap.stanford.edu/data/soc-pokec.html). The following workloads are implemented:

  1. pokecLoad - Loads the initial data into the database.
  2. pokecRead - Reads N profiles from the database, selecting them with a Zipfian distribution.
  3. pokecUpdate - Updates N profiles in the database, selecting them with a Zipfian distribution.
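
Under a Zipfian distribution a small number of profiles receives most of the accesses. The following is a minimal sketch of that selection, not the benchmark's own generator: the helper names `build_zipf_cdf` and `sample_zipf`, the exponent `s=1.0`, and the profile count are all assumptions made for illustration.

```python
import bisect
import itertools
import random


def build_zipf_cdf(n, s=1.0):
    """Cumulative distribution over ranks 1..n, where rank k has weight 1 / k**s.

    The exponent s=1.0 is an assumed value, not taken from the benchmark.
    """
    weights = [1.0 / (rank ** s) for rank in range(1, n + 1)]
    total = sum(weights)
    cdf = list(itertools.accumulate(w / total for w in weights))
    cdf[-1] = 1.0  # guard against floating-point rounding in the last bucket
    return cdf


def sample_zipf(cdf, rng):
    """Return a 0-based profile index; low indexes are chosen most often."""
    # rng.random() is in [0, 1), so the result is always a valid index.
    return bisect.bisect_left(cdf, rng.random())


if __name__ == "__main__":
    cdf = build_zipf_cdf(1000)  # hypothetical number of profiles
    rng = random.Random(42)
    picks = [sample_zipf(cdf, rng) for _ in range(10)]
    print(picks)
```

Precomputing the cumulative distribution keeps each draw at a single binary search, which matters when millions of operations are sampled.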

At the end of each workload, a CSV file with statistics is created. The CSV file consists of the following columns:

  1. Number of operations performed.
  2. Average operation execution time in microseconds. It is calculated over the interval between the previous report and the current one, not over the whole duration of the workload.
  3. Operation throughput, likewise calculated over the interval between the previous and current reports, not over the whole duration of the workload.

The last line of the CSV file contains the average operation execution time in microseconds and the throughput for the whole duration of the benchmark, as well as the total number of operations performed during the workload.

The name of the CSV file uses the following format:

  <name of workload> <data of workload><csv suffix if any>.csv

Every workload generates a single report except the initial data load, which generates two: one for loading the profiles and one for loading the relations between them.
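
The difference between the per-interval rows and the final whole-run line can be sketched as follows. This is an illustration under assumptions, not the benchmark's reporting code: the `Snapshot` type, the `interval_stats` helper, and the choice of operations per second as the throughput unit are all hypothetical.

```python
from typing import NamedTuple


class Snapshot(NamedTuple):
    """Cumulative counters at one report time (hypothetical structure)."""
    total_ops: int            # operations completed so far
    total_op_time_us: float   # summed execution time of those operations, in microseconds
    wall_clock_s: float       # seconds elapsed since the workload started


def interval_stats(prev, curr):
    """Average operation time (us) and throughput (ops/s, assumed unit)
    for the interval between two snapshots."""
    ops = curr.total_ops - prev.total_ops
    op_time_us = curr.total_op_time_us - prev.total_op_time_us
    elapsed_s = curr.wall_clock_s - prev.wall_clock_s
    avg_us = op_time_us / ops if ops > 0 else 0.0
    throughput = ops / elapsed_s if elapsed_s > 0 else 0.0
    return avg_us, throughput


if __name__ == "__main__":
    start = Snapshot(0, 0.0, 0.0)
    first = Snapshot(1000, 50_000.0, 10.0)
    second = Snapshot(3000, 170_000.0, 15.0)
    print(interval_stats(first, second))  # per-interval row
    print(interval_stats(start, second))  # whole-run summary, as in the last line
```

Comparing a snapshot against the start of the run gives the whole-run figures, while comparing it against the previous snapshot gives the per-interval figures, so the two can differ noticeably when performance changes over time.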

The database schema consists of two indexes: one on the id of the person's profile, and one on an artificial string key that is generated during the load. This artificial key is then used as the primary key across all workloads. The index type can be chosen during the initial data load.

The following parameters are supported:

  1. embedded - Indicates whether embedded or remote storage is used for the benchmark (true or false, true by default).
  2. engineDirectory - Path to the directory where all embedded databases are stored (./build/databases by default).
  3. dbName - Name of the database used for the benchmarks (pokec by default).
  4. csvSuffix - Suffix added to every CSV report. Handy for comparing the performance of different versions of the product (empty by default).
  5. remoteURL - URL of the remote server.
  6. numThreads - Number of threads used for the workload (8 by default).
  7. indexType - Type of index used in the pokec benchmark. Possible values are 'tree', 'hash', and 'autosharded' (autosharded by default).
  8. warmUpOperations - Number of operations executed during database warm-up (2 * number of profiles by default).
  9. operations - Number of operations executed during the database workload (4 * number of profiles by default).
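
The defaults listed above can be summarized in a short sketch. The helpers `default_parameters` and `resolve_parameters` are hypothetical and only mirror the list; they are not part of the benchmark. `remoteURL` is left out because no default is stated for it.

```python
INDEX_TYPES = ("tree", "hash", "autosharded")


def default_parameters(num_profiles):
    """Defaults as described in the parameter list (illustrative only)."""
    return {
        "embedded": True,
        "engineDirectory": "./build/databases",
        "dbName": "pokec",
        "csvSuffix": "",
        "numThreads": 8,
        "indexType": "autosharded",
        "warmUpOperations": 2 * num_profiles,
        "operations": 4 * num_profiles,
    }


def resolve_parameters(num_profiles, overrides=None):
    """Merge user overrides into the defaults and validate indexType."""
    params = default_parameters(num_profiles)
    params.update(overrides or {})
    if params["indexType"] not in INDEX_TYPES:
        raise ValueError(f"unsupported indexType: {params['indexType']!r}")
    return params


if __name__ == "__main__":
    # 1000 is a placeholder profile count, not the size of the Pokec dataset.
    print(resolve_parameters(1000, {"indexType": "tree"}))
```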

To pass these parameters, use the syntax -P<param name>=<param value>. To run a workload, use the syntax gradle <workload name> <parameters>. For example:

  1. To load the initial data, you can use the command gradle pokecLoad -PcsvSuffix=\(phlogging,tree\) -PindexType=tree. It loads the data into the pokec database and uses the tree index for all database indexes. All CSV reports will have the suffix (phlogging,tree) at the end.
  2. To run the update workload, you can use the command gradle pokecUpdate -PcsvSuffix=\(phlogging,tree\). It runs the workload that updates the content of user profiles, and the CSV report will have the suffix (phlogging,tree) at the end.
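
When the benchmarks are launched from a script, the same command lines can be assembled as an argument list. The sketch below uses a hypothetical helper, `gradle_command`. It assumes `gradle` is on the PATH and that the command is run without a shell, in which case the parentheses in the suffix need no backslash escaping.

```python
def gradle_command(workload, params):
    """Build the argument list for `gradle <workload name> <parameters>`.

    Each parameter becomes -P<param name>=<param value>. Booleans are
    written in lowercase so that they read as true/false.
    """
    args = ["gradle", workload]
    for name, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        args.append(f"-P{name}={value}")
    return args


if __name__ == "__main__":
    load = gradle_command(
        "pokecLoad", {"csvSuffix": "(phlogging,tree)", "indexType": "tree"}
    )
    print(load)
```

The resulting list can be handed to `subprocess.run(load, check=True)` to start the workload.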
