Giter Site home page Giter Site logo

pam's Introduction

PAM

PAM (Parallel Augmented Maps) is a parallel C++ library implementing the interface for sequence, ordered sets, ordered maps and augmented maps [2]. It uses the underlying balanced binary tree structure using join-based algorithms [1,2]. PAM is highly-parallelized, safe for concurrency, theoretically work-efficient, and supporting efficient GC. PAM supports four balancing schemes including AVL trees, red-black trees, treaps and weight-balanced trees (default).

PAM also implements another implementation of augmented maps, which is the prefix structure [3]. It is used for parallel sweepline algorithms for computational geometry problems.

This repository also contains examples of using PAM for a variety of applications, including range sum, 1D stabbing queries, 2D range queries, 2D segment queries, 2D rectangle queries, inverted index searching, implementing an HTAP database benchmark combining TPC-H queries and TPC-C transactions [5], version maintenance in single-writer transactional memory, etc.

Interface:

To define an augmented map using PAM, user need to specify the parameters including type names and (static) functions in the entry structure ``entry''.

  • typename key_t: the key type (K),
  • function comp: K x K -> bool: the comparison function on K (<_K)
  • typename val_t: the value type (V),
  • typename aug_t: the augmented value type (A),
  • function from_entry: K x V -> A: the base function (g)
  • function combine: A x A -> A: the combine function (f)
  • function get_empty: empty -> A: the identity of f (I)

Then an augmented map is defined with C++ template as

augmented_map<entry>.

Note that a plain ordered map is defined as an augmented map with no augmentation (i.e., it only has K, <_K and V in its entry) and a plain ordered set is similarly defined as an augmented map with no augmentation and no value type.

pam_map<entry>
pam_set<entry>

Here is an example of defining an augmented map "m" that has integer keys and values and is augmented with value sums (similar as the augmented sum example in our paper [1]):

struct entry {
  using key_t = int;
  using val_t = int;
  using aug_t = int;
  static bool comp(key_t a, key_t b) { 
    return a < b;}
  static aug_t get_empty() { return 0;}
  static aug_t from_entry(key_t k, val_t v) { 
    return v;}
  static aug_t combine(aug_t a, aug_t b) { 
    return a+b;}
};
augmented_map<entry> m;

Another quick example can be found in [2], which shows how to implement an interval tree using the PAM interface.

Hardware dependencies

Any modern (2010+) x86-based multicore machines. Relies on 128-bit CMPXCHG (requires -mcx16 compiler flag) but does not need hardware transactional memory (TSX). Most examples given in our scripts require 64GB memory, but range_query requires 256GB memory and aug_sum on the large input requires 1TB memory. All the examples can take smaller input size by setting command line arguments.

Software dependencies

PAM requires g++ 5.4.0 or later versions supporting the Cilk Plus extensions. The scripts that we provide in the repository use "numactl" for better performance. All tests can also run directly without "numactl".

We use python to write scripts to organize all results and compute speedup. It is not required to run tests.

Applications:

In the sub-directories, we show code for a number of applications. They are all concise based on PAM, which need about 100 lines of code each. All tests include parallel and sequential running times. The sequential versions are the algorithms running directly on one thread, and the parallel versions use all threads on the machine using "numactl -i all".

In each of the directories there is a separated readme, makefile and a script to run the timings for the corresponding application.

We recommend to use numactl -i all on all parallel tests.

General tests and range sum

In directory tests/. More details about the algorithms and results can be found in our paper [2]. The tests include commonly-used functions on sequences, ordered sets, ordered maps and augmented maps.

1D stabbing query

In directory interval/. More details about the algorithm and results can be found in our paper [2]. Our version implements a interval tree structure for stabbing queries.

2D range query

In directory range_query/. More details about the algorithm and results can be found in our paper [2,3]. We implemented both a range tree structure and a sweepline algorithm for 2D range queries.

2D segment query

In directory segment/. More details about the algorithm and results can be found in our paper [3]. We implemented both a segment tree structure and a sweepline algorithm for 2D segment queries.

2D rectangle query

In directory rectangle/. More details about the algorithm and results can be found in our paper [2,3]. We implemented both a tree structure and a sweepline algorithm for 2D rectangle queries (in directory common/).

Inverted index searching

In directory index/. More details about the algorithm and results can be found in our paper [2]. Our code builds an inverted index of a dataset of wikipedia documents.

Version maintenance for single-writer transactional memories

In directly concurrency/. More details about the algorithm and results can be found in our paper [4]. We implemented both a simple lock-free algorithm and a more involved wait-free algorithm.

An HTAP database system

To be updated

Reference

[1] Yihan Sun, Daniel Ferizovic, and Guy E. Blelloch. Parallel Ordered Sets Using Just Join. SPAA 2016.

[2] Yihan Sun, Daniel Ferizovic, and Guy E. Blelloch. PAM: Parallel Augmented Maps. PPoPP 2018.

[3] Yihan Sun and Guy Blelloch. Parallel Range, Segment and Rectangle Queries with Augmented Maps. ArXiv:1803.08621.

[4] Naama Ben-David, Guy E. Blelloch, Yihan Sun and Yuanhao Wei. Efficient Single Writer Concurrency. ArXiv:1803.08617.

[5] TPC benchmarks. http://www.tpc.org/

pam's People

Contributors

syhlalala avatar gblelloch avatar jmftrindade avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.