AllConcur

Algorithm for LeaderLess CONCURrent atomic broadcast

Brief

An implementation of the AllConcur algorithm over both TCP/IP and InfiniBand Verbs [1].

Building benchmarks

Prerequisites

libev (version >= 4.15)
gmplib (version >= 6.1.0)
mpfrlib (version >= 3.1.3)

Generate Makefile

./configure --help
usage  : $0 [options]  
options: [--prefix=DIR]                # Installation directory  
         [--with-ev=DIR]               # ev installation directory  
         [--with-mpfr=DIR]             # mpfr installation directory  
         [--with-gmp=DIR]              # gmp installation directory  
         [--enable-ibv]                # enable IBV network module  
         [--with-ibv=DIR]              # IB verbs installation directory  
         [--with-rdmacm=DIR]           # RDMA-CM installation directory
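
As a rough sketch of how the options above combine, the helper below only prints a configure command line; it does not run it. The function name `configure_cmd` and the `/usr/local` prefixes are assumptions for illustration, not part of the project.

```shell
# Build a ./configure command line from the three library prefixes.
# Arguments: <ev-dir> <gmp-dir> <mpfr-dir>; pass "ibv" as a fourth argument
# to also enable the InfiniBand Verbs network module.
# (configure_cmd is a hypothetical helper, not shipped with AllConcur.)
configure_cmd() {
    cmd="./configure --with-ev=$1 --with-gmp=$2 --with-mpfr=$3"
    if [ "${4:-}" = "ibv" ]; then
        cmd="$cmd --enable-ibv"
    fi
    echo "$cmd"
}

# /usr/local is a placeholder prefix, not a path required by the project.
configure_cmd /usr/local /usr/local /usr/local
configure_cmd /usr/local /usr/local /usr/local ibv
```

To run the generated command from the source directory, pipe it to a shell, e.g., `configure_cmd /usr/local /usr/local /usr/local | sh`.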

Build

make benchmark

Running benchmarks

Directly start an AllConcur server

$ ./bin/allconcur -h
Usage: ./bin/allconcur [OPTIONS]
OPTIONS:
	-c <config-file>         configuration file
	[--tcp | --ibv]          server-interconnection type (default=ibv)
	[-n <size>]              number of nodes (default=6)
	[-r <k-nines>]           required reliability (default=6)
	[--join]                 join a group of servers
	   -s <hostname>         [join] server's hostname
	   -p <port>             [join] server's port
	[--timer]                get runtime estimates (default no)
	[--latency]              run latency benchmark
	[--throughput]           run peak throughput benchmark
	[--failure]              run failure benchmark

By default, each AllConcur server periodically generates requests; check the config file (e.g., allconcur.cfg) for the request size (i.e., req_size) and the request-generation period (i.e., req_period). Other benchmarks:

latency -- only one server A-broadcasts a single request;
throughput -- each server batches multiple requests into a single message, which is then A-broadcast; max_batching (see the config file) gives the maximum batching factor;
failure -- fault-tolerant agreement.
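
The flags from the usage text can be combined as in the sketch below. It prints the commands by default rather than launching anything. The `run`, `start_latency`, and `join_group` helpers, the hostname `node01`, and the port `55000` are placeholders chosen for illustration; the contents of the config file are not covered here.

```shell
# Dry-run helper: prints the command instead of executing it.
# Set DRY_RUN=0 to actually launch the server (requires a built ./bin/allconcur).
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Latency benchmark over TCP with 8 nodes, using the sample config file
# named in the text (allconcur.cfg).
start_latency() {
    run ./bin/allconcur -c allconcur.cfg --tcp -n 8 --latency
}

# Join an existing group; "node01" and 55000 are placeholder values.
join_group() {
    run ./bin/allconcur -c allconcur.cfg --tcp --join -s node01 -p 55000
}

start_latency
join_group
```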

Start an AllConcur server through MPI

The benchmarks can also be started through the following MPI launcher, which automatically creates the configuration file:

$ ./bin/allconcur_launcher -h
Usage: ./bin/allconcur_launcher [OPTIONS]
OPTIONS:
	-n <size>                    number of nodes
	[--tcp | --ibv]              server-interconnection type (default=ibv)
	[-r <k-nines>]               required reliability (default=6)
	[-p <port>]                  inter-server port (default=55000)
	[-s <size>]                  request size in bytes (default=64)
	[--timer]                    timer
	[--treq]                     run request rate benchmark
	[--ctreq]                    run constant request rate benchmark
	   [-P <req_period>]         [treq|ctreq] period of generating request in ms (default=1)
	[--latency]                  run latency benchmark
	[--throughput]               run peak throughput benchmark (default)
	   [-b <batching factor>]    [throughput] max batching factor (default=512)
	[--failure]                  run failure benchmark

For example, the following command starts the throughput benchmark on 8 servers with 1024-byte requests and a maximum batching factor of 512:

mpirun -np 8 ./bin/allconcur_launcher -n 8 -s 1024 --throughput -b 512
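
To compare several request sizes, the launcher command can be generated in a loop. The sketch below only prints the commands; `sweep_throughput` is a hypothetical helper, and the use of `--tcp` and the chosen sizes are assumptions for illustration.

```shell
# Print one launcher command per request size.
# Arguments: <nodes> <size> [<size> ...]
# (sweep_throughput is a hypothetical helper, not shipped with AllConcur.)
sweep_throughput() {
    nodes="$1"
    shift
    for size in "$@"; do
        echo "mpirun -np $nodes ./bin/allconcur_launcher -n $nodes -s $size --tcp --throughput -b 512"
    done
}

# Example: 8 servers, request sizes of 64, 256, and 1024 bytes.
sweep_throughput 8 64 256 1024
```

Piping the output to `sh` would run the benchmarks one after another.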

Folder overview

./bin -- binaries
./lib -- libraries
./src -- source files
- allconcur.c -- main function
- ac_init.c -- initialization
- ac_algorithm.c -- the AllConcur algorithm
- ac_fd.c -- failure detector
- messages.c -- message queue (for pending and premature messages)
- request_pool.c -- request pool (circular buffer)
- g_array.c -- tracking digraphs
- kvs_ht.c -- key-value store (hash-table)
- ctrl_sm.c -- control state machine (for consistency of control data)
- client_listener.c -- listening for incoming client requests
./include -- header files
./loggp -- code for computing LogGP params (based on Netgauge [2])
./net -- network module
- ibv -- InfiniBand verbs
- tcp -- TCP sockets
./test -- scripts
./utils
- ft-digraph -- fault tolerant digraph library
- rbtree -- red-black tree implementation (Linux)
./tla -- TLA+ [3] specification and proofs

Avoid 200ms delay on TCP ([reduce TCP RTO on Linux](http://unix.stackexchange.com/questions/210367/changing-the-tcp-rto-value-in-linux))

By default, the minimum retransmission timeout (RTO) on Linux is about 200 ms. To see the RTO values of open TCP connections, use the following command:

ss -i

The minimum RTO can be changed only per route, for example:

$ ip route
10.0.2.0/24 dev eth0  proto kernel  scope link  src 10.0.2.15 
169.254.0.0/16 dev eth0  scope link  metric 1002 
default via 10.0.2.2 dev eth0
$ ip route change default via 10.0.2.2 dev eth0 rto_min 5ms
$ ip route
10.0.2.0/24 dev eth0  proto kernel  scope link  src 10.0.2.15 
169.254.0.0/16 dev eth0  scope link  metric 1002 
default via 10.0.2.2 dev eth0  rto_min lock 5ms
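
To check whether the new minimum has taken effect, the `rto:` field can be pulled out of the `ss -i` output. This is a minimal sketch that assumes the RTO is reported as a whitespace-separated `rto:<value>` token; `extract_rto` is a hypothetical helper name.

```shell
# Read ss output on stdin and print the value of every "rto:" field,
# one per line (values are in milliseconds as reported by ss).
# (extract_rto is a hypothetical helper, not shipped with AllConcur.)
extract_rto() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^rto:/) print substr($i, 5) }'
}

# Usage (summarizes the RTO values of open TCP connections):
#   ss -ti | extract_rto | sort -n | uniq -c
```

Connections opened before the route change may keep reporting the old value, so it is safest to inspect connections established afterwards.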

References

[1] M. Poke, T. Hoefler, and C. W. Glass: AllConcur: Leaderless Concurrent Atomic Broadcast (Extended Version), CoRR, abs/1608.05866, 2016.

[2] T. Hoefler, A. Lichei, and W. Rehm: Low-Overhead LogGP Parameter Assessment for Modern Interconnection Networks, IEEE IPDPS, 2007.

[3] L. Lamport: Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002.
