GC-Delay-Study

Extracting the distribution of communication delays among nodes deployed on Google Cloud.

Purpose

Deploying nodes on the cloud raises a variety of issues, such as scalability and reliability. We therefore studied the communication delay between nodes in order to develop a more realistic way to simulate a SkipGraph without the hassle of deploying SkipNodes on the cloud.

Methodology

Deployment

To collect realistic network statistics, the Google Cloud Compute Engine service was used. Instances were deployed in multiple locations distributed all over the globe. Once the instances were deployed, SkipNodes were started and initialized on them. Instructions for launching the nodes can be found in the documentation for mass deployment.

Testing

Using the RemoteAccessTool, any node in the SkipGraph can be accessed to start the testing. Two types of pinging are available: ICMP echo requests or RMI calls. Once the testing starts, the tool traverses the graph to build a list of all the nodes in the SkipGraph. Once the list is complete, the tool makes every node in the list ping every other node X times. To spread the pinging attempts over a longer period, the tool allows the user to split the X pinging attempts into N equally sized chunks.
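The all-pairs, chunked pinging described above can be sketched as follows. This is only an illustration, not the RemoteAccessTool's actual code: the function name `build_ping_schedule` and its parameters are hypothetical, and the sketch assumes that X must divide evenly by N so that every chunk has the same size.

```python
from itertools import permutations


def build_ping_schedule(nodes, attempts, chunks):
    """Return one list per chunk; each entry is (pinger, pinged, pings_in_this_chunk)."""
    # Assumption: "equally sized chunks" means attempts is a multiple of chunks.
    if chunks <= 0 or attempts % chunks != 0:
        raise ValueError("attempts must be a positive multiple of chunks")
    per_chunk = attempts // chunks
    # Every ordered pair of distinct nodes: each node pings every other node.
    pairs = list(permutations(nodes, 2))
    return [[(src, dst, per_chunk) for src, dst in pairs] for _ in range(chunks)]


# Hypothetical example: 3 nodes, X = 6 attempts per pair, split into N = 2 chunks.
schedule = build_ping_schedule(nodes=[1, 2, 3], attempts=6, chunks=2)
for round_index, round_pings in enumerate(schedule):
    print(round_index, round_pings)
```

Each chunk could then be run at a different time, so that the X attempts for a given pair sample the network over a longer period.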

Logging

The RemoteAccessTool outputs a .CSV file that contains all the logs collected from the pinging attempts. The output has the following form:

Pinger 21

Pinged   Avg Ping    StdDev        Individual Results
1        48.3565     3.942893322   45    44    45
2        227.36625   4.502733718   225   226   227
3        36.939      4.129016711   33    33    36
4        235.16375   4.20153971    246   236   233
5        98.542      4.169860429   98    99    99

All nodes are identified by their name ID. The Pinger is the node that pings the other nodes, and each row describes one Pinged node. The 'Avg Ping' and 'StdDev' fields are, respectively, the average and the standard deviation of all the individual pinging attempts between the Pinger node and the Pinged node. The remaining fields are the results of every individual pinging attempt.
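As a rough sketch of how one such row might be read back and checked, the snippet below parses a row and recomputes its two summary fields. Several things here are assumptions rather than facts about the tool: the comma-separated column order (pinged ID, average, standard deviation, then the raw results), the helper names `parse_row` and `summarize`, the made-up sample row, and the use of the population standard deviation (the tool may use the sample form instead).

```python
import csv
import io
import statistics

# Hypothetical row in the assumed layout: pinged, avg, stddev, then raw results.
SAMPLE = "7,100.0,1.41421356,100,102,98,100\n"


def parse_row(fields):
    """Split one log row into its labeled parts."""
    pinged, avg, std, *results = fields
    return {
        "pinged": pinged,  # kept as a string, since name IDs are identifiers
        "avg": float(avg),
        "std": float(std),
        "results": [float(r) for r in results],
    }


def summarize(results):
    """Recompute the average and standard deviation from the raw results."""
    # Assumption: population standard deviation; the tool may use the sample form.
    return statistics.fmean(results), statistics.pstdev(results)


row = parse_row(next(csv.reader(io.StringIO(SAMPLE))))
avg, std = summarize(row["results"])
print(row["pinged"], avg, round(std, 4))
```

Comparing the recomputed values against the logged 'Avg Ping' and 'StdDev' fields is one way to confirm which standard deviation formula the tool uses.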

Results

After setting up a network of 32 nodes, we collected the ping logs. The logs can be found in the logs folder, along with the configuration files needed to replicate the same SkipGraph structure. The data from the logs has been translated into histograms, which can be found in the results folder. Two different types of histograms were created:

  • Histogram of the average RTT delay between every pair of nodes.
  • Histogram of all the recorded RTT delays between a single pair of nodes.

After the histograms were created, different distributions were fitted to see which ones resemble the histograms best. A normal distribution appeared to be the closest fit for both types of histograms. For the histogram of the average RTT delay between every pair of nodes, the fitted normal distribution had the following parameters:

  • μ = 159.92 ms
  • σ = 95.93 ms

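For the other type of histogram (all recorded RTT delays between a single pair of nodes), the parameters are exactly the 'Avg Ping' and 'StdDev' values given in the logs for that pair.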
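One way a simulator might use the fitted parameters is to draw each simulated RTT from the normal distribution reported above. The sketch below is a minimal illustration, not part of this repository: the function name `sample_delay_ms` is hypothetical, and redrawing negative values is an added assumption, since a normal distribution with these parameters occasionally yields a negative delay. Note that this redrawing shifts the mean of the simulated delays slightly above μ.

```python
import random

# Fitted parameters for the average RTT between every pair of nodes.
MU_MS = 159.92
SIGMA_MS = 95.93


def sample_delay_ms(rng, mu=MU_MS, sigma=SIGMA_MS):
    """Draw one simulated RTT in milliseconds, redrawing negative values."""
    while True:
        delay = rng.gauss(mu, sigma)
        # Assumption: negative delays are not meaningful, so draw again.
        if delay >= 0:
            return delay


rng = random.Random(42)  # fixed seed so that runs are repeatable
delays = [sample_delay_ms(rng) for _ in range(10_000)]
print(min(delays), sum(delays) / len(delays))
```

For a specific pair of nodes, the same function could be called with that pair's 'Avg Ping' and 'StdDev' values in place of the global parameters.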

Contributors

shamdan17, yhassanzadeh13
