Giter Site home page Giter Site logo

mloporchio / ethereumskipindex Goto Github PK

View Code? Open in Web Editor NEW
0.0 1.0 0.0 4.13 MB

Implementation of BF skip indexes for the Ethereum blockchain.

Makefile 0.05% Java 8.27% Shell 0.58% Jupyter Notebook 91.10%
blockchain bloom-filter cryptography data-structures ethereum indexing

ethereumskipindex's Introduction

EthereumSkipIndex

Description

This repository includes all code used to produce the results presented in [1]. The paper presents the BF skip index, a data structure based on Bloom filters [2] that can be used for answering inter-block queries on blockchains efficiently.

Overview

The BF skip index was implemented taking the Ethereum blockchain as a reference. Indeed, each Ethereum block already contains a Bloom filter within its header, in order to provide a compact representation of the events raised by the transactions within the block. The experiments proposed in the article (and implemented in this repository) serve a dual purpose.

  1. First, we conducted an historical analysis of events within the first 15 million blocks of the Ethereum blockchain. The goal was to evaluate the feasibility of using default logsBloom filters for building a BF skip index.
  2. Then, we tested our implementation of BF skip indexes and the corresponding algorithm for finding the first occurrence of an event using event data from the CryptoKitties Core smart contract (which can be viewed here).

Data

Due to space constraints, all data needed to reproduce the results is stored in this Zenodo repository. Before running the experiments, all downloaded files should be placed in the data folder of this GitHub repository.

The linked Zenodo repository also includes a detailed description of the data set and its structure.

Documentation

The entire project was implemented using Java and Python. Specifically, all code for constructing Bloom filters, BF skip indexes, and for the simulation of search algorithms was written in Java. Python was used for query generation and result analysis. In particular, Jupyter Notebooks were used to produce and visualize the results interactively.

Requirements

The code was tested using the following software and libraries, which are required for reproducing all experiments.

  • Oracle Java JDK (v. 19.0.1)
  • Python (v. 3.8.9) with the following libraries:
    • Jupyter Notebook (v. 6.5.2)
    • Pandas (v. 1.5.2)
    • NumPy (v. 1.23.5)
    • Matplotlib (v. 3.6.2)
    • SciPy (v. 1.9.3)

The Java implementation depends on the following libraries (already included in the lib folder).

  • Google Guava (v. 31.1)
  • LevelDB JNI (v. 1.8)

Hardware

The experiments have been executed on the following hardware.

  • The Java code has been tested on a machine running Ubuntu Linux, with an 8-core Intel Xeon 5218 CPU @ 2.3 GHz and 256 GB of RAM.

  • The data analysis with Jupyter Notebooks has been carried out on an Apple MacBook Air with a dual-core Intel i7 CPU @ 1.7 GHz and 8 GB of RAM.

Important: to run the Java simulations we recommend at least 150 GB of free disk space.

How to compile

To build all Java classes with ease, you can use the supplied makefile. Just open a terminal in the current folder and type make.

If the make utility is not available in your system, typing the following command from the root directory of the repository should be sufficient.

javac -cp ".:./lib/*" src/skip/*.java -d bin/

Javadoc

To generate Javadoc for the source code, you can use the supplied makefile. Type make doc and the documentation will be placed in the doc folder of the repository. If make is not available, type the following command:

javadoc -cp ".:./lib/*" src/skip/*.java -d doc/

How to run

This section contains the instructions needed to reproduce the experiments presented in the paper. The results of our experiments are reported in Sections 8.1, 8.2 and 8.3 of the paper. The suggested order for running the experiments aligns with the order in which they are presented in the paper.

  1. Important. First, make sure you have all the downloaded files from the Zenodo repository in the data folder. Also, make sure that you have extracted the required compressed files as described here.

  2. Filter analysis. To obtain all results of Section 8.1, open the filters.ipynb notebook and execute all cells. This should create 4 output plots (i.e., Figure 5 in the paper) that will be placed in the pictures folder. The plots are as follows.

    File Description
    pictures/dist_ones.pdf Plot of Figure 5(a)
    pictures/temporal_ones.pdf Plot of Figure 5(b)
    pictures/dist_keys.pdf Plot of Figure 5(c)
    pictures/temporal_keys.pdf Plot of Figure 5(d)
  3. Index analysis. To obtain all results of Section 8.2, open the skip.ipynb notebook and execute all cells. This should create 5 output plots and 2 CSV files containing queries for the simulation. Plots will be placed in the pictures, while CSV files will be saved in the data directory.

    File Description
    pictures/bf-sparse.pdf Plot of Figure 6(a)
    pictures/bf-normal.pdf Plot of Figure 6(b)
    pictures/bf-saturated.pdf Plot of Figure 6(c)
    pictures/cryptokitties_frequency_cumul.pdf Plot of Figure 7(a)
    pictures/cryptokitties_frequency_perc.pdf Plot of Figure 7(b)
    data/queries_birth.csv Data set of queries for the Birth event
    data/queries_transfer.csv Data set of queries for the Transfer event
  4. Query analysis. Once the query data sets have been generated, you can execute the following commands starting from the main directory of the repository. Notice that the execution of the Bash scripts may take some time, as they perform intensive computations (e.g., index construction and multiple query simulations) on a data set of 1 million Ethereum blocks.

    1. Run build_filters.sh to build the Bloom filters for the entire data set.
    2. Run build_index.sh to construct the BF skip index.
    3. Run build_storage.sh to construct the event storage database.
    4. Run test_query.sh to launch the query simulation procedure.
    5. Open the query.ipynb notebook and execute all cells. This will analyze the results of the previous steps.

    These steps should create the following output files and directories. Note that the BF skip indexes of all blocks are stored in LevelDB key-value databases. The four plots created in the pictures constitute the content of Figure 8.

    File Description
    data/filters_8K Binary file containing all Bloom filters
    data/filters_8K_m Binary file containing all modified Bloom filters
    data/index_8K_7 Directory of the BF skip index LevelDB database
    data/index_8K_7_m Directory of the BF skip index LevelDB database (with modified filters)
    data/storage Directory of the event storage LevelDB database
    data/queries_birth_res.csv Results of queries for the Birth event
    data/queries_birth_res_m.csv Results of queries for the Birth event (with modified filters)
    data/queries_transfer_res.csv Results of queries for the Transfer event
    data/queries_transfer_res_m.csv Results of queries for the Transfer event (with modified filters)
    pictures/query_birth.pdf Plot of Figure 8(a)
    pictures/query_transfer.pdf Plot of Figure 8(b)
    pictures/query_birth_m.pdf Plot of Figure 8(c)
    pictures/query_transfer_m.pdf Plot of Figure 8(d)
  5. Inter-block time. The time.ipynb notebook contains the calculations for deriving the average inter-block time discussed in Section 8.2.

References

  1. Loporchio, Matteo et al. "Skip index: supporting efficient inter-block queries and query authentication on the blockchain." (2023).
  2. Bloom, Burton H. "Space/time trade-offs in hash coding with allowable errors." Communications of the ACM 13.7 (1970): 422-426.
  3. Wood, Gavin. "Ethereum: A secure decentralised generalised transaction ledger." Ethereum project yellow paper 151.2014 (2014): 1-32.

ethereumskipindex's People

Contributors

mloporchio avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.