This repository includes all code used to produce the results presented in [1]. The paper presents the BF skip index, a data structure based on Bloom filters [2] that can be used for answering inter-block queries on blockchains efficiently.
The BF skip index was implemented taking the Ethereum blockchain as a reference. Indeed, each Ethereum block already contains a Bloom filter within its header, in order to provide a compact representation of the events raised by the transactions within the block. The experiments proposed in the article (and implemented in this repository) serve a dual purpose.
- First, we conducted an historical analysis of events within the first 15 million blocks of the Ethereum blockchain. The goal was to evaluate the feasibility of using default
logsBloom
filters for building a BF skip index. - Then, we tested our implementation of BF skip indexes and the corresponding algorithm for finding the first occurrence of an event using event data from the CryptoKitties Core smart contract (which can be viewed here).
Due to space constraints, all data needed to reproduce the results is stored in this Zenodo repository. Before running the experiments, all downloaded files should be placed in the data
folder of this GitHub repository.
The linked Zenodo repository also includes a detailed description of the data set and its structure.
The entire project was implemented using Java and Python. Specifically, all code for constructing Bloom filters, BF skip indexes, and for the simulation of search algorithms was written in Java. Python was used for query generation and result analysis. In particular, Jupyter Notebooks were used to produce and visualize the results interactively.
The code was tested using the following software and libraries, which are required for reproducing all experiments.
- Oracle Java JDK (v. 19.0.1)
- Python (v. 3.8.9) with the following libraries:
- Jupyter Notebook (v. 6.5.2)
- Pandas (v. 1.5.2)
- NumPy (v. 1.23.5)
- Matplotlib (v. 3.6.2)
- SciPy (v. 1.9.3)
The Java implementation depends on the following libraries (already included in the lib
folder).
- Google Guava (v. 31.1)
- LevelDB JNI (v. 1.8)
The experiments have been executed on the following hardware.
-
The Java code has been tested on a machine running Ubuntu Linux, with an 8-core Intel Xeon 5218 CPU @ 2.3 GHz and 256 GB of RAM.
-
The data analysis with Jupyter Notebooks has been carried out on an Apple MacBook Air with a dual-core Intel i7 CPU @ 1.7 GHz and 8 GB of RAM.
Important: to run the Java simulations we recommend at least 150 GB of free disk space.
To build all Java classes with ease, you can use the supplied makefile. Just open a terminal in the current folder and type make
.
If the make
utility is not available in your system, typing the following command from the root directory of the repository should be sufficient.
javac -cp ".:./lib/*" src/skip/*.java -d bin/
To generate Javadoc for the source code, you can use the supplied makefile. Type make doc
and the documentation will be placed in the doc
folder of the repository. If make
is not available, type the following command:
javadoc -cp ".:./lib/*" src/skip/*.java -d doc/
This section contains the instructions needed to reproduce the experiments presented in the paper. The results of our experiments are reported in Sections 8.1, 8.2 and 8.3 of the paper. The suggested order for running the experiments aligns with the order in which they are presented in the paper.
-
Important. First, make sure you have all the downloaded files from the Zenodo repository in the
data
folder. Also, make sure that you have extracted the required compressed files as described here. -
Filter analysis. To obtain all results of Section 8.1, open the
filters.ipynb
notebook and execute all cells. This should create 4 output plots (i.e., Figure 5 in the paper) that will be placed in thepictures
folder. The plots are as follows.File Description pictures/dist_ones.pdf
Plot of Figure 5(a) pictures/temporal_ones.pdf
Plot of Figure 5(b) pictures/dist_keys.pdf
Plot of Figure 5(c) pictures/temporal_keys.pdf
Plot of Figure 5(d) -
Index analysis. To obtain all results of Section 8.2, open the
skip.ipynb
notebook and execute all cells. This should create 5 output plots and 2 CSV files containing queries for the simulation. Plots will be placed in thepictures
, while CSV files will be saved in thedata
directory.File Description pictures/bf-sparse.pdf
Plot of Figure 6(a) pictures/bf-normal.pdf
Plot of Figure 6(b) pictures/bf-saturated.pdf
Plot of Figure 6(c) pictures/cryptokitties_frequency_cumul.pdf
Plot of Figure 7(a) pictures/cryptokitties_frequency_perc.pdf
Plot of Figure 7(b) data/queries_birth.csv
Data set of queries for the Birth event data/queries_transfer.csv
Data set of queries for the Transfer event -
Query analysis. Once the query data sets have been generated, you can execute the following commands starting from the main directory of the repository. Notice that the execution of the Bash scripts may take some time, as they perform intensive computations (e.g., index construction and multiple query simulations) on a data set of 1 million Ethereum blocks.
- Run
build_filters.sh
to build the Bloom filters for the entire data set. - Run
build_index.sh
to construct the BF skip index. - Run
build_storage.sh
to construct the event storage database. - Run
test_query.sh
to launch the query simulation procedure. - Open the
query.ipynb
notebook and execute all cells. This will analyze the results of the previous steps.
These steps should create the following output files and directories. Note that the BF skip indexes of all blocks are stored in LevelDB key-value databases. The four plots created in the
pictures
constitute the content of Figure 8.File Description data/filters_8K
Binary file containing all Bloom filters data/filters_8K_m
Binary file containing all modified Bloom filters data/index_8K_7
Directory of the BF skip index LevelDB database data/index_8K_7_m
Directory of the BF skip index LevelDB database (with modified filters) data/storage
Directory of the event storage LevelDB database data/queries_birth_res.csv
Results of queries for the Birth event data/queries_birth_res_m.csv
Results of queries for the Birth event (with modified filters) data/queries_transfer_res.csv
Results of queries for the Transfer event data/queries_transfer_res_m.csv
Results of queries for the Transfer event (with modified filters) pictures/query_birth.pdf
Plot of Figure 8(a) pictures/query_transfer.pdf
Plot of Figure 8(b) pictures/query_birth_m.pdf
Plot of Figure 8(c) pictures/query_transfer_m.pdf
Plot of Figure 8(d) - Run
-
Inter-block time. The
time.ipynb
notebook contains the calculations for deriving the average inter-block time discussed in Section 8.2.
- Loporchio, Matteo et al. "Skip index: supporting efficient inter-block queries and query authentication on the blockchain." (2023).
- Bloom, Burton H. "Space/time trade-offs in hash coding with allowable errors." Communications of the ACM 13.7 (1970): 422-426.
- Wood, Gavin. "Ethereum: A secure decentralised generalised transaction ledger." Ethereum project yellow paper 151.2014 (2014): 1-32.