pinax-network / subtivity-substreams

Subtivity Substreams - powered by Pinax

Home Page: https://subtivity-substreams.vercel.app

License: MIT License

Rust 80.25% Makefile 19.75%
data-science substreams thegraph

subtivity-substreams's Introduction

Subtivity Substreams

GitHub Workflow Status

Block-level activity for each supported chain, powered by Pinax.

Data

  • Transaction Count
  • Action Count (Events)
  • UAW (Unique Active Wallets)

Chains

  • Ethereum
    • Polygon
    • Binance Smart Chain
    • Goerli
    • Sepolia
    • Rinkeby
    • Mumbai
  • Antelope
    • EOS
    • WAX
    • Telos
  • Near
  • Starknet
  • Aptos

Map Outputs

map_block_stats

{
  "transactionTraces": "213",
  "traceCalls": "1093",
  "uaw": [
    "4239a4e3a00a5282b6df7c19bd16cbf761b2c21f",
    "b18ccf69940177f3ec62920ddb2a08ef7cb16e8f",
    "603288a144fabf14a6c9806e9baadc9dbc1e9fd6",
    "0555262d2f4889522c3d7c0762d3c92e2ce817d1",
    "dc7bda95b512f7b9feb17566b80fa6bca5bb1693",
    "5c3efbafc55565d66312235428daf4988a4e41dc",
    ...
  ]
}
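The stats above can be derived with a simple fold over the block's transaction traces. The sketch below is an illustration, not the actual module code; `Trace` is a stand-in for the chain-specific trace type, and the field names mirror the JSON output.

```rust
use std::collections::HashSet;

// Stand-in for a chain-specific transaction trace (assumption, for illustration).
struct Trace {
    from: String,    // sender address (hex, no 0x prefix)
    call_count: u64, // number of internal calls / actions in this trace
}

struct BlockStats {
    transaction_traces: u64,
    trace_calls: u64,
    uaw: Vec<String>, // unique active wallets
}

// Hypothetical map handler: count traces and calls, dedupe wallets.
fn map_block_stats(traces: &[Trace]) -> BlockStats {
    let mut wallets: HashSet<String> = HashSet::new();
    let mut trace_calls = 0;
    for t in traces {
        wallets.insert(t.from.clone());
        trace_calls += t.call_count;
    }
    let mut uaw: Vec<String> = wallets.into_iter().collect();
    uaw.sort(); // deterministic output regardless of trace order
    BlockStats {
        transaction_traces: traces.len() as u64,
        trace_calls,
        uaw,
    }
}
```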

Quickstart

$ make
$ make run
$ make gui

Graph

graph TD;
  map_block_stats[map: map_block_stats]
  sf.ethereum.type.v2.Block[source: sf.ethereum.type.v2.Block] --> map_block_stats
  sf.antelope.type.v1.Block[source: sf.antelope.type.v1.Block] --> map_block_stats
  sf.near.type.v1.Block[source: sf.near.type.v1.Block] --> map_block_stats
  zklend.starknet.type.v1.Block[source: zklend.starknet.type.v1.Block] --> map_block_stats
  graph_out[map: graph_out]
  map_block_stats --> graph_out

Modules

Package name: subtivity_ethereum
Version: v0.4.0
Doc: Subtivity for Ethereum
Modules:
----
Name: map_block_stats
Initial block: 0
Kind: map
Output Type: proto:subtivity.v1.BlockStats
Hash: 93725ab06a11557d2f157350311fb73d3ac7437e

Name: graph_out
Initial block: 0
Kind: map
Output Type: proto:sf.substreams.sink.entity.v1.EntityChanges
Hash: e7d70cf4655838fa71eb62869bb34356714da241

subtivity-substreams's People

Contributors

ali-sultani, chamorin, deniscarriere


subtivity-substreams's Issues

Include `graph_out` + `schema.sql` to the Subtivity Substreams


Scope

  • include graph_out map module
  • include schema.sql to support Clickhouse Sink
  • Drop extras (no longer required)

We could DROP the:

  • Grafana
  • Prometheus
  • KV
  • store_daw
  • sink

Our graph_out + ClickHouse sink setup would most likely cover all of that.

The schema.sql is straightforward:

transaction_traces: number
trace_calls: number
uaw: number[]
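A ClickHouse table matching those three fields could look like the sketch below; the table name, column types, and engine choice are assumptions, not the shipped schema.sql.

```sql
-- Hypothetical ClickHouse schema for the Subtivity sink (names assumed).
CREATE TABLE IF NOT EXISTS block_stats
(
    block_num          UInt64,
    transaction_traces UInt64,
    trace_calls        UInt64,
    uaw                Array(String)
)
ENGINE = MergeTree
ORDER BY block_num;
```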

EPIC - Design Concepts

UX Design for the following requirements

Involved Parties

Requirements

  • Current Stats
    • Current TPS
    • Max TPS
    • Block Number
  • Historical activity (Transactions)
    • 7 day
    • 30 day
    • 90 day
    • 1 year
    • All-time
  • EVM Chains Supported
    • ETH
    • Polygon
    • BNB Smart Chain
    • Antelope (Later)
    • ...others
  • Search by:
    • Chain
    • Smart Contract Address

Substream Data requirements

  • Transactions
    • Unique transaction IDs
    • Actions (Events)
  • UAW (Unique Active Wallets) daily aggregate

Chart comparison

  • By Chain Selection (ex: ETH vs. Polygon)
  • By Contract (ex: Uniswap vs. PancakeSwap)

The contract schema would be something like:

  • eth:0x1f9840a85d5aF5bf1D1762F925BDADdC4201F984
  • bsc:0x0e09fabb73bd3ade0a17ecc321fd13a19e81ce82
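A parser for that `<chain>:<address>` identifier could be as small as the sketch below; the format follows the two examples above, while the function name and the EVM-only address validation are our assumptions.

```rust
// Hypothetical parser for the proposed `<chain>:<address>` contract id.
// Returns (chain, address), both lowercased for canonical comparison.
fn parse_contract_id(id: &str) -> Option<(String, String)> {
    let (chain, address) = id.split_once(':')?;
    // Assumption: EVM-style 0x-prefixed, 40-hex-char addresses only.
    let hex = address.strip_prefix("0x")?;
    if hex.len() != 40 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    Some((chain.to_lowercase(), address.to_lowercase()))
}
```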

Design examples

https://dappradar.com/eos/other/pomelo


https://etherscan.io/


Thoughts on Substream structuring

Some thoughts on how we could structure the Substreams; open for discussion.

The parts I see:

  1. Chain specific code
  2. Chain agnostic code
  3. Database output

1. Chain specific code

This part should transform a chain-specific block such as eth::Block into a chain-agnostic BlockStats. Proposal for BlockStats:

message BlockStats {
  string chain = 1;                         // the blockchain we are running on
  int64 block_num = 2;                      // block number for which we hold the block stats 
  google.protobuf.Timestamp timestamp = 3;  // timestamp of the block
  string block_id = 4;                      // block id 

  int64 transaction_count = 5;       // number of successfully executed transactions in this block
  int64 event_count = 6;             // number of successfully executed events in this block
  int64 transactions_per_second = 7; // transaction_count / block_time
  int64 events_per_second = 8;       // event_count / block_time

  bool is_first_block = 9;        // true if this is the first block of the chain
  bool is_first_day_block = 10;   // true if this is the first block of the day
  bool is_first_hour_block = 11;  // true if this is the first block of the hour

  repeated string accounts = 12; // list of unique accounts/wallets used in this block
}

This allows all subsequent maps to be completely chain agnostic. To produce the full block stats we likely need two maps and one store per chain. It could look like this:

  • map_partial_blocks(<chain>::Block) -> BlockStats - parses <chain>::Block and emits a BlockStats with all fields it can fill (fields 1-6 and 12)
  • store_partial_block_stats(BlockStats) - stores every partial BlockStats from above, keyed by block_num
  • map_full_block_stats(BlockStats, store_partial_block_stats) - retrieves the partial BlockStats, looks up the previous block's BlockStats from store_partial_block_stats, and fills in the remaining fields. Having both block x and block x-1, it knows how much time has passed since the last block, so it can calculate transactions/events per second and determine whether this block is the first of the day/hour.
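The map_full_block_stats step could be sketched as below. Field names follow the BlockStats proto above; the integer-division day/hour bucketing and the function name are our assumptions, not the actual implementation.

```rust
// Subset of the BlockStats proto fields relevant to this step.
#[derive(Default)]
struct BlockStats {
    block_num: i64,
    timestamp: i64, // unix seconds
    transaction_count: i64,
    event_count: i64,
    transactions_per_second: i64,
    events_per_second: i64,
    is_first_block: bool,
    is_first_day_block: bool,
    is_first_hour_block: bool,
}

// Hypothetical `map_full_block_stats` logic: fill the derived fields
// using the previous block's stats looked up from the store.
fn fill_block_stats(mut cur: BlockStats, prev: Option<&BlockStats>) -> BlockStats {
    match prev {
        None => {
            // No predecessor in the store: this is the chain's first block.
            cur.is_first_block = true;
            cur.is_first_day_block = true;
            cur.is_first_hour_block = true;
        }
        Some(p) => {
            // Seconds elapsed since the previous block (clamped to avoid /0).
            let block_time = (cur.timestamp - p.timestamp).max(1);
            cur.transactions_per_second = cur.transaction_count / block_time;
            cur.events_per_second = cur.event_count / block_time;
            // First block of the day/hour if the bucket index changed.
            cur.is_first_day_block = cur.timestamp / 86_400 != p.timestamp / 86_400;
            cur.is_first_hour_block = cur.timestamp / 3_600 != p.timestamp / 3_600;
        }
    }
    cur
}
```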

2. Chain agnostic code

The accumulation and max tps/aps stores can now be completely chain agnostic and don't need to handle any logic around block times. These should be common functions we can reuse across all Substreams; they are responsible for aggregating daily/hourly transaction/event counts as well as the maximum transactions/events per second.
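The chain-agnostic aggregation described above can be sketched as plain structs; in a real Substream these would be store modules, and all names here are illustrative assumptions.

```rust
use std::collections::HashMap;

// Hypothetical chain-agnostic aggregates: running max tps plus a
// per-day transaction counter, keyed by day bucket (timestamp / 86400).
#[derive(Default)]
struct Aggregates {
    max_tps: i64,
    daily_transactions: HashMap<i64, i64>, // day bucket -> tx count
}

impl Aggregates {
    // Fold one block's stats into the aggregates.
    fn apply(&mut self, timestamp: i64, transaction_count: i64, tps: i64) {
        if tps > self.max_tps {
            self.max_tps = tps; // equivalent of a `max` store policy
        }
        *self
            .daily_transactions
            .entry(timestamp / 86_400)
            .or_insert(0) += transaction_count; // equivalent of an `add` store policy
    }
}
```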

3. Database output

We might want to consider putting this into its own Substream, so that if we change the database schema or add more outputs we don't need to re-sync the full Substream (because the module hash would change).

This Substream would basically contain the db_out module, which is responsible for creating database_changes whenever:

  • the max stores have changed (we only need to write when there are changes; use Deltas<DeltaInt64> here)
  • a BlockStats arrives with is_first_day_block / is_first_hour_block set to true (we can then emit the accumulated stats from the previous bucket to the accumulated tables)
  • any block arrives: update the last_block table (if is_first_block is true we create a new entry, otherwise we update)
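The db_out dispatch described by those bullets could be sketched as below; the flag struct and table names are assumptions for illustration, not the actual module.

```rust
// Hypothetical per-block signals available to db_out.
struct Flags {
    is_first_block: bool,
    is_first_day_block: bool,
    is_first_hour_block: bool,
    max_store_changed: bool, // derived from Deltas on the max stores
}

// Decide which tables to write for this block, mirroring the bullets above.
fn tables_to_write(f: &Flags) -> Vec<&'static str> {
    let mut out = Vec::new();
    if f.max_store_changed {
        out.push("max_stats"); // write only on an actual delta
    }
    if f.is_first_day_block {
        out.push("daily_stats"); // flush the previous day's bucket
    }
    if f.is_first_hour_block {
        out.push("hourly_stats"); // flush the previous hour's bucket
    }
    // last_block is touched on every block: insert on genesis, else update.
    out.push(if f.is_first_block { "last_block:insert" } else { "last_block:update" });
    out
}
```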

Which parts to combine in a Substream

We could have:

  • 1, 2 and 3 all in one Substream (each chain gets its own complete Substream; fewest Substreams; common code from 2 can be shared through a library)
  • 1, 2 and 3 as separate Substreams (more Substreams to maintain, but the most caching potential and the fewest re-syncs needed on changes)
  • 1 and 2 combined, 3 separate (less caching potential and all chains must re-sync if one changes, but avoids a re-sync when only the database output function changes)
  • 1 separate, 2 and 3 combined (avoids a re-sync when one chain's specific code changes, but still requires one if we change the database output)

EPIC - Infrastructure

Services

  • PostgreSQL
  • Hasura
  • Substream to Postgres sink

Relevant decisions

  • PostgreSQL cluster vs. multiple concurrent instances (we only have read access to the data, so instead of a PostgreSQL cluster we could just set up three sinks that write concurrently into three different PostgreSQL instances serving Hasura)?

Tasks

  • PostgreSQL deployment
  • Hasura deployment
  • PostgreSQL sink deployment
  • PostgreSQL monitoring
  • Hasura monitoring
  • PostgreSQL sink monitoring
  • Load balancer for Hasura requests
  • Public endpoint for Hasura requests
  • Rate limiting GraphQL requests
  • Proper user and permission setup on PostgreSQL
