paritytech / substrate-telemetry

Polkadot Telemetry service

License: GNU General Public License v3.0

TypeScript 29.85% HTML 0.10% CSS 4.74% Dockerfile 0.31% Rust 64.28% Shell 0.12% JavaScript 0.60%

substrate-telemetry's Introduction


Polkadot Telemetry

Overview

This repository contains the backend ingestion server for Substrate Telemetry (which itself comprises two binaries, telemetry_shard and telemetry_core), as well as the frontend you typically see running at telemetry.polkadot.io.

The backend is a Rust project and the frontend is a React/TypeScript project.

Substrate-based nodes can be connected to an arbitrary Telemetry backend using the --telemetry-url flag (see below for more complete instructions on how to get this up and running).

Messages

Depending on the configured verbosity, substrate nodes will send different types of messages to the Telemetry server. Verbosity level 0 is sufficient to provide the Telemetry server with almost all of the node information needed. Using this verbosity level will lead to the substrate node sending the following message types to Telemetry:

system.connected
system.interval
block.import
notify.finalized

Increasing the verbosity level to 1 will lead to additional "consensus info" messages being sent, one of which has the identifier:

afg.authority_set

We use this message to populate the "validator address" field, if applicable.

Increasing the verbosity level beyond 1 is unnecessary, and will not result in any additional messages that Telemetry can handle (but other metric gathering systems might find them useful).
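For example, a dev node can be started at verbosity level 1 so that it also sends the "consensus info" messages (this assumes the default shard address used in "Terminal 4 - Node" below):

```shell
# The trailing "1" is the verbosity level, passed as part of the
# --telemetry-url argument (hence the quotes).
polkadot --dev --telemetry-url 'ws://localhost:8001/submit 1'
```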

Getting Started

To run the backend, you will need cargo to build the binary. We recommend using rustup.

To run the frontend, make sure to grab the latest stable version of node and install dependencies before doing anything:

nvm install stable
(cd frontend && npm install)

Terminal 1 & 2 - Backend

Build the backend binaries by running the following:

cd backend
cargo build --release

And then, in two different terminals, run:

./target/release/telemetry_core

and

./target/release/telemetry_shard

Use --help on either binary to see the available options.

By default, telemetry_core will listen on 127.0.0.1:8000, and telemetry_shard will listen on 127.0.0.1:8001, and expect the telemetry_core to be listening on its default address. To listen on different addresses, use the --listen option on either binary, for example --listen 0.0.0.0:8000. The telemetry_shard also needs to be told where the core is, so if the core is configured with --listen 127.0.0.1:9090, remember to pass --core 127.0.0.1:9090 to the shard, too.
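For example, to run the core on a non-default address and point a shard at it (the addresses here are illustrative):

```shell
# Core listening on a custom port:
./target/release/telemetry_core --listen 127.0.0.1:9090

# Shard exposed to external nodes, told where to find the core:
./target/release/telemetry_shard --listen 0.0.0.0:8001 --core 127.0.0.1:9090
```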

Terminal 3 - Frontend

cd frontend
npm install
npm run start

Once this is running, you'll be able to navigate to http://localhost:3000 to view the UI.

Terminal 4 - Node

Follow the installation instructions from the Polkadot repo.

If you started the backend binaries with their default arguments, you can connect a node to the shard by running:

polkadot --dev --telemetry-url 'ws://localhost:8001/submit 0'

Note: The "0" at the end of the URL is a verbosity level, and not part of the URL itself. Verbosity levels range from 0-9, with 0 denoting the lowest verbosity. The URL and this verbosity level are parts of a single argument and must therefore be surrounded in quotes (as seen above) in order to be treated as such by your shell.

Docker

Building images

To build the backend docker image, navigate into the backend folder of this repository and run:

docker build -t parity/substrate-telemetry-backend .

The backend image contains both the telemetry_core and telemetry_shard binaries.

To build the frontend docker image, navigate into the frontend folder and run:

docker build -t parity/substrate-telemetry-frontend .

Run the backend and frontend using docker-compose

The easiest way to run the backend and frontend images is to use docker-compose. To do this, run docker-compose up in the root of this repository to build and run the images. Once running, you can view the UI by navigating a browser to http://localhost:3000.

To connect a substrate node and have it send telemetry to this running instance, you have to tell it where to send telemetry by appending the argument --telemetry-url 'ws://localhost:8001/submit 0' (see "Terminal 4 - Node" above).

Run the backend and frontend using docker

If you'd like to get things running manually using Docker, you can do the following. This assumes that you've built the images as per the above, and have two images named parity/substrate-telemetry-backend and parity/substrate-telemetry-frontend.

  1. Create a new shared network so that the various containers can communicate with each other:

    docker network create telemetry
    
  2. Start up the backend core process. We expose port 8000 so that a UI running in a host browser can connect to the /feed endpoint.

    docker run --rm -it --network=telemetry \
        --name backend-core \
        -p 8000:8000 \
        --read-only \
        parity/substrate-telemetry-backend \
        telemetry_core -l 0.0.0.0:8000
    
  3. In another terminal, start up the backend shard process. We tell it where it can reach the core to send messages (possible because it has been started on the same network), and we listen on and expose port 8001 so that nodes running in the host can connect and send telemetry to it.

    docker run --rm -it --network=telemetry \
        --name backend-shard \
        -p 8001:8001 \
        --read-only \
        parity/substrate-telemetry-backend \
        telemetry_shard -l 0.0.0.0:8001 -c http://backend-core:8000/shard_submit
    
  4. In another terminal, start up the frontend server. We pass a SUBSTRATE_TELEMETRY_URL env var to tell the UI how to connect to the core process to receive telemetry. This is relative to the host machine, since that is where the browser and UI will be running.

    docker run --rm -it --network=telemetry \
        --name frontend \
        -p 3000:8000 \
        -e SUBSTRATE_TELEMETRY_URL=ws://localhost:8000/feed \
        parity/substrate-telemetry-frontend
    

    NOTE: Here we used SUBSTRATE_TELEMETRY_URL=ws://localhost:8000/feed. This will work if you test with everything running locally on your machine but NOT if your backend runs on a remote server. Keep in mind that the frontend docker image serves a static site that runs in your browser. SUBSTRATE_TELEMETRY_URL is the WebSocket URL that your browser will use to reach the backend. If your backend runs on a remote server at foo.example.com, you will need to set SUBSTRATE_TELEMETRY_URL accordingly (in this case, to ws://foo.example.com/feed).

    NOTE: Running the frontend container in read-only mode reduces the attack surface that could be used to exploit the container. However, it requires a little more effort, mounting additional volumes as shown below:

    docker run --rm -it -p 80:8000 --name frontend \
       -e SUBSTRATE_TELEMETRY_URL=ws://localhost:8000/feed \
       --tmpfs /var/cache/nginx:uid=101,gid=101 \
       --tmpfs /var/run:uid=101,gid=101 \
       --tmpfs /app/tmp:uid=101,gid=101 \
       --read-only \
       parity/substrate-telemetry-frontend
    

With these running, you'll be able to navigate to http://localhost:3000 to view the UI. If you'd like to connect a node and have it send telemetry to your running shard, you can run the following:

docker run --rm -it --network=telemetry \
  --name substrate \
  -p 9944:9944 \
  chevdor/substrate \
  substrate --dev --telemetry-url 'ws://backend-shard:8001/submit 0'

You should now see your node showing up in your local telemetry frontend:

image

Deployment

This section covers the internal deployment of Substrate Telemetry to our staging and live environments.

Deployment to staging

Every time new code is merged to master, a new version of telemetry will be automatically built and deployed to our staging environment, so there is nothing that you need to do.

Deployment to live

Once we're happy with things in staging, we can do a deployment to live as follows:

  1. Ensure that the PRs you'd like to deploy are merged to master.
  2. Tag the commit on master that you'd like to deploy with the form v1.0-a1b2c3d.
    • The version number (1.0 here) should just be incremented from whatever the latest version found using git tag is. We don't use semantic versioning or anything like that; this is just a dumb "increment version number" approach so that we can see clearly what we've deployed to live and in what order.
    • The suffix is a short git commit hash (which can be generated with git rev-parse --short HEAD), just so that it's really easy to relate the built docker images back to the corresponding code.
  3. Pushing the tag (eg git push origin v1.0-a1b2c3d) will kick off the deployment process, which in this case will also lead to new docker images being built. You can view the progress at https://gitlab.parity.io/parity/substrate-telemetry/-/pipelines.
  4. Once a deployment to staging has been successful, run whatever tests you need against the staging deployment to convince yourself that you're happy with it.
  5. Visit the CI/CD pipelines page again (URL above) and click the "play" button on the "Deploy-production" stage to perform the deployment to live.
  6. Confirm that things are working once the deployment has finished by visiting https://telemetry.polkadot.io/.
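Steps 2-3 above can be sketched as follows (the version number is illustrative; pick the next one after whatever git tag reports):

```shell
# Tag the current master commit and push the tag to trigger deployment.
git checkout master && git pull
TAG="v1.1-$(git rev-parse --short HEAD)"
git tag "${TAG}"
git push origin "${TAG}"
```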

Rolling back to a previous deployment

If something goes wrong running the above, we can roll back the deployment to live as follows.

  1. Decide what image tag you'd like to roll back to. Go to https://hub.docker.com/r/parity/substrate-telemetry-backend/tags?page=1&ordering=last_updated and have a look at the available tags (eg v1.0-a1b2c3d) to select one. You can cross-reference this with the tags available via git tag in the repository to help see which tags correspond to which code changes.
  2. Navigate to https://gitlab.parity.io/parity/substrate-telemetry/-/pipelines/new.
  3. Add a variable called FORCE_DEPLOY with the value true.
  4. Add a variable called FORCE_DOCKER_TAG with a value corresponding to the tag you want to deploy, eg v1.0-a1b2c3d. Images must exist already for this tag.
  5. Hit 'Run Pipeline'. As above, a deployment to staging will be carried out, and if you're happy with that, you can hit the "play" button on the "Deploy-production" stage to perform the deployment to live.
  6. Confirm that things are working once the deployment has finished by visiting https://telemetry.polkadot.io/.

substrate-telemetry's People

Contributors

akru, alvicsam, arshamteymouri, aurevoirxavier, bulatsaif, chevdor, cmichi, cybai, dependabot[bot], dvdplm, gilescope, gterzian, jacogr, jsdw, lexnv, lovelaced, ltfschoen, lurpis, maciejhirsz, niklasad1, pmespresso, q9f, romanb, s3krit, sergejparity, simonljs, stefashkaa, tadeohepperle, woss, xanewok


substrate-telemetry's Issues

Allow defining a list of watched nodes

As the list grows, telemetry is a nice way to monitor nodes of interest.
As a user, I would like to define some 'watched' nodes that stay pinned at the top of the list.

Action required: Greenkeeper could not be activated 🚨

🚨 You need to enable Continuous Integration on all branches of this repository. 🚨

To enable Greenkeeper, you need to make sure that a commit status is reported on all branches. This is required by Greenkeeper because it uses your CI build statuses to figure out when to notify you about breaking changes.

Since we didn’t receive a CI status on the greenkeeper/initial branch, it’s possible that you don’t have CI set up yet. We recommend using Travis CI, but Greenkeeper will work with every other CI service as well.

If you have already set up a CI for this repository, you might need to check how it’s configured. Make sure it is set to run on all new branches. If you don’t want it to run on absolutely every branch, you can whitelist branches starting with greenkeeper/.

Once you have installed and configured CI on this repository correctly, you’ll need to re-trigger Greenkeeper’s initial pull request. To do this, please delete the greenkeeper/initial branch in this repository, and then remove and re-add this repository to the Greenkeeper App’s white list on Github. You'll find this list on your repo or organization’s settings page, under Installed GitHub Apps.

Implementation versions for ChainX

When I view the ChainX tabs in Telemetry, the "Implementation" version tabs have values as shown in below screenshots (i.e. ? or ? CARGO_PKG....):

screen shot 2019-02-03 at 1 08 40 pm

screen shot 2019-02-03 at 1 08 14 pm

Group nodes by origin

New import format:

{"msg":"block.import","level":"INFO","ts":"2018-07-13T17:22:05.789520968+02:00","origin":"NetworkInitialSync","best":"a7db922a8f98d182fedc3bb4649fb96cf7e1825ef300137009624bfa5c6d4fed","height":20784}

We should use origin to sort nodes. Additionally, if origin is NetworkInitialSync the node should be held in a limbo somewhere...


Fix FE tests

We lost CI after moving the repo to the paritytech org. At some point during that migration, jest on the FE stopped working; we need to fix that.


Downgrade best block on chain if node that produced it left

When removing a node, check whether that node is at the highest block; if so, reset the best block down to the highest remaining node.

Pitfall to avoid: this can be O(n) for every removal; the search through the list should terminate at the first node that matches the current best block.
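A hypothetical TypeScript sketch of that early-terminating search (the types and names here are illustrative, not the actual codebase):

```typescript
// Illustrative sketch of the proposed best-block downgrade logic.
interface ChainNode {
  name: string;
  height: number;
}

// After removing `removed`, recompute the chain's best height. If any
// remaining node is still at the old best we stop scanning immediately,
// keeping the common case cheap instead of always paying the full O(n).
function bestAfterRemoval(
  nodes: ChainNode[],
  removed: ChainNode,
  currentBest: number
): number {
  if (removed.height < currentBest) {
    return currentBest; // removed node was not at the tip; nothing to do
  }
  let best = 0;
  for (const node of nodes) {
    if (node.height === currentBest) {
      return currentBest; // another node is at the tip; terminate early
    }
    best = Math.max(best, node.height);
  }
  return best; // downgraded best block
}
```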

Display finalized block number

Would be useful to see how GRANDPA is progressing.
Longer-term it would be awesome to have telemetry/visualization of the GRANDPA voting.


Frontend does not build

I tried following the README using node 10 and 9.6 for #29.
No problem with the backend but starting the frontend fails:

$ cd packages/frontend/
12:15 will@KI-2773 frontend (will-docker)*$ yarn build
yarn run v1.7.0
$ react-scripts-ts build
Creating an optimized production build...
Starting type checking and linting service...
Using 1 worker with 2048MB memory limit
ts-loader: Using [email protected] and /Users/will/.../dotstats/packages/frontend/tsconfig.prod.json
Failed to compile.

/Users/will/.../dotstats/node_modules/@types/react-dom/index.d.ts
(19,39): Cannot use namespace 'ReactInstance' as a type.


error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

Connections aren't properly pruned?

From live server logs:

[System] 6788 open telemetry connections; 4768 open feed connections

There is no way we have 6788 connected nodes and I doubt we have that many people looking at telemetry at the same time.

Not particularly critical as the server is still handling traffic well with resources being well in control (360mb RAM, CPU use peaks at 4%, mostly 0)

A punished validator node still shows as a validator node in telemetry

Scenario

  • Create a full node using docker.
  • Create an account.
  • Restart the full node as a validator node with the key.
  • Stake the account.
  • Wait till it becomes a validator.

Observation

  • After some time, check the validators section on the staking overview.
  • You may find the newly added validator is no longer present.
  • Telemetry will still show the node as a validator node.

Note

The newly added validator node is kicked out of the network after 15 or 30 minutes.
This happens most of the time.


"% CPU Use" column displays 100x higher value than "% CPU" shown in macOS Activity Monitor and Memory Use is about 1.4x higher

On macOS 10.13.6, with rustc --version rustc 1.29.0 (aa3ca1994 2018-09-11) and ./target/debug/polkadot --version polkadot 0.3.0-d12426b6-x86_64-macos

I started syncing my node with:

./target/debug/polkadot --validator \
  --chain krummelanke \
  --execution both \
  --name "AUSSIE STAKE! 🔥🔥🔥" \
  --port 30333 \
  --pruning 256 \
  --rpc-port 9933 \
  --telemetry-url ws://telemetry.polkadot.io:1024 \
  --ws-port 9944

Then when I went to https://telemetry.polkadot.io/# the "% CPU Use" column displayed ~19000, whereas Activity Monitor on macOS showed the "polkadot" process using only ~190, so it appears to be showing a value 100x larger than it should be. See screenshot below.

CPU Use (Polkadot Dotstats vs macOS Activity Monitor)
screen shot 2018-09-23 at 7 33 17 pm

Memory Use (Polkadot Dotstats vs macOS Activity Monitor)
screen shot 2018-09-23 at 7 41 18 pm

Memory Use is always shown about 1.4x higher on Polkadot Dotstats vs macOS Activity Monitor

Move all text filter logic to the Filter component

To further reduce the size of Chain component and make the interactions between components more isolated.

We could (should?) also put the Filter directly into List and Map instead of having it hanging in Chain.

Errors building dotstats common package

I've modified the common package to export another utility function, but I need to build the common package in order for other files to import from it.

When I try and build the common package by running yarn build:common, I get the following output:

$ yarn build:common
yarn run v1.7.0
$ tsc -p packages/common
error TS5055: Cannot write file '/Users/Me/code/blockchain/clones/paritytech/dotstats/packages/common/build/feed.d.ts' because it would overwrite input file.

error TS5055: Cannot write file '/Users/Me/code/blockchain/clones/paritytech/dotstats/packages/common/build/helpers.d.ts' because it would overwrite input file.

error TS5055: Cannot write file '/Users/Me/code/blockchain/clones/paritytech/dotstats/packages/common/build/id.d.ts' because it would overwrite input file.

error TS5055: Cannot write file '/Users/Me/code/blockchain/clones/paritytech/dotstats/packages/common/build/index.d.ts' because it would overwrite input file.

error TS5055: Cannot write file '/Users/Me/code/blockchain/clones/paritytech/dotstats/packages/common/build/types.d.ts' because it would overwrite input file.

error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

If I remove "declaration": true from packages/common/tsconfig.json no errors are produced but I still can't import the new exported functions. I also tried adding the following to the tsconfig.json (if this makes any sense) but it still produces the errors:

"exclude": [
  "build/**/*.d.ts"
]

Input validation

  • Limit node name to 64 UTF-16 code points or Unicode characters.
  • Make sure all values make sense (hashes are actually hashes, number values are correct numbers).
  • Have some heuristics to kick misbehaving nodes.
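The first two checks above could be sketched as follows (the helper names are illustrative, not an actual API; the hash format follows the block.import example shown elsewhere in this document):

```typescript
// Illustrative validation helpers; not the actual telemetry backend code.

// Limit a node name to 64 Unicode code points (string spread iterates
// by code point, so surrogate pairs count as one character).
function truncateNodeName(name: string, limit = 64): string {
  return [...name].slice(0, limit).join("");
}

// A block hash should be a 32-byte hex string, with or without 0x prefix.
function isValidHash(hash: string): boolean {
  return /^(0x)?[0-9a-fA-F]{64}$/.test(hash);
}

// A block height should be a non-negative integer.
function isValidHeight(height: unknown): boolean {
  return typeof height === "number" && Number.isInteger(height) && height >= 0;
}
```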

Better `timeDiff`

The server sends a timestamp to the client every 10 seconds; the difference between local time and server time is then used to provide better values for block time events (particularly the "Ns ago" ones).

Instead of re-calculating the value based on most recent server time, given enough data has been sent to the client, the client should store up to 10 most recent timeDiffs, filter extremes (e.g.: ignore 2 highest and 2 lowest values) and then use the average of the remaining values.
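A minimal TypeScript sketch of the proposed scheme (names are hypothetical, not the actual frontend code): keep up to the 10 most recent timeDiffs, drop the 2 highest and 2 lowest, and average the rest.

```typescript
// Hypothetical sketch of the proposed timeDiff smoothing.
const MAX_SAMPLES = 10;
const TRIM = 2; // number of extremes to drop from each end

const samples: number[] = [];

// Called whenever the server sends its timestamp (every ~10s).
function recordTimeDiff(serverTime: number, localTime: number): void {
  samples.push(localTime - serverTime);
  if (samples.length > MAX_SAMPLES) {
    samples.shift(); // keep only the most recent MAX_SAMPLES
  }
}

// Average after filtering extremes; falls back to a plain average
// until enough samples have accumulated.
function currentTimeDiff(): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const trimmed =
    sorted.length > 2 * TRIM ? sorted.slice(TRIM, sorted.length - TRIM) : sorted;
  return trimmed.reduce((sum, d) => sum + d, 0) / trimmed.length;
}
```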

Average block time is always null and so is not appearing in UI

If I run a node and it's generating blocks, it sends message.payload to the front-end in https://github.com/polkadot-js/dotstats/blob/master/packages/frontend/src/Connection.ts#L102, but in BestBlock the message.payload is [820468, 1532502970369, null], where the last element in the array is the averageBlockTime sent from the backend in https://github.com/polkadot-js/dotstats/blob/master/packages/backend/src/Chain.ts#L59, whose default value is null. Should it be changing?
