casper-node-launcher's People

Contributors

bors[bot], bradjohnl, casperlabs-bors-ng[bot], fraser999, georgepisaltu, ihor, rafal-ch, sacherjj, tomvasile

casper-node-launcher's Issues

Support "walk-back" error code in new join during staged upgrade scenario

As of 1.5, when a new node instance starts, the launcher runs the highest protocol version (pv) binary found locally instead of the lowest. This is normally desirable. However, in the special case where a new node has both the current pv binary and a staged next-version pv binary present locally, it is undesirable, because the network peers running the current pv binary won't handshake with the new node running the higher pv binary. This leaves the new node stranded and unable to join the network (until the network transitions through the upgrade naturally). This can be worked around via manual intervention or custom scripting, but that is poor operator UX and puts a recurring burden on release management and operator support staff.
The desired functionality is as follows: when a new joining node (defined as a node with no local blocks) attempts to peer with its trusted nodes (defined as the IPs in the config file's known addresses) and all attempted handshakes fail because the trusted nodes are running a lower protocol version than the new node, the new joining node should shut down and emit the exit code that prompts the launcher to look for and run the next lowest pv binary.
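
The requested decision can be sketched roughly as below. This is only an illustration of the logic described above, not the launcher's or the node's actual code: the function names, the "peer_version_lower" result label, and the directory-style version strings such as 1_5_2 are assumptions made for the example.

```python
def parse_pv(name):
    """Turn a directory-style version such as '1_5_2' into a tuple (1, 5, 2)."""
    return tuple(int(part) for part in name.split("_"))


def next_lowest_version(installed, current):
    """Return the highest installed version strictly below `current`, or None."""
    lower = [v for v in installed if parse_pv(v) < parse_pv(current)]
    return max(lower, key=parse_pv) if lower else None


def should_walk_back(has_local_blocks, handshake_results):
    """True only for a new node (no local blocks) whose every trusted-peer
    handshake failed because the peer runs a lower protocol version.

    `handshake_results` is a hypothetical list of outcome labels, one per
    trusted node; 'peer_version_lower' is an assumed label.
    """
    if has_local_blocks or not handshake_results:
        return False
    return all(result == "peer_version_lower" for result in handshake_results)


if __name__ == "__main__":
    installed = ["1_0_0", "1_4_15", "1_5_2"]
    if should_walk_back(False, ["peer_version_lower", "peer_version_lower"]):
        print("walk back to", next_lowest_version(installed, "1_5_2"))
```

Comparing versions as integer tuples avoids the string-ordering trap where "1_4_15" would sort after "1_4_2".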

Installed binaries don't match config

I didn't have any problems installing the node and running the commands up to the point where I start the node-launcher service.

I get an error in the log files, something like:
"Installed binary version (1_0_0) don't match installed configs (1_0_0, 1_1_0, 1_1_2 etc..)"
I'm on an ESXi VM, but the installed Ubuntu is 20.04.

When I try to curl the debug_file on cnm labs, the VM just crashes. It's not a resource problem, though, since the VM has 64 GB of RAM and 12 cores.

This doesn't seem like an issue on my end, but I could be wrong. I already asked in Discord and Telegram, without any results.
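
One way to narrow down a mismatch like this is to list which versions exist as binaries and which exist as configs. The sketch below assumes that binaries and configs are kept in per-version directories named like 1_0_0 under two separate roots. That layout, and the function names, are assumptions for illustration, so both roots are passed in as arguments.

```python
import re
from pathlib import Path

# Directory names of the form <major>_<minor>_<patch>, e.g. 1_0_0.
VERSION_DIR = re.compile(r"^\d+_\d+_\d+$")


def version_dirs(root):
    """Return the set of version-named subdirectories under `root`."""
    root = Path(root)
    if not root.is_dir():
        return set()
    return {
        p.name for p in root.iterdir()
        if p.is_dir() and VERSION_DIR.match(p.name)
    }


def compare_versions(bin_root, config_root):
    """Return (configs without a binary, binaries without a config)."""
    bins = version_dirs(bin_root)
    configs = version_dirs(config_root)
    return sorted(configs - bins), sorted(bins - configs)
```

With only 1_0_0 under the binary root and 1_0_0, 1_1_0 and 1_1_2 under the config root, the first list would contain 1_1_0 and 1_1_2, which matches the shape of the error quoted above.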

Update node_util.py fix_permissions and check_permissions to handle file/dir not found error

The node_util.py commands fix_permissions and check_permissions do not actually need the '/var/lib/casper/casper-node' path to do their work.

However, when a user runs either command and the directory is missing, the following error is shown:

Traceback (most recent call last):
  File "./node_util.py", line 787, in <module>
    NodeUtil()
  File "./node_util.py", line 69, in __init__
    getattr(self, args.command)()
  File "./node_util.py", line 521, in fix_permissions
    for path in self._walk_file_locations():
  File "./node_util.py", line 491, in _walk_file_locations
    for _path in NodeUtil._walk_path(path):
  File "./node_util.py", line 496, in _walk_path
    for p in Path(path).iterdir():
  File "/usr/lib/python3.8/pathlib.py", line 1122, in iterdir
    for name in self._accessor.listdir(self):
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/casper/casper-node'

An update is required to the following effect:

  • This path needs to be iterated to fix cases where DB migrations were done with the wrong permissions, but its absence should not be fatal.
  • The path is not actually needed to fix and check the key permissions. The commands work if you create an empty folder at that path.
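
A tolerant directory walk along these lines might look like the sketch below. The function walk_path is a hypothetical stand-in written for this example, not the actual NodeUtil._walk_path, whose signature and behaviour are not shown in this issue.

```python
from pathlib import Path


def walk_path(path):
    """Yield `path` and everything beneath it, or nothing if it is missing.

    A missing directory is treated as "nothing to fix" instead of raising
    FileNotFoundError from Path.iterdir().
    """
    root = Path(path)
    if not root.exists():
        return
    yield root
    if root.is_dir():
        for child in sorted(root.iterdir()):
            yield from walk_path(child)


if __name__ == "__main__":
    # On a host without this directory the loop simply does nothing.
    for entry in walk_path("/var/lib/casper/casper-node"):
        print(entry)
```

Checking for existence up front keeps the permission commands usable on a fresh install, while still walking the directory once it exists.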

Is it syncing?

May I know whether the node is syncing or not?

Tail of the log file:

"state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.196611Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-08-30T02:11:14.222507Z","level":"INFO","fields":{"message":"new outgoing connection established"},"target":"casper_node::components::network","span":{"addr":"99.81.225.72:35000","consensus_key":"PubKey::Ed25519(5dfd..8a21)","peer_id":"tls:352d..2e46","state":"connected","name":"outgoing"},"spans":[{"addr":"99.81.225.72:35000","consensus_key":"PubKey::Ed25519(5dfd..8a21)","peer_id":"tls:352d..2e46","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.222530Z","level":"INFO","fields":{"message":"established outgoing connection"},"target":"casper_node::components::network::outgoing","span":{"addr":"99.81.225.72:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"99.81.225.72:35000","consensus_key":"PubKey::Ed25519(5dfd..8a21)","peer_id":"tls:352d..2e46","state":"connected","name":"outgoing"},{"addr":"99.81.225.72:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.249432Z","level":"INFO","fields":{"message":"outgoing connection failed","err":"peer is running incompatible version: 1.4.15"},"target":"casper_node::components::network::outgoing","span":{"addr":"13.51.218.68:35000","state":"connecting(8)","name":"outgoing"},"spans":[{"addr":"13.51.218.68:35000","peer_id":"tls:8ba6..c648","state":"waiting(8)","name":"outgoing"},{"addr":"13.51.218.68:35000","state":"connecting(8)","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.447660Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-08-30T02:11:14.492891Z","level":"INFO","fields":{"message":"disconnecting after ping retries were exhausted"},"target":"casper_node::components::network::outgoing","span":{"addr":"15.235.53.233:35000","peer_id":"tls:980c..1e0d","state":"connected","name":"outgoing"},"spans":[{"addr":"15.235.53.233:35000","peer_id":"tls:980c..1e0d","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.492941Z","level":"INFO","fields":{"message":"unforgettable address reset"},"target":"casper_node::components::network::outgoing","span":{"addr":"13.51.218.68:35000","state":"waiting(9)","name":"outgoing"},"spans":[{"addr":"13.51.218.68:35000","state":"waiting(9)","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.492954Z","level":"INFO","fields":{"message":"disconnecting after ping retries were exhausted"},"target":"casper_node::components::network::outgoing","span":{"addr":"46.101.61.107:35000","peer_id":"tls:2d70..f8c6","state":"connected","name":"outgoing"},"spans":[{"addr":"46.101.61.107:35000","peer_id":"tls:2d70..f8c6","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.493127Z","level":"WARN","fields":{"message":"unexpected drop notification"},"target":"casper_node::components::network::outgoing","span":{"addr":"15.235.53.233:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"15.235.53.233:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.493148Z","level":"WARN","fields":{"message":"unexpected drop notification"},"target":"casper_node::components::network::outgoing","span":{"addr":"46.101.61.107:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"46.101.61.107:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-08-30T02:11:14.698982Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-08-30T02:11:14.950119Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-08-30T02:11:15.200587Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-08-30T02:11:15.248809Z","level":"INFO","fields":{"message":"outgoing connection failed","err":"peer is running incompatible version: 1.4.15"},"target":"casper_node::components::network::outgoing","span":{"addr":"13.51.218.68:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"13.51.218.68:35000","peer_id":"tls:8ba6..c648","state":"waiting(9)","name":"outgoing"},{"addr":"13.51.218.68:35000","state":"connecting(0)","name":"outgoing"}]}

/etc/casper/node_util.py watch

Every 5.0s: /etc/casper/node_util.py node_status ; /etc/casper/node_util.py rpc_active; /etc/casper/node_util.py systemd_status                                                 44f5b87d99ed: Wed Aug 30 02:13:54 2023

Peer Count: 32
Uptime: 15h 38m 5s 525ms
Build: 1.5.2-86b7013
Key: 01b0a7dc4634a18ac84ace7b12c3c333b92644e7d163438babc5ad1ffa91c5e863
Next Upgrade: None

RPC: Not Ready

casper-node-launcher.service - Casper Node Launcher
    Loaded: loaded (/usr/lib/systemd/system/casper-node-launcher.service, enabled)
    Active: active (running)
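
Logs like the ones above are easier to read once they are tallied by message. The following is a small sketch that assumes one JSON object per line with the "fields", "message" and "err" keys seen in the excerpt. The function name is made up for the example, and lines that are not complete JSON objects (such as the truncated first line) are skipped.

```python
import json
from collections import Counter

INCOMPATIBLE_PREFIX = "peer is running incompatible version: "


def summarize_log(lines):
    """Count log messages and the peer versions reported as incompatible."""
    messages = Counter()
    incompatible = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # partial or truncated line
        if not isinstance(entry, dict):
            continue
        fields = entry.get("fields", {})
        messages[fields.get("message", "")] += 1
        err = fields.get("err", "")
        if err.startswith(INCOMPATIBLE_PREFIX):
            incompatible[err[len(INCOMPATIBLE_PREFIX):]] += 1
    return messages, incompatible
```

Run over the excerpt above, the second counter would show peers reporting version 1.4.15 while this node's build is 1.5.2.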

Failure when using the Dockerfile

When I try to run a node with the Dockerfile, it shows the message below,
even though the version I used is already 1_0_0.

"/root/pull_casper_node_version.sh", "1_0_0", "mainnet" ,"/root/casper/config_from_example.sh" "1_0_0"
This script is deprecated and will be removed.
Use node_util.py stage_protocols.

Error: Illegal semver format. Please use <major>_<minor>_<patch> such as 1_0_0.
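
For reference, the format named in the error message can be checked with a short sketch like the one below. It is not the script's own validation, which is not shown here, and the function name is an assumption. Under this check a value that still carries stray quotes or a trailing comma, such as '"1_0_0",', would be rejected, which may be worth ruling out when arguments are passed through a Dockerfile.

```python
import re

# <major>_<minor>_<patch>, for example 1_0_0
SEMVER_DIR = re.compile(r"(\d+)_(\d+)_(\d+)")


def parse_protocol_version(text):
    """Return (major, minor, patch) or raise ValueError for any other shape."""
    match = SEMVER_DIR.fullmatch(text.strip())
    if match is None:
        raise ValueError(
            f"Illegal semver format: {text!r}. "
            "Use <major>_<minor>_<patch> such as 1_0_0."
        )
    return tuple(int(group) for group in match.groups())
```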

Node installation process is stuck

Could someone help take a look at this issue? I tried to set up a data node, following the steps in the official documentation. After I ran sudo /etc/casper/node_util.py start, I got a lot of error output, shown below. The node has now been stuck here for a long time. Are these outputs normal, or am I missing something?

{"timestamp":"2023-09-22T08:07:55.207417Z","level":"INFO","fields":{"message":"disconnecting after ping retries were exhausted"},"target":"casper_node::components::network::outgoing","span":{"addr":"18.138.162.194:35000","peer_id":"tls:372b..a9e1","state":"connected","name":"outgoing"},"spans":[{"addr":"18.138.162.194:35000","peer_id":"tls:372b..a9e1","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:55.207618Z","level":"WARN","fields":{"message":"unexpected drop notification"},"target":"casper_node::components::network::outgoing","span":{"addr":"18.138.162.194:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"18.138.162.194:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:55.257068Z","level":"INFO","fields":{"message":"new outgoing connection established"},"target":"casper_node::components::network","span":{"addr":"18.138.162.194:35000","consensus_key":"PubKey::Ed25519(f1c5..cec5)","peer_id":"tls:372b..a9e1","state":"connected","name":"outgoing"},"spans":[{"addr":"18.138.162.194:35000","consensus_key":"PubKey::Ed25519(f1c5..cec5)","peer_id":"tls:372b..a9e1","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:55.257135Z","level":"INFO","fields":{"message":"established outgoing connection"},"target":"casper_node::components::network::outgoing","span":{"addr":"18.138.162.194:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"18.138.162.194:35000","consensus_key":"PubKey::Ed25519(f1c5..cec5)","peer_id":"tls:372b..a9e1","state":"connected","name":"outgoing"},{"addr":"18.138.162.194:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:55.335358Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-09-22T08:07:55.586740Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-09-22T08:07:55.838128Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-09-22T08:07:56.089504Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}
{"timestamp":"2023-09-22T08:07:56.209137Z","level":"INFO","fields":{"message":"disconnecting after ping retries were exhausted"},"target":"casper_node::components::network::outgoing","span":{"addr":"54.254.58.185:35000","peer_id":"tls:a7e7..36dc","state":"connected","name":"outgoing"},"spans":[{"addr":"54.254.58.185:35000","peer_id":"tls:a7e7..36dc","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:56.210505Z","level":"WARN","fields":{"message":"unexpected drop notification"},"target":"casper_node::components::network::outgoing","span":{"addr":"54.254.58.185:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"54.254.58.185:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:56.258797Z","level":"INFO","fields":{"message":"new outgoing connection established"},"target":"casper_node::components::network","span":{"addr":"54.254.58.185:35000","consensus_key":"PubKey::Ed25519(df22..d0ee)","peer_id":"tls:a7e7..36dc","state":"connected","name":"outgoing"},"spans":[{"addr":"54.254.58.185:35000","consensus_key":"PubKey::Ed25519(df22..d0ee)","peer_id":"tls:a7e7..36dc","state":"connected","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:56.258854Z","level":"INFO","fields":{"message":"established outgoing connection"},"target":"casper_node::components::network::outgoing","span":{"addr":"54.254.58.185:35000","state":"connecting(0)","name":"outgoing"},"spans":[{"addr":"54.254.58.185:35000","consensus_key":"PubKey::Ed25519(df22..d0ee)","peer_id":"tls:a7e7..36dc","state":"connected","name":"outgoing"},{"addr":"54.254.58.185:35000","state":"connecting(0)","name":"outgoing"}]}
{"timestamp":"2023-09-22T08:07:56.341108Z","level":"INFO","fields":{"message":"Initialize: awaiting sufficient fully-connected peers"},"target":"casper_node::reactor::main_reactor::control"}

Here is the status output from curl -s http://127.0.0.1:8888/status | jq -r :

{
  "peers": [
    {
      "node_id": "tls:0177..8046",
      "address": "93.190.141.13:35000"
    },
    {
      "node_id": "tls:03b8..6389",
      "address": "51.158.202.24:35000"
    }
  ],
  "api_version": "1.5.2",
  "build_version": "1.5.2-86b7013",
  "chainspec_name": "casper-test",
  "starting_state_root_hash": "0000000000000000000000000000000000000000000000000000000000000000",
  "last_added_block_info": null,
  "our_public_signing_key": "01be062ce2f3f35c5d82560c7c30f2786c47342cfcc0e5dcbf7937c3e7851a0d45",
  "round_length": null,
  "next_upgrade": null,
  "uptime": "14m 40s 305ms",
  "reactor_state": "Initialize",
  "last_progress": "2023-09-22T08:00:11.209Z",
  "available_block_range": {
    "low": 0,
    "high": 0
  },
  "block_sync": {
    "historical": null,
    "forward": null
  }
}
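
The fields in this status output can be condensed into a few yes/no answers. The sketch below uses only keys that appear in the JSON above. The function name and the interpretation in the comments are assumptions, not an official definition of the reactor states.

```python
def describe_status(status):
    """Summarize a /status response that has already been parsed into a dict."""
    block_sync = status.get("block_sync") or {}
    return {
        "reactor_state": status.get("reactor_state"),
        "peer_count": len(status.get("peers") or []),
        # null last_added_block_info means no block has been added locally yet
        "has_blocks": status.get("last_added_block_info") is not None,
        # both sync entries null suggests no block sync is currently running
        "sync_in_progress": any(
            block_sync.get(key) is not None for key in ("historical", "forward")
        ),
    }
```

For the output above this would give reactor_state "Initialize", two peers, no blocks and no sync in progress, which is consistent with the repeated "awaiting sufficient fully-connected peers" log lines.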

Feature Request: Add update mode without restarting the casper-node

Currently, if an update for casper-node-launcher is available via apt update, the upgrade also completely stops and restarts the casper-node process.

I think this is unnecessary and should be avoided to keep the uptime of the node uninterrupted.

If I remember correctly, when I upgraded the casper-node-launcher about 2 or 3 months ago, it even killed the casper-node process ungracefully, causing a trie-state check. I'm not sure about this, though, and I don't want to test it for obvious reasons ;-)

Is an upgrade mode possible?

node_util.py + config_from_example.sh should avoid IPv6 for IP detection

The following service is one of the services used to detect the external IP address:

$ nslookup ident.me
Server:         127.0.0.53
Address:        127.0.0.53#53

Non-authoritative answer:
Name:   ident.me
Address: 49.12.234.183
Name:   ident.me
Address: 2a01:4f8:c0c:bd0a::1

It is the only one that also resolves to an IPv6 address.

node_util.py seems to try IPv6 first, but on hosts not configured for IPv6 the DNS lookup can take quite some time, so it appears to be stuck.

config_from_example.sh uses the same service, but it is marked as deprecated and uses curl.

How to fix:
curl can easily be changed to force IPv4 only:

curl -4 https://ident.me

I'm not a Python expert and didn't find a quick and clean solution.
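
On the Python side, one possible starting point is to restrict name resolution to IPv4 with the standard library. The sketch below only covers the lookup step. It is not taken from node_util.py, the function name is made up, and the resolver parameter exists only so the function can be exercised without network access.

```python
import socket


def resolve_ipv4(host, port=443, resolver=socket.getaddrinfo):
    """Return the IPv4 addresses for `host`, never asking for IPv6 records."""
    infos = resolver(host, port, family=socket.AF_INET, type=socket.SOCK_STREAM)
    addresses = []
    for info in infos:
        address = info[4][0]  # sockaddr is (ip, port) for AF_INET
        if address not in addresses:
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    print(resolve_ipv4("ident.me"))
```

Passing family=socket.AF_INET to socket.getaddrinfo plays a similar role to curl's -4 flag for the lookup. The HTTP request itself would still need to connect to one of the returned addresses.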
