acehoss / rnsh

rnsh is a command-line utility written in Python that facilitates shell sessions over Reticulum networks and aims to provide a similar experience to SSH.

License: MIT License

Python 100.00%

rnsh's People

Contributors

acehoss, erethon, markqvist

rnsh's Issues

Obtrusive error message on LoRa sessions

On LoRa connections, I'm seeing frequent errors like this:

[2023-05-12 10:33:31] [Error]   Decryption failed on link <e1b885eb922404c3882778f70dd1ed47>. The contained exception was: Fernet token HMAC was invalid
[2023-05-12 10:33:31] [Error]   An error ocurred while receiving data on <RNS.Channel.Channel object at 0x103ab1dd0>. The contained exception was: 'NoneType' object is not subscriptable

The data around it seems to be correct--I ran ls /bin over LoRa and that message appeared several times, but it appears that no text was lost. But the error message interrupts the flow of text, and especially in TUI apps really messes up the formatting.

The first error comes from RNS and will probably need an override or option there to suppress that message or reduce its error level.

The second error looks like a channel callback is still called but with a None value rather than data and this is not handled correctly.
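
A guard at the top of the receive callback would avoid the second error. The following is a minimal sketch only: `make_safe_handler` and `on_data` are hypothetical names, not the actual RNS or rnsh callback signature, and it assumes the failed decryption surfaces as a None payload as described above.

```python
from typing import Callable, Optional


def make_safe_handler(write: Callable[[bytes], None]) -> Callable[[Optional[bytes]], bool]:
    """Wrap a sink so that a None payload is ignored instead of raising."""

    def on_data(payload: Optional[bytes]) -> bool:
        # Assumption: a failed decryption is delivered as None rather than bytes.
        # Subscripting None raises "'NoneType' object is not subscriptable".
        if payload is None:
            return False
        write(payload)
        return True

    return on_data


received = []
handler = make_safe_handler(received.append)
handler(b"ls output")
handler(None)  # dropped without raising
```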

Getting an asyncio.exceptions.CancelledError when trying to connect to the BBS

When trying to connect to the BBS mentioned in markqvist/Reticulum#231, I get an asyncio.exceptions.CancelledError. The platform is Debian 11 with Python 3.9.2 and the latest RNS (0.4.9). I'm connected to both public nodes of the testnet.

rnsh 1490b99d47d3bac32270cfa90f771af8 -vvvvvv
[2023-02-19 22:16:59] [Debug]   Connected to locally available Reticulum instance via: LocalInterface[37428]
[2023-02-19 22:16:59] [Verbose] Configuration loaded from /home/dgrig/.reticulum/config
[2023-02-19 22:16:59] [Verbose] Loaded 310 known destination from storage
[2023-02-19 22:16:59] [Verbose] Loaded Transport Identity from storage
[2023-02-19 22:16:59] [Info]    rnsh.rnsh._initiate_link       Requesting path... [MainThread]
[2023-02-19 22:16:59] [Extra]   Valid announce for <1490b99d47d3bac32270cfa90f771af8> 2 hops away, received via <4e1ac99b1c4a4161cd06bfa04b74754c> on LocalInterface[37428]
[2023-02-19 22:16:59] [Debug]   Destination <1490b99d47d3bac32270cfa90f771af8> is now 2 hops away via <4e1ac99b1c4a4161cd06bfa04b74754c> on LocalInterface[37428]
[2023-02-19 22:16:59] [Debug]   rnsh.rnsh._initiate_link       No link [MainThread]
[2023-02-19 22:16:59] [Extra]   Registering link <0cfd19c5683e3a6aab16356fe2050334>
[2023-02-19 22:16:59] [Debug]   Link request <0cfd19c5683e3a6aab16356fe2050334> sent to <rnsh.default.afd460e2af7939a622e4faf5ab13e842/1490b99d47d3bac32270cfa90f771af8>
[2023-02-19 22:16:59] [Info]    rnsh.rnsh._initiate_link       Establishing link... [MainThread]
[2023-02-19 22:17:00] [Debug]   Path request for <9513e2d9f8ea795cbac394fbbd547ecd> on LocalInterface[37428]
[2023-02-19 22:17:00] [Debug]   Ignoring path request for <9513e2d9f8ea795cbac394fbbd547ecd> on LocalInterface[37428], no path known
[2023-02-19 22:17:00] [Debug]   Path request for <8ed1daeee0d96214a32b1ecad27201e5> on LocalInterface[37428]
[2023-02-19 22:17:00] [Debug]   Ignoring path request for <8ed1daeee0d96214a32b1ecad27201e5> on LocalInterface[37428], no path known
[2023-02-19 22:17:09] [Verbose] Link establishment timed out
[2023-02-19 22:17:09] [Debug]   rnsh.retry.RetryThread         stopping timer thread [MainThread]
Traceback (most recent call last):
  File "/home/dgrig/reticulum2/lib/python3.9/site-packages/rnsh/rnsh.py", line 538, in _rnsh_cli_main
    return_code = await _initiate(
  File "/home/dgrig/reticulum2/lib/python3.9/site-packages/rnsh/rnsh.py", line 335, in _initiate
    await _initiate_link(
  File "/home/dgrig/reticulum2/lib/python3.9/site-packages/rnsh/rnsh.py", line 315, in _initiate_link
    if not await _spin(until=lambda: _link.status == RNS.Link.ACTIVE, timeout=timeout):
  File "/home/dgrig/reticulum2/lib/python3.9/site-packages/rnsh/rnsh.py", line 227, in _spin
    raise asyncio.CancelledError()
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dgrig/reticulum2/bin/rnsh", line 8, in <module>
    sys.exit(rnsh_cli())
  File "/home/dgrig/reticulum2/lib/python3.9/site-packages/rnsh/rnsh.py", line 560, in rnsh_cli
    return_code = asyncio.run(_rnsh_cli_main())
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
asyncio.exceptions.CancelledError

I haven't looked at the source code yet, but I'm happy to answer any questions until then if it helps debugging this.
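
The traceback shows the timeout being raised as asyncio.CancelledError and escaping asyncio.run(). One way to avoid that is to have the polling helper report the timeout as a return value and let the caller turn it into an exit code. This is a sketch under that assumption, with hypothetical names (`spin`, `connect`); it is not the code in rnsh.py.

```python
import asyncio
import time
from typing import Callable


async def spin(until: Callable[[], bool], timeout: float, interval: float = 0.01) -> bool:
    """Poll `until`; return True once it holds, or False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if until():
            return True
        await asyncio.sleep(interval)
    return until()


async def connect(link_is_active: Callable[[], bool], timeout: float) -> int:
    # Hypothetical caller: a link timeout becomes a non-zero exit code
    # instead of an exception that propagates to the top level.
    if not await spin(until=link_is_active, timeout=timeout):
        print("Link establishment timed out")
        return 1
    return 0


exit_code = asyncio.run(connect(lambda: False, timeout=0.05))
```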

Sliding window acknowledgements

As link RTT increases, the throughput of the terminal session decreases. This is because currently only one outstanding packet is allowed on the link at a time. The current packet must travel to the recipient, and its proof packet must be received back, before the next packet can be sent.

In the rnsh-bbs listener debug logs, it's clear that this is at least one of the bottlenecks for throughput, if not the primary one. The "pending message found" message in the logs shows where traffic is being held up.

Some amount of back pressure on each end is good--waiting for more data before sending helps increase packet size to improve link efficiency. But if the packets are already maxed out, then it's only increasing latency.
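
To make the idea concrete, here is a minimal sketch of sender-side window bookkeeping. The class and method names are hypothetical, and this is not how rnsh or RNS currently tracks packets; retransmission, loss and sequence-number wraparound are ignored.

```python
from collections import deque


class SlidingWindowSender:
    """Allow up to `window` unacknowledged packets instead of just one."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.next_seq = 0
        self.in_flight = deque()  # sequence numbers still awaiting a proof

    def can_send(self) -> bool:
        return len(self.in_flight) < self.window

    def send(self) -> int:
        if not self.can_send():
            raise RuntimeError("window full; wait for an acknowledgement")
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight.append(seq)
        return seq

    def acknowledge(self, seq: int) -> None:
        # Cumulative acknowledgement: everything up to and including seq is done.
        while self.in_flight and self.in_flight[0] <= seq:
            self.in_flight.popleft()


sender = SlidingWindowSender(window=3)
sent = [sender.send() for _ in range(3)]  # three packets go out back to back
blocked = not sender.can_send()           # a fourth must wait
sender.acknowledge(1)                     # proofs for 0 and 1 free two slots
```

With a window of 1 this degenerates to the current behaviour, where every packet costs a full round trip.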

Lack of non-tty mode breaks some scripts

Without a non-tty mode, the remote command's stderr is redirected to stdout, and this causes issues, particularly with binary streams. This is why the file transfer with tar over a pipe doesn't work: some stray information printed to stderr at the beginning of the transfer mangles the tar stream. A forced non-tty mode is needed, like ssh -T, which causes stderr to be sent as a separate stream.

On the remote end, this would be implemented by opening three pipes with os.pipe(…), calling os.fork(), and then connecting the pipes to the correct file descriptors with os.dup2(...).

The current implementation using pty.fork() still needs to be used for tty mode, otherwise the remote command doesn't think it is connected to a tty and won't provide interactive features.
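
The pipe-based approach described above can be sketched as follows. This is a POSIX-only illustration with a hypothetical helper name (`run_without_tty`), not the rnsh implementation. It reads stdout and stderr one after the other, which is only safe for small outputs; a real listener would multiplex the descriptors in its event loop. os.waitstatus_to_exitcode requires Python 3.9 or later.

```python
import os
import sys


def run_without_tty(argv, stdin_data=b""):
    """Run argv with stdin, stdout and stderr on three separate pipes."""
    in_r, in_w = os.pipe()
    out_r, out_w = os.pipe()
    err_r, err_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: connect the pipe ends to fds 0, 1 and 2, then exec the command.
        try:
            os.dup2(in_r, 0)
            os.dup2(out_w, 1)
            os.dup2(err_w, 2)
            for fd in (in_r, in_w, out_r, out_w, err_r, err_w):
                os.close(fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if setup or exec failed

    # Parent: close the child's ends, feed stdin, then drain both outputs.
    for fd in (in_r, out_w, err_w):
        os.close(fd)
    os.write(in_w, stdin_data)  # assumes stdin_data fits in the pipe buffer
    os.close(in_w)

    def drain(fd):
        chunks = []
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # EOF: the child closed its end
                break
            chunks.append(chunk)
        os.close(fd)
        return b"".join(chunks)

    out = drain(out_r)
    err = drain(err_r)
    _, status = os.waitpid(pid, 0)
    return out, err, os.waitstatus_to_exitcode(status)


child_code = "import sys; sys.stdout.write('data'); sys.stderr.write('noise')"
out, err, code = run_without_tty([sys.executable, "-c", child_code])
```

Because stderr arrives on its own pipe, stray diagnostics no longer end up inside a binary stdout stream.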

Intermittent communications failures, particularly over LoRa

I've seen some intermittent but repeatable failures while testing on LoRa links. Even at higher bandwidths, sessions will spontaneously stop communicating. The exact circumstances are still unclear, but I have a few data points.

  • 911.325 MHz, BW 500 kHz, SF 8, CR 5
  • Connect an rnsh session
  • Run top -s 30 (macOS) or top -d 30 (Linux) to keep data running over the session
  • After some amount of time on the order of 5 minutes, no more refreshes will be sent
  • The listener may show that the process has been terminated and the link has been closed.
  • The initiator will likely show nothing, even with a couple of -vs
  • Typing on the initiator session will cause a packet to be sent. This will eventually retry out.
  • Big frustration: at this point, initiating a new session may fail to connect the link, or may actually start the session and remote process but send no data.
  • Since the listener sends no data, the initiator tty is not yet in raw mode and a Ctrl-C will terminate the session.
  • Restarting the listener restores ability to connect.

The issue could be at any layer: rnsh, Reticulum, or RNode. Often it looks like packets aren't being received on one end or the other.

Dependency compatibility for rns 0.5.1

Hi @acehoss!

I was wondering if you could update the rns dependency for rnsh to rns >= 0.5.0 instead of rns == 0.5.0, as it is now. I just released RNS 0.5.1, and due to the dependency issue, rnsh is giving dependency errors in pip.

Also, when you have time, I sent you a message on matrix :)

Rework protocol handling for better version compatibility

Refactor the protocol handling parts of the application to use an interface that can be implemented by multiple protocol versions. This way, when new features are added to the protocol, older versions can interoperate for a longer time.
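
As a rough illustration of what such an interface could look like, here is a sketch. All names are hypothetical, and the one-byte framing in ProtocolV2 is invented for the example; it is not the rnsh wire format.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Optional, Type


class ProtocolHandler(ABC):
    """Interface that every supported protocol version implements."""

    version: int

    @abstractmethod
    def encode_data(self, payload: bytes) -> bytes: ...

    @abstractmethod
    def decode_data(self, frame: bytes) -> bytes: ...


class ProtocolV1(ProtocolHandler):
    version = 1

    def encode_data(self, payload: bytes) -> bytes:
        return payload

    def decode_data(self, frame: bytes) -> bytes:
        return frame


class ProtocolV2(ProtocolHandler):
    version = 2

    def encode_data(self, payload: bytes) -> bytes:
        return b"\x00" + payload  # invented one-byte message-type tag

    def decode_data(self, frame: bytes) -> bytes:
        return frame[1:]


HANDLERS: Dict[int, Type[ProtocolHandler]] = {
    cls.version: cls for cls in (ProtocolV1, ProtocolV2)
}


def negotiate(peer_versions: Iterable[int]) -> Optional[ProtocolHandler]:
    """Pick the highest version both sides support, or None if there is none."""
    common = set(HANDLERS) & set(peer_versions)
    return HANDLERS[max(common)]() if common else None


old_peer = negotiate([1])
new_peer = negotiate([1, 2, 3])
```

An older peer keeps working through ProtocolV1 while newer peers get ProtocolV2, and the rest of the application only talks to the ProtocolHandler interface.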

Remove service name from aspects

Since the service name is included in the aspects, the initiator is required to provide it along with the destination hash to recreate the destination object. This adds unnecessary complexity on the initiator end, since the aspects change the destination hash value. If the destination hash stayed the same for all services using the same identity file, that would make more sense.

Instead, the service name could be provided only on the listener end and used to determine which identity to use. If an identity for that service doesn't exist yet, it will be created. On the listener, then, either the identity file or the service name should be provided, not both. The initiator would only ever need to supply the destination hash.

The command line would look like this:

    rnsh -l [--config <configfile>] [-i <identityfile> | -s <service_name>] 
         [-v... | -q...] [-b <period>] (-n | -a <identity_hash> [-a <identity_hash>] ...) 
         [-A | -C] [[--] <program> [<arg> …]]
    rnsh [--config <configfile>] [-i <identityfile>] [-v... | -q...] [-N] [-m] 
         [-w <timeout>] <destination_hash> [[--] <program> [<arg> ...]]
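
The "either -i or -s, not both" rule for the listener can be illustrated with the standard library's argparse. This is only a sketch of the constraint, not rnsh's own option parsing, and `build_listener_parser` is a hypothetical name.

```python
import argparse


def build_listener_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="rnsh -l", add_help=False)
    # The identity comes from an explicit file or is derived from a
    # service name; supplying both is rejected by the parser.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-i", dest="identity_file")
    source.add_argument("-s", dest="service_name")
    return parser


args = build_listener_parser().parse_args(["-s", "bbs"])
```

Passing both -i and -s makes argparse print a usage error and exit, which matches the proposed behaviour.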

Startup exception on Android

On Android, this exception is thrown at program startup, but it still works fine afterwards (at least as initiator):

Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/psutil/_common.py", line 399, in wrapper
    return cache[key]
KeyError: (('/proc',), frozenset())

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/psutil/_pslinux.py", line 285, in <module>
    set_scputimes_ntuple("/proc")
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/psutil/_common.py", line 401, in wrapper
    ret = cache[key] = fun(*args, **kwargs)
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/psutil/_pslinux.py", line 268, in set_scputimes_ntuple
    with open_binary('%s/stat' % procfs_path) as f:
  File "/data/data/com.termux/files/usr/lib/python3.10/site-packages/psutil/_common.py", line 728, in open_binary
    return open(fname, "rb", buffering=FILE_READ_BUFFER_SIZE)
PermissionError: [Errno 13] Permission denied: '/proc/stat'

This looks like a common problem, and a possible solution is to redirect stderr to /dev/null temporarily before the import. It's a little hacky, but just might work.

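import os
import sys

# Temporarily send stderr to /dev/null while psutil is imported
sys.stderr = open(os.devnull, "w")
import psutil

# After importing, close the null sink and restore the original stderr
sys.stderr.close()
sys.stderr = sys.__stderr__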

From discussion.

Data corruption

The latest changes in RNS have fixed the issues with channels and listeners becoming unresponsive. I ran top with rnsh overnight over both ethernet and LoRa and the sessions are still running this morning--something that was not possible on the previous version.

However, I am noticing some subtle problems. In the session running over ethernet, I'm noticing text out of place after a minute or so. I simultaneously ran a session over SSH and it had no problems. (Screenshots at the bottom)

It seems like macOS top doesn't clear the terminal or the line, but rather only updates characters when they change by positioning the cursor and printing the updated text. This means if a data message is lost or duplicated, it could result in this garbling (particularly if relative cursor positioning is used). This could be a bug introduced by the recent changes in RNS, but I think I may have seen issues like this before that update.

SSH:
[Screenshot: macos-top-ssh]

rnsh - notice the garbled first few lines compared to the SSH session:
[Screenshot: macos-top-rnsh]

Linux will show docopts but silently exits when real options specified

After installing the dev build from wheel, rnsh just returns to the command line if a valid set of options is specified, even just -p. Adding verbose flags doesn't change anything.

OS: Ubuntu 20.04.4 LTS x86_64 
Host: SEi TBD by 
Kernel: 5.15.0-58-generic 
Uptime: 28 days, 19 hours, 8 mins 
Packages: 2066 (dpkg), 9 (snap) 
Shell: bash 5.0.17 
Resolution: 1280x720 
Terminal: /dev/pts/3 
CPU: Intel i5-8279U (8) @ 4.100GHz 
GPU: Intel Iris Plus Graphics 655 
Memory: 7803MiB / 15841MiB 

"Unhandled exception: Path not found" when connecting through hops

Hi, and first of all, thanks for writing this tool.
I am testing it in different configurations. It works when the peers are directly visible on the same local network, but fails when they are connected via at least one hop.
The listener waits, and the initiator returns:
Unhandled exception: Path not found

In the same configuration, the peers can see each other (tested with nomadnet), and a link can be created successfully using the Reticulum utility rnx.

For this specific case, Reticulum v0.5.5 and rnsh v0.1.1 are used.
Listener: Pi Zero 2 W, Raspbian GNU/Linux 10 (buster) (32-bit)
Initiator: macOS

Any hints to solve the issue would be appreciated.

Interrupt does not work if the channel is saturated

I was testing running i=0; while [ $i -lt 1000000 ]; do i=$((i+1)); printf "%032d\n" $i; done in a shell session over rnsh, and I could not terminate it with Control-C. Over ssh the termination works fine.

Listener: new protocol does not handle no-auth connections correctly

Even with the --no-auth option set on the listener, a session still follows the same state sequence, starting at LSSTATE_WAIT_IDENT after link establishment, rather than advancing directly to LSSTATE_WAIT_VERS. The initiator_identified callback needs a tweak as well then to handle an identification event even if it was not necessary.

The initiator identifies by default, unless the -N option is specified, so unless a user explicitly opts out, the protocol will not break for them.
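
The intended behaviour can be sketched as a small state machine. The state names come from the description above; the class, its constructor argument and the callback signature are hypothetical and simplified.

```python
from enum import Enum, auto


class ListenerState(Enum):
    LSSTATE_WAIT_IDENT = auto()
    LSSTATE_WAIT_VERS = auto()


class ListenerSession:
    def __init__(self, no_auth: bool) -> None:
        # With --no-auth there is no identity to wait for, so the session
        # starts directly in the version-exchange state.
        if no_auth:
            self.state = ListenerState.LSSTATE_WAIT_VERS
        else:
            self.state = ListenerState.LSSTATE_WAIT_IDENT
        self.remote_identity = None

    def initiator_identified(self, identity) -> None:
        # Accept an identification even when it was not required; only
        # advance the state if the session was actually waiting for it.
        self.remote_identity = identity
        if self.state is ListenerState.LSSTATE_WAIT_IDENT:
            self.state = ListenerState.LSSTATE_WAIT_VERS


open_session = ListenerSession(no_auth=True)
open_session.initiator_identified("abc")  # harmless when auth is off

auth_session = ListenerSession(no_auth=False)
state_before = auth_session.state
auth_session.initiator_identified("abc")
```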

Listener stops responding to new links after some time

Without fail, the listener stops responding to new links after an hour or so. I'm not sure if the time is specific or random, but it seems to always happen. This is the same issue that caused #9.

  1. Start/Restart the listener
  2. Connect and it works fine
  3. Disconnect and leave it idle for at least 30 minutes
  4. Can't connect anymore

De-duplicate packets

First I need to switch to retrying with Packet.resend() instead of creating a new packet when a timeout occurs. I assume that RNS will automatically de-duplicate packets that use resend.

Whatever the case, the retry mechanism I'm using currently creates a new packet after the timeout, but this has occasional issues with the original packet eventually arriving and causing spurious data in the stream or breaking the protocol.

And then once the Buffer API is done, all this changes (for the better!)
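
Until then, receiver-side de-duplication could look roughly like this. It is a sketch that assumes every retry carries the same message identifier as the original; `message_id` and `Deduplicator` are hypothetical names, and whether RNS already does this for resent packets is not confirmed here. The set of seen identifiers grows without bound, so a real version would need to expire old entries.

```python
from typing import Optional, Set


class Deduplicator:
    """Deliver each message once, even if a retry and the original both arrive."""

    def __init__(self) -> None:
        self._seen: Set[int] = set()

    def accept(self, message_id: int, payload: bytes) -> Optional[bytes]:
        if message_id in self._seen:
            return None  # late original or repeated retry: drop it
        self._seen.add(message_id)
        return payload


dedup = Deduplicator()
# Message 1 timed out and was retried; the original then arrives late.
arrivals = [(0, b"a"), (1, b"b"), (1, b"b"), (2, b"c")]
stream = [p for p in (dedup.accept(i, data) for i, data in arrivals) if p is not None]
```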
