pendulum-project / ntpd-rs
A full-featured implementation of the Network Time Protocol, including NTS support.
Home Page: https://tweedegolf.nl/en/pendulum
License: Other
When the panic threshold is triggered, the ntp-daemon goes into an unresponsive state but does not properly terminate. It should just terminate abnormally.
c.t is special; in the spec it is a seconds counter. Idea: use Rust's Instant.
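A minimal sketch of that idea, assuming the counter only needs to report whole seconds elapsed since some starting point. The `ClockTime` type and its method names are hypothetical and are not taken from ntp-proto.

```rust
use std::time::Instant;

/// Hypothetical stand-in for the spec's `c.t` seconds counter,
/// backed by a monotonic `Instant` instead of a raw integer.
#[derive(Debug, Clone, Copy)]
pub struct ClockTime {
    started: Instant,
}

impl ClockTime {
    /// Start counting from the moment of creation.
    pub fn new() -> Self {
        ClockTime { started: Instant::now() }
    }

    /// Whole seconds elapsed since creation, as the spec's counter would report.
    pub fn seconds(&self) -> u64 {
        self.started.elapsed().as_secs()
    }
}
```

Because `Instant` is monotonic, the counter is unaffected by steps applied to the system clock, which is probably what a seconds counter inside a clock-steering daemon wants.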
The management client (#138) will need end-user documentation written for it once it is closer to its final shape.
This is a departure from the NTP specification; however, the security gains, especially against DoS attacks, are such that it is worth it.
This is hopefully covered well by the socket mechanism needed for #119
Preferably without dependencies on non-core crates (prefer not to use nix, see also #14). If this requires some unsafe code then that would ideally be put in a separate crate.
Write updated documentation describing the code structure and main design decisions.
somehow manually trigger the poll
Ensure that peers have access to (a copy of) needed system state values.
A number of constants currently used in ntp-proto really should be configurable. Make these configurable by caller.
Mainly for devices on unstable internet connections, e.g. a laptop that switches wifi networks.
There are currently a few places, and in the future there will probably be more, where we perform checks that essentially represent invariants that should always hold, regardless of any input provided from external sources. Failure of such a check therefore directly indicates a bug in our code, and the question becomes what these checks should do in release builds.
Given the specific nature of NTP, especially for an NTP client, I am personally of the opinion that the safer option is to actively blow up when such an error is detected. Assuming we detected it early enough, the system time is hopefully still reasonably correct, and without further corrections it should not drift far in the short term. At the same time, blowing up makes the issue very visible to whoever is managing the server running the client. By contrast, silently ignoring the error or trying to work around it could result in incorrect steering of the clock (since the software is now in a state that was never anticipated). Incorrect steering could fairly quickly result in significant clock deviation from UTC, and it is far less visible to whoever is managing the server, which increases the potential for a faulty situation to last for a significant time.
Is this the view we want to take as a project, or are there arguments to the contrary that I am forgetting about here?
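To make the "blow up in release builds" option concrete, here is a small sketch. The invariant itself (a poll interval exponent staying within bounds) and the function name are hypothetical; the point is only the difference between `assert!`, which stays active in release builds, and `debug_assert!`, which is compiled out unless debug assertions are enabled.

```rust
/// Hypothetical invariant check: the poll interval exponent must stay
/// within the configured bounds no matter what the network sends us.
pub fn check_poll_exponent(exp: i8, min: i8, max: i8) -> i8 {
    // `assert!` is kept in release builds, so a violated invariant
    // panics loudly instead of letting the daemon steer on bad state.
    // `debug_assert!` would be skipped in a default release build.
    assert!(
        min <= exp && exp <= max,
        "invariant violated: poll exponent {} outside [{}, {}]",
        exp,
        min,
        max
    );
    exp
}
```

With this style, choosing between the two behaviours is a matter of choosing between the two macros at each check site.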
Have a data structure and sufficient signaling to the owner of that structure to keep it up to date.
start task, see if behavior of unix socket is right
In NTP/Chrony there is support for pools:
A pool uses multiple DNS query results to the pool address to get additional peers to connect to. A single pool can instantiate multiple peers. This is different from a traditional server directive which only instantiates a single peer connection.
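A rough sketch of the selection step a pool implies, assuming the DNS lookup has already produced a list of addresses. The function name and the `wanted` parameter are hypothetical; the sketch only shows how one query result can yield several new peers while skipping duplicates and addresses that are already in use.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Pick up to `wanted` addresses from one DNS query result, skipping
/// addresses that already back a peer and duplicates within the result.
pub fn select_pool_peers(
    resolved: &[SocketAddr],
    in_use: &HashSet<SocketAddr>,
    wanted: usize,
) -> Vec<SocketAddr> {
    let mut seen = HashSet::new();
    resolved
        .iter()
        .copied()
        // `seen.insert` returns false for an address we already picked.
        .filter(|addr| !in_use.contains(addr) && seen.insert(*addr))
        .take(wanted)
        .collect()
}
```

The `resolved` slice could, for example, be filled from `std::net::ToSocketAddrs::to_socket_addrs` on the pool hostname. Repeating the query later would then top the pool up when peers are dropped.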
Ensure we can reset peer measurement state after clock stepping. This includes canceling or ignoring the result of the current poll if it has already started. The peer should confirm the occurrence of the reset back to where it was initiated from.
NOTE: Polling state (how often we are allowed to poll and such) should be kept intact.
Write additional documentation describing operational procedures and concerns that should be taken into account.
We are currently using chrony to synchronize with AWS clocks. Would this tool be able to replace chrony? If so, how would that work, roughly?
have the client code talk to an RwLock with the relevant data, check that it is updated
we have two unix sockets, by default
/run/ntpd-rs/log-level
/run/ntpd-rs/config
the log level is unprotected, the config needs additional permissions.
we can use https://docs.rs/tokio/latest/tokio/net/struct.UnixStream.html
for sending data over the socket, use https://docs.rs/postcard/latest/postcard/ ? (or send json as bytes?)
client --set-log-level=debug
client --step-if-bigger=1000 --step-first-updates=10
then we also need some observability features, some ideas
client peers list # lists all remotes we are connected to
client peers watch # show for each connected peer its `PeerStatus`
Some failure modes:
I think we may want to try and emit specific exit codes for these well known failure modes, so that they can be distinguished from other errors and panics. We also want to specifically make sure that we emit an error level log message before exiting the program to make sure that such a message pops up in a system that monitors the log messages.
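One possible shape for this, as a sketch only. The two variants are taken from situations mentioned elsewhere in these issues (panic threshold exceeded, empty peer list); the numeric codes are arbitrary placeholders, and `eprintln!` stands in for whatever error-level logging call the daemon actually uses.

```rust
use std::process::ExitCode;

/// Hypothetical set of well-known fatal conditions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FatalError {
    PanicThresholdExceeded,
    NoPeersConfigured,
}

impl FatalError {
    /// Exit code for this failure mode (placeholder values, not
    /// an agreed-upon convention).
    pub fn code(self) -> u8 {
        match self {
            FatalError::PanicThresholdExceeded => 70,
            FatalError::NoPeersConfigured => 71,
        }
    }
}

/// Log at error level first, then hand back the exit code so that
/// `main` can return it.
pub fn exit_with(err: FatalError) -> ExitCode {
    // Stand-in for an error-level log call.
    eprintln!("ERROR: fatal condition: {:?}", err);
    ExitCode::from(err.code())
}
```

Returning an `ExitCode` from `main` (rather than calling `std::process::exit` deep in the code) lets destructors run and keeps the log message and the exit code in one place.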
Add a command line option/config option that enables json based output instead of the current text based format.
We need some way to have software timestamping work with tokio in a proper way. Would prefer to have thin unsafe wrappers for the system calls in separate crates, prefer also to have few dependencies (this shouldn't be too much code, and it is probably better to own it ourselves than have a dependency on something like nix, libc is acceptable though in my view)
Implement proper killing-off of associations with peers that want nothing to do with us.
Current work has created a bit of a mess, we need to
Write a (short) readme explaining what this repo is and what the current state of it is.
The following sequence of events is possible
Which results in the steering code using incorrect state from peer C
Figure out what errors need extra work to recover from, and implement what is necessary for those.
When started with an empty peer list, no indication of an issue is given; instead the daemon stays very silent.
Setup basics for logging and add some logging in relevant places.
In preparation for pools we should have tools to allow dynamic adding and removing of peers (i.e. at runtime).
give clap a vector of strings, and assert the right things are set
One easy way to discover peer addresses would be by using SRV records in DNS. This would make the client a lot easier to use in many cloud-based environments.
reported by jsha
I notice a bug in the panic calculation: offset_too_large compares an offset against the panic threshold without first calling .abs() on it. This means that negative offsets will never be considered too large and will never cause a panic. By contrast, checking the step threshold does call .abs().
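The reported bug can be illustrated with a simplified sketch. The real `offset_too_large` presumably works on the project's own duration types; here both the offset and the threshold are plain `f64` seconds, which is an assumption made purely for illustration.

```rust
/// Buggy variant as described in the report: a large negative offset
/// is never greater than the (positive) threshold.
pub fn offset_too_large_buggy(offset: f64, panic_threshold: f64) -> bool {
    offset > panic_threshold
}

/// Fixed variant: compare the magnitude of the offset, mirroring what
/// the step threshold check is reported to do.
pub fn offset_too_large(offset: f64, panic_threshold: f64) -> bool {
    offset.abs() > panic_threshold
}
```

With a threshold of 1000 seconds, an offset of -2000 seconds slips past the buggy check but is caught by the fixed one.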
Implement the state machine needed for actually doing clock adjustments. (good luck)
This helps check dependency licenses, security vulnerabilities and other stuff. Something like this:
https://github.com/InstantDomain/instant-distance/blob/main/.github/workflows/rust.yml#L79
Implement a proper mechanism for configuring the daemon, e.g. which parameter values are needed by proto and which servers to connect to.
Figure out a good api and implement kernel-level send timestamping (stretch goal)
Often used in cloud based environments, see https://www.consul.io/
Describe which configuration options our ntp client provides and how they can be used.
Also should describe current scope of project.
Helps find problems early.
Due to the logic used to determine peer polling intervals, they can never decrease. As this is unwanted, figure out better logic that does allow the polling interval to decrease.
composition of read/write is identity
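As an illustration of such a round-trip property, here is a sketch on a simplified 64-bit NTP timestamp (32-bit seconds plus 32-bit fraction, big-endian on the wire). The `Timestamp` type and its method names are hypothetical and do not mirror the actual ntp-proto types.

```rust
/// Simplified 64-bit NTP timestamp: seconds and fraction, both 32 bits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Timestamp {
    pub seconds: u32,
    pub fraction: u32,
}

impl Timestamp {
    /// Write the timestamp in network (big-endian) byte order.
    pub fn to_bytes(self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.seconds.to_be_bytes());
        out[4..].copy_from_slice(&self.fraction.to_be_bytes());
        out
    }

    /// Read a timestamp back from network byte order.
    pub fn from_bytes(b: [u8; 8]) -> Self {
        Timestamp {
            seconds: u32::from_be_bytes([b[0], b[1], b[2], b[3]]),
            fraction: u32::from_be_bytes([b[4], b[5], b[6], b[7]]),
        }
    }
}

/// The property under test: reading back what was written is the identity.
pub fn roundtrips(t: Timestamp) -> bool {
    Timestamp::from_bytes(t.to_bytes()) == t
}
```

A test would call `roundtrips` on a range of values, including edge cases such as all zeros and all ones. The same property can be checked in the other direction, starting from bytes.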
This would probably involve setting up capabilities (CAP_SYS_TIME specifically for our case) and setuid mechanics (to switch away to a user with no meaningful permissions on the system).