aletheiawarellc / wink

Fault Tolerance Framework

License: Apache License 2.0

CMake 3.12% Dockerfile 0.56% C++ 96.32%
finite-state-machine hierarchical-state-machine state-machine fault-tolerance asynchronous-communication error-handling fault-isolation lifecycle-monitoring client-server-architecture hot-code-reload

Wink

Wink is a framework for developing Fault Tolerant Systems with Asynchronous Concurrent Independent Hierarchical Finite State Machines.

Fault Tolerance

A system is fault-tolerant if it continues working even if something is wrong.

Fault-tolerance cannot be achieved using a single computer - it may fail.

-- Joe Armstrong

Principles

  1. Isolated Fault Units - split software into multiple small independent modules that run concurrently to achieve a single objective or fail fast.
  2. Supervision Tree - modules can monitor the lifecycles of submodules.
  3. Message Passing - modules communicate asynchronously and do not share state.
  4. Live Upgrade - modules can be updated without restarting the whole system.
  5. Persistent Records - logs are written to storage that can survive reboot.

State Machine

A State Machine;

  • Runs Independently and Concurrently for Fault Isolation.
  • Has a Lifecycle that is monitored by the Spawner for Error Detection and Resolution.
  • Communicates Asynchronously to minimize Latency and maximize Throughput.
  • Can be Updated and Restarted without affecting other State Machines.
  • Writes Logs to either stdout or the filesystem for Debuggability.
  • Has a Hierarchy of States to minimize Code Duplication.
  • Is Uniquely Identified by its Binary Name and Network Address;
    • binary name consists of;
      • package and executable: family/Parent or family/Child
      • optional semantic versioning: family/Parent0.0.1
      • optional tag: family/Child:Alice or family/Child:Bob
      • compound example: family/Child1.2.3:Alice
    • address can either be local, or remote;
      • local: :<port> or localhost:<port> or 127.0.0.1:<port>
      • remote: <ip>:<port>
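
The binary name format above can be sketched as a small parser. This is only an illustration, not Wink's own code: `BinaryName` and `ParseBinaryName` are hypothetical names, and the sketch assumes the version begins at the first digit after the last `/`, so executable names containing digits would need a different rule.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical holder for the parts of a binary name; not part of Wink's API.
struct BinaryName {
  std::string package;     // e.g. "family"
  std::string executable;  // e.g. "Child"
  std::string version;     // e.g. "1.2.3", empty if absent
  std::string tag;         // e.g. "Alice", empty if absent
};

// Splits "package/executable[version][:tag]" into its parts.
// Assumption: the version starts at the first digit after the last '/'.
BinaryName ParseBinaryName(const std::string &name) {
  BinaryName result;
  std::string rest = name;

  // The optional tag follows the first ':'.
  const std::size_t colon = rest.find(':');
  if (colon != std::string::npos) {
    result.tag = rest.substr(colon + 1);
    rest.erase(colon);
  }

  // The package is everything before the last '/'.
  const std::size_t slash = rest.rfind('/');
  if (slash != std::string::npos) {
    result.package = rest.substr(0, slash);
    rest.erase(0, slash + 1);
  }

  // The optional version starts at the first digit of what remains.
  std::size_t digit = 0;
  while (digit < rest.size() &&
         !std::isdigit(static_cast<unsigned char>(rest[digit]))) {
    ++digit;
  }
  result.executable = rest.substr(0, digit);
  result.version = rest.substr(digit);
  return result;
}
```

With this sketch, `ParseBinaryName("family/Child1.2.3:Alice")` yields the package `family`, the executable `Child`, the version `1.2.3`, and the tag `Alice`.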

States

Each State consists of a unique Name, an optional Parent, an optional Entry Action, an optional Exit Action, and a set of Receivers.

Actions

An action is triggered when a state is entered or exited.

Receivers

A receiver is triggered upon receipt of a matching message.

If no receiver matches, the optional empty receiver is triggered if it exists; otherwise the unhandled message is passed to the parent state. If no parent exists, or no state in the hierarchy handles the message, an error is raised.
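
The lookup order described above can be sketched in isolation. The types below (`SimpleReceiver`, `SimpleState`) and the `Dispatch` function are simplified, hypothetical stand-ins rather than Wink's real classes, and the sketch assumes a message is identified by a single name with no arguments.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical, simplified stand-ins for illustration; not Wink's real types.
using SimpleReceiver = std::function<void(const std::string &message)>;

struct SimpleState {
  std::string parent;  // empty if the state has no parent
  std::map<std::string, SimpleReceiver> receivers;
};

// Walks from `state` up through its ancestors until a receiver handles
// `message`. Returns the name of the state that handled it.
std::string Dispatch(const std::map<std::string, SimpleState> &states,
                     std::string state, const std::string &message) {
  while (!state.empty()) {
    const auto s = states.find(state);
    if (s == states.end()) break;  // unknown state: treat as unhandled
    const auto &receivers = s->second.receivers;
    // 1. Prefer a receiver whose name matches the message.
    auto r = receivers.find(message);
    // 2. Otherwise fall back to the optional empty ("") receiver.
    if (r == receivers.end()) r = receivers.find("");
    if (r != receivers.end()) {
      r->second(message);
      return state;
    }
    // 3. Otherwise defer to the parent state.
    state = s->second.parent;
  }
  // 4. Nothing in the hierarchy handled the message.
  throw std::runtime_error("unhandled message: " + message);
}
```

Given a state `on` whose parent `off` has an `on` receiver, dispatching `on` from state `on` is handled by `off`, while a message that no state handles results in an exception.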

Example

#include <iostream>
#include <map>
#include <memory>
#include <string>

#include <Wink/address.h>
#include <Wink/log.h>
#include <Wink/machine.h>
#include <Wink/state.h>

int main(int argc, char **argv) {
  if (argc < 3) {
    error() << "Incorrect parameters, expected <address> <spawner>\n"
            << std::flush;
    return -1;
  }

  std::string name(argv[0]);
  UDPSocket socket;
  Address address(argv[1]);
  Address spawner(argv[2]);
  Machine m(name, socket, address, spawner);

  m.AddState(std::make_unique<State>(
      // State Name
      "off",
      // Parent State
      "",
      // On Entry Action
      []() { info() << "Switch is OFF\n"
                    << std::flush; },
      // On Exit Action
      []() {},
      // Receivers
      std::map<const std::string, Receiver>{
          {"on", [&](const Address &sender,
                     std::istream &args) { m.GotoState("on"); }},
          {"off", [&](const Address &sender,
                      std::istream &args) { m.GotoState("off"); }},
      }));

  m.AddState(std::make_unique<State>(
      // State Name
      "on",
      // Parent State
      "off",
      // On Entry Action
      []() { info() << "Switch is ON\n"
                    << std::flush; },
      // On Exit Action
      []() {},
      // Receivers
      std::map<const std::string, Receiver>{}));

  m.Start();
}

Lifecycle

When a State Machine spawns another, the parent receives lifecycle messages from the child.

In the success case, the parent will receive;

  • started - the child indicates it has started and provides the parent with the name of its binary and the address (ip:port) it has bound to.
  • pulsed - the child indicates it is still alive by sending a heartbeat message every 2 seconds.
  • exited - the child indicates it has terminated.

In the error case, the parent will receive;

  • started - same as above.
  • pulsed - same as above.
  • errored - the child indicates it has encountered an error by sending the error message to the parent.
  • exited - same as above.

If a parent doesn't receive a heartbeat for 10 seconds it will assume the child has failed (maybe the computer crashed, lost power, or the network disconnected - who knows?!).
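
The timeout rule above can be sketched as a small watchdog. `HeartbeatMonitor` is a hypothetical helper, not part of Wink; it takes the current time as a parameter so the logic can be exercised without waiting, and it assumes the child counts as failed once a full 10 seconds have elapsed since the last heartbeat.

```cpp
#include <chrono>

// Hypothetical helper for illustration; not part of Wink's API.
// Tracks the most recent heartbeat from a child and reports whether the
// child should be presumed failed.
class HeartbeatMonitor {
 public:
  using Clock = std::chrono::steady_clock;

  // Defaults to the 10 second timeout described above.
  explicit HeartbeatMonitor(
      Clock::time_point started,
      std::chrono::seconds timeout = std::chrono::seconds(10))
      : last_pulse_(started), timeout_(timeout) {}

  // Call whenever a `pulsed` message arrives from the child.
  void Pulse(Clock::time_point now) { last_pulse_ = now; }

  // True once at least `timeout` has passed since the last heartbeat.
  bool HasFailed(Clock::time_point now) const {
    return now - last_pulse_ >= timeout_;
  }

 private:
  Clock::time_point last_pulse_;
  std::chrono::seconds timeout_;
};
```

With heartbeats arriving every 2 seconds, a 10 second timeout tolerates several lost messages in a row before the child is declared failed.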

When a parent is notified that a child has errored, it can choose to do nothing, restart the child, or raise an error. In the last situation, the grandparent will be notified that the parent has errored.
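
The three choices, and the way a raised error travels up the tree, can be modelled as follows. The `OnChildError` enum, the `Supervisor` struct, and `PropagateError` are hypothetical names used only for this sketch; Wink does not define them.

```cpp
#include <string>
#include <vector>

// Hypothetical policy names for illustration; Wink does not define this enum.
enum class OnChildError { Ignore, Restart, Escalate };

struct Supervisor {
  std::string name;
  OnChildError policy;
};

// `chain` lists supervisors from the failing child's parent up to the root.
// Returns a log of actions taken, one entry per supervisor that was notified.
std::vector<std::string> PropagateError(const std::vector<Supervisor> &chain) {
  std::vector<std::string> actions;
  for (const Supervisor &s : chain) {
    switch (s.policy) {
      case OnChildError::Ignore:
        actions.push_back(s.name + ": ignore");
        return actions;  // the error stops here
      case OnChildError::Restart:
        actions.push_back(s.name + ": restart child");
        return actions;  // the error stops here
      case OnChildError::Escalate:
        actions.push_back(s.name + ": raise error");
        break;  // the next supervisor up the tree is notified
    }
  }
  return actions;
}
```

In terms of the calls used in this README's examples, restarting would presumably correspond to `m.Spawn` and raising to `m.Error`, though that mapping is an assumption.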

Example

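// Parent: spawns family/Child and logs the lifecycle messages it receives.
#include <iostream>
#include <map>
#include <memory>
#include <sstream>
#include <string>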

#include <Wink/address.h>
#include <Wink/log.h>
#include <Wink/machine.h>
#include <Wink/state.h>

int main(int argc, char **argv) {
  if (argc < 3) {
    error() << "Incorrect parameters, expected <address> <spawner>\n"
            << std::flush;
    return -1;
  }

  std::string name(argv[0]);
  UDPSocket socket;
  Address address(argv[1]);
  Address spawner(argv[2]);
  Machine m(name, socket, address, spawner);

  m.AddState(std::make_unique<State>(
      // State Name
      "main",
      // Parent State
      "",
      // On Entry Action
      [&]() {
        info() << "Parent: OnEntry\n" << std::flush;
        m.Spawn("family/Child");
      },
      // On Exit Action
      []() { info() << "Parent: OnExit\n"
                    << std::flush; },
      // Receivers
      std::map<const std::string, Receiver>{
          {"started",
           [&](const Address &sender, std::istream &args) {
             std::string child;
             args >> child;
             info() << "Parent: " << sender << ' ' << child << " has started\n"
                    << std::flush;
           }},
          {"pulsed",
           [&](const Address &sender, std::istream &args) {
             std::string child;
             args >> child;
             info() << "Parent: " << sender << ' ' << child << " has pulsed\n"
                    << std::flush;
           }},
          {"errored",
           [&](const Address &sender, std::istream &args) {
             std::string child;
             args >> child;
             std::ostringstream os;
             os << args.rdbuf();
             info() << "Parent: " << sender << ' ' << child
                    << " has errored: " << os.str() << '\n'
                    << std::flush;
           }},
          {"exited",
           [&](const Address &sender, std::istream &args) {
             std::string child;
             args >> child;
             info() << "Parent: " << sender << ' ' << child << " has exited\n"
                    << std::flush;
             m.GotoState("main"); // Retry
           }},
      }));

  m.Start();
}

// Child: raises an error as soon as it enters its `main` state.
#include <iostream>
#include <map>
#include <memory>
#include <string>

#include <Wink/address.h>
#include <Wink/log.h>
#include <Wink/machine.h>
#include <Wink/state.h>

int main(int argc, char **argv) {
  if (argc < 3) {
    error() << "Incorrect parameters, expected <address> <spawner>\n"
            << std::flush;
    return -1;
  }

  std::string name(argv[0]);
  UDPSocket socket;
  Address address(argv[1]);
  Address spawner(argv[2]);
  Machine m(name, socket, address, spawner);

  m.AddState(std::make_unique<State>(
      // State Name
      "main",
      // Parent State
      "",
      // On Entry Action
      [&]() { m.Error("AHHHHH"); },
      // On Exit Action
      []() {},
      // Receivers
      std::map<const std::string, Receiver>{}));

  m.Start();
}

Directory: build/samples/                                                       # Directory of Binaries
Address: 127.0.0.1:42000                                                        # Server Listening Address
< 127.0.0.1:50498 start family/Parent :0                                        # Server Receives Client Request
Forked: 41972                                                                   # Process Created
127.0.0.1:56950 family/Parent started                                           # Parent Starts
127.0.0.1:56950 family/Parent > 127.0.0.1:50498 started family/Parent           # Parent Sends Lifecycle Event to Client
127.0.0.1:56950 family/Parent > 127.0.0.1:42000 register family/Parent 41972    # Parent Registers with Server
Parent: OnEntry                                                                 # Parent Enters `main` State
127.0.0.1:56950 family/Parent > 127.0.0.1:42000 start family/Child :0           # Parent Issues Spawn Request
< 127.0.0.1:56950 register family/Parent 41972                                  # Server Receives Parent Registration
< 127.0.0.1:56950 start family/Child :0                                         # Server Receives Parent Spawn Request
Forked: 41973                                                                   # Process Created
127.0.0.1:64701 family/Child started                                            # Child Starts
127.0.0.1:64701 family/Child > 127.0.0.1:56950 started family/Child             # Child Sends Lifecycle Event to Parent
127.0.0.1:64701 family/Child > 127.0.0.1:42000 register family/Child 41973      # Child Registers with Server
127.0.0.1:64701 family/Child errored: AHHHHH                                    # Child Triggers an Error
127.0.0.1:56950 family/Parent < 127.0.0.1:64701 started family/Child            # Parent Receives Child Lifecycle Event
127.0.0.1:64701 family/Child > 127.0.0.1:56950 errored family/Child AHHHHH      # Child Sends Lifecycle Event to Parent
127.0.0.1:64701 family/Child exited                                             # Child Exits
127.0.0.1:64701 family/Child > 127.0.0.1:56950 exited family/Child              # Child Sends Lifecycle Event to Parent
< 127.0.0.1:64701 register family/Child 41973                                   # Server Receives Child Registration
Parent: 127.0.0.1:64701  has started                                            # Parent Logs Child Lifecycle Event
127.0.0.1:64701 family/Child > 127.0.0.1:42000 unregister                       # Child Unregisters with Server
127.0.0.1:56950 family/Parent < 127.0.0.1:64701 errored family/Child AHHHHH     # Parent Receives Child Lifecycle Event
Parent: 127.0.0.1:64701 family/Child has errored:  AHHHHH                       # Parent Logs Child Lifecycle Event
127.0.0.1:56950 family/Parent < 127.0.0.1:64701 exited family/Child             # Parent Receives Child Lifecycle Event
< 127.0.0.1:64701 unregister                                                    # Server Receives Child Unregistration
Parent: 127.0.0.1:64701 family/Child has exited                                 # Parent Logs Child Lifecycle Event
Parent: OnExit                                                                  # Parent Exits `main` State

Repository Layout

  • include: header files
  • samples: code samples
  • src: source code files
  • test: test code files

Build

cmake -S . -B build
cmake --build build

Test

(cd build/test/src && ctest)

Docker

docker build . -t wink:latest

Usage

Client

./build/src/Wink

Start

Starts a new machine from a binary.

./build/src/Wink start [options] <binary>
./build/src/Wink start [options] <binary> :<port>
./build/src/Wink start [options] <binary> <ip>:<port>

Stop

Stops an existing machine.

./build/src/Wink stop [options] :<port>
./build/src/Wink stop [options] <ip>:<port>

Send

Sends a message to a machine.

./build/src/Wink send [options] :<port> <message>
./build/src/Wink send [options] <ip>:<port> <message>

List

Lists existing machines running on a server.

./build/src/Wink list [options]
./build/src/Wink list [options] :<port>
./build/src/Wink list [options] <ip>:<port>

Help

./build/src/Wink help
./build/src/Wink help <command>

Server

Starts the WinkServer serving from the given directory.

./build/src/WinkServer serve <directory>

Help

./build/src/WinkServer help
./build/src/WinkServer help <command>

Inspired by @winksaville

wink's Issues

Reconsider using bin-name+network-address as the Unique ID

Identifiers are very important and I think there will be a need to link IDs into a hierarchy for versioning, a "VersionedId". Also, as I see it, an SM instance will "advertise" a set of Protocols it supports, where a Protocol is a set of Messages. So each SM, Protocol and Message should have a VersionedId.

As for SMs, I don't see them having a static location nor will a single binary have a single instance so I don't think bin-name+network-address is an adequate ID. Also, networks will likely not be used for SMs communicating within a process which, IMHO, will be the most common case for how SMs communicate.

I'd like to suggest considering something like a UUID or Web3 ID as the basis for identification. I have no firm idea how to create a VersionedId, but I'm sure it's possible, maybe using Web3.

Add Macros for Testability

The addition of some ASSERT_ macros would help make the tests more readable;

  • assert counter (#45):
    • assert onEntry counter
    • assert onExit counter
    • assert receiver(message) counter
  • assert state transition
  • assert message was sent
  • assert child was spawned

Add Counters to Actions and Receivers

Adding counters inside State Machines that are incremented on each invocation of the OnEntry and OnExit actions and of message receivers would increase testability by enabling developers to easily assert that code was triggered.
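
One way such counters might look is a wrapper around the action callbacks. `InvocationCounter` is a hypothetical class written for this sketch, not something Wink provides, and it only covers argument-free actions such as OnEntry and OnExit.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical wrapper for illustration; not part of Wink's API.
// Counts how many times each named action has been invoked.
class InvocationCounter {
 public:
  // Wraps `action` so that every call increments the counter for `name`.
  // The counter must outlive the returned function.
  std::function<void()> Wrap(const std::string &name,
                             std::function<void()> action) {
    return [this, name, action = std::move(action)]() {
      ++counts_[name];
      if (action) action();
    };
  }

  // Number of invocations recorded for `name` (0 if never invoked).
  int Count(const std::string &name) const {
    const auto it = counts_.find(name);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, int> counts_;
};
```

A test could pass `counter.Wrap("off.OnEntry", ...)` as a state's entry action and later assert on `counter.Count("off.OnEntry")`.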

Pass DOWNLOAD_EXTRACT_TIMESTAMP option or set the CMP0135 policy

I forked and cloned the repo and while building the first time got the following message:

wink@3900x 22-12-18T18:27:25.387Z:~/prgs/AletheiaWareLLC/forks/aw-wink (master)
$ cmake -S . -B build
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Warning (dev) at /usr/share/cmake/Modules/FetchContent.cmake:1279 (message):
  The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
  not set.  The policy's OLD behavior will be used.  When using a URL
  download, the timestamps of extracted files should preferably be that of
  the time of extraction, otherwise code that depends on the extracted
  contents might not be rebuilt if the URL changes.  The OLD behavior
  preserves the timestamps from the archive instead, but this is usually not
  what you want.  Update your project to the NEW behavior or specify the
  DOWNLOAD_EXTRACT_TIMESTAMP option with a value of true to avoid this
  robustness issue.
Call Stack (most recent call first):
  CMakeLists.txt:21 (FetchContent_Declare)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found Python: /usr/bin/python3.10 (found version "3.10.8") found components: Interpreter 
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Configuring done
-- Generating done
-- Build files have been written to: /home/wink/prgs/AletheiaWareLLC/forks/aw-wink/build

Optimize Local Message Passing

Messages sent to self, or to other State Machines on the same host, could avoid going through the networking stack and therefore gain a performance improvement.

Split out from #11

Transition After Action

Currently, State transitions occur immediately when GotoState is called, which leads to some footguns.

Instead, GotoState should record the next state, and the transition should be made only after the current action (OnEntry, OnExit, or Receiver) has returned.

There is still an open question around multiple invocations of GotoState within the same action, with two possible answers;

  • The first invocation wins, and subsequent ones are ignored.
  • The last invocation wins, so subsequent calls can override the next state.

Initial discussion: #43 (comment)
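
The proposal above, including both candidate answers to the open question, can be sketched as follows. `DeferredTransition` and its `Policy` enum are hypothetical and do not reflect how Wink currently implements `GotoState`.

```cpp
#include <string>

// Hypothetical sketch of deferred transitions; not Wink's implementation.
class DeferredTransition {
 public:
  enum class Policy { FirstWins, LastWins };

  explicit DeferredTransition(Policy policy) : policy_(policy) {}

  // Called from inside an action: records the target instead of
  // transitioning immediately.
  void GotoState(const std::string &next) {
    if (pending_ && policy_ == Policy::FirstWins) return;  // keep first request
    next_ = next;
    pending_ = true;
  }

  // Called after the action returns: applies the recorded transition, if any.
  // Returns true if a transition was made; `current` is updated in place.
  bool Commit(std::string &current) {
    if (!pending_) return false;
    current = next_;
    pending_ = false;
    return true;
  }

 private:
  Policy policy_;
  std::string next_;
  bool pending_ = false;
};
```

If an action calls `GotoState("on")` and then `GotoState("broken")`, `Commit` moves to `on` under `FirstWins` and to `broken` under `LastWins`; in both cases the current state is unchanged until the action has returned.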

Hierarchical Transitions

State Transitions should move through the hierarchy: leaving the old state exits each parent state going up the tree until the common ancestor (if any), and then enters each parent state going down the tree and into the new state.

For example, given a State Machine:

         parent
         /    \
        /      \
      leaf1    child1
               /   \
              /     \
            leaf2  leaf3

  • Transitioning from leaf2 to leaf3 should result in the execution of leaf2.OnExit and leaf3.OnEntry
  • Transitioning from leaf1 to leaf3 should result in the execution of leaf1.OnExit, child1.OnEntry, and leaf3.OnEntry
  • Transitioning from leaf3 to leaf1 should result in the execution of leaf3.OnExit, child1.OnExit, and leaf1.OnEntry
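
The three cases above follow from a common-ancestor walk, sketched below. `PathToRoot` and `TransitionActions` are hypothetical helpers, not Wink functions. Two behaviors the issue does not spell out are assumptions here: a self-transition exits and re-enters the state, and moving to an ancestor only runs exit actions.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Returns the path from `state` up to the root, e.g. leaf2, child1, parent.
// `parents` maps each state to its parent; root states are absent or map to "".
std::vector<std::string> PathToRoot(
    const std::map<std::string, std::string> &parents, std::string state) {
  std::vector<std::string> path;
  while (!state.empty()) {
    path.push_back(state);
    const auto it = parents.find(state);
    state = (it == parents.end()) ? std::string() : it->second;
  }
  return path;
}

// Lists the actions to run when moving from `from` to `to`: exits going up
// to (but excluding) the common ancestor, then entries going down.
std::vector<std::string> TransitionActions(
    const std::map<std::string, std::string> &parents, const std::string &from,
    const std::string &to) {
  // Assumption: a self-transition exits and re-enters the same state.
  if (from == to) return {from + ".OnExit", to + ".OnEntry"};

  std::vector<std::string> up = PathToRoot(parents, from);
  std::vector<std::string> down = PathToRoot(parents, to);
  // Drop shared ancestors from the root end of both paths.
  while (!up.empty() && !down.empty() && up.back() == down.back()) {
    up.pop_back();
    down.pop_back();
  }
  std::vector<std::string> actions;
  for (const std::string &s : up) actions.push_back(s + ".OnExit");
  std::reverse(down.begin(), down.end());
  for (const std::string &s : down) actions.push_back(s + ".OnEntry");
  return actions;
}
```

For the tree above, `parents` would map leaf1 and child1 to parent, and leaf2 and leaf3 to child1; the function then reproduces the three action sequences listed.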
