buildersoftio / andyx

Buildersoft Andy X Project

License: Apache License 2.0

C# 99.83% Dockerfile 0.17%

andy streaming distributed-systems event-sourcing event-driven data-pipeline reactive-architecture

andyx's Introduction

What is Andy X?

Andy X is an open-source distributed streaming platform designed for high-performance data pipelines, streaming analytics, streaming between microservices, and data integration.

Get Started

Follow the Getting Started instructions to learn how to run Andy X.

For local development and testing, you can run Andy X within a Docker container. For more information, click here.

How to Engage, Contribute, and Give Feedback

Some of the best ways to contribute are to try things out, file issues, join design conversations, and make pull requests.

Reporting security issues and bugs

Security issues and bugs should be reported privately, via email, [email protected]. You should receive a response within 24 hours.

Related projects

These are some other repos for related projects:

Andy X Portal - Dashboard for Andy X Node
Andy X Cli - Manage all resources of Andy X

Deploying Andy X with docker-compose

Andy X can be easily deployed in a Docker container using docker-compose. For more information, click here.

Code of conduct

This project has adopted the code of conduct defined by the Contributor Covenant to clarify expected behavior in our community.

For more information, see the .NET Foundation Code of Conduct.

Support

Let's do it together! You can support us by clicking on the link below!

andyx's Issues

Change default environment variable names

As a developer, when I run Andy X on Docker, I would like to configure it with ANDYX-ENVIRONMENT.

Expose endpoints for Andy CLI, Dashboard and 3rd party

As a developer, I want to use REST endpoints to read Tenants, Products, Components, Topics, Consumers, Producers and Storages.
Also, as a developer, I want to be able to create new tenants and manage the settings of Tenants, Components and Topics.

Implement RocksDB as Topic Log

As a user, I want to store topic logs in RocksDB rather than in SQLite.

Consumers with same name can not consume messages on different topics

As a developer, when I consume different events from different topics and use the same name for these consumers, consumption fails, as shown in the image below.

If the products, components or topics are different, consumption should be allowed, because the consumers are attached to different topics.
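
One way to read the expected behaviour is that a consumer is identified by its full path rather than by its name alone. The sketch below is in Python for brevity and is not taken from the Andy X code base; `ConsumerRegistry` and its method are hypothetical names used only for illustration.

```python
class ConsumerRegistry:
    """Hypothetical registry: a consumer is unique per full topic path, not per name."""

    def __init__(self):
        self._consumers = set()

    def register(self, tenant, product, component, topic, name):
        """Return True if the consumer was registered, False if it already exists."""
        key = (tenant, product, component, topic, name)
        if key in self._consumers:
            return False
        self._consumers.add(key)
        return True


registry = ConsumerRegistry()
# The same consumer name on two different topics is accepted.
first = registry.register("t1", "p1", "c1", "orders", "reader")
second = registry.register("t1", "p1", "c1", "payments", "reader")
# The same name on the same topic is rejected.
duplicate = registry.register("t1", "p1", "c1", "orders", "reader")
```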

Create SyncService Consumers and Producers connected between Nodes in a Cluster

As a user I want to have the same configuration between Nodes in a cluster.
As a user I would like to see on all Nodes if a consumer or producer is connected to the cluster.

Implementation of Topic Rest Endpoints

Implement GET, PUT and POST Endpoints for Topic

Add InitialPosition Support for Subscriptions

As a user of Andy X, I would like to start consuming either from the beginning or from the latest message that arrives in Andy X.

Add BasicAuthorization to Rest Endpoints

Synchronize Tokens between Nodes in Andy X Cluster

As a developer, I want to synchronize tokens between connected Nodes.

Enable tenants.json configuration file

As a DevOps engineer, I would like to control the tenants configuration from a JSON config file.

Implement temporary message storage

As a user, I want all mutations in the node to be stored in .bin files, and these files to be processed later by the Storage Synchronizer.

Implement Metrics

Implement metrics across the Andy X Cluster for Tenants, Products, Components, Topics, Consumers and Producers.

Implement Message Consumption

Implement the message consumption logic for when Consumers connect to the node.

Create the acknowledgement_log db to store the history of message acknowledgements, and the cursor_log db to store the current online position for each consumer.

The pointer_log should store the unacked messages, but this depends on the Subscription mode.
The modes are Durable and NonDurable.
If the mode is Durable and a message is not acked, the Andy X Node will not send the next message.
If the mode is NonDurable, unacked messages will be stored and the node will re-deliver them later; the cursor position will move ahead to currentMessage+1.

Subscription Types for the Consumer will continue to be Exclusive, Failover and Shared.

Shared Subscriptions will be NonDurable by default.

Message acknowledgements will have three statuses:
Acked, UnAcked and Skipped.

If a message is skipped, the current position will move ahead. If a message is unacked, the current position will stay in the same place and the Node will re-deliver the same message, but this depends on the Consumer Mode.

The position_log db will persist the current position of the cursor, with one record for each connection.

position_id	cursor	mark_delete_position	read_position	is_connected	pending_read_op	entries_since_last_unacked
10001	consumer_name	ledger_id:-1	{ledger_id}:0	true	0	55
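
The interaction between acknowledgement status and subscription mode can be sketched as a small state machine. This is an illustrative Python sketch under the rules stated in this issue, not the Andy X implementation; the `Cursor` class and its fields are hypothetical.

```python
from enum import Enum


class Ack(Enum):
    ACKED = "Acked"
    UNACKED = "UnAcked"
    SKIPPED = "Skipped"


class Mode(Enum):
    DURABLE = "Durable"
    NON_DURABLE = "NonDurable"


class Cursor:
    """Hypothetical cursor for one consumer connection."""

    def __init__(self, mode):
        self.mode = mode
        self.position = 0
        self.unacked = []  # positions kept for later redelivery (NonDurable only)

    def on_ack(self, status):
        """Apply the acknowledgement for the message at the current position."""
        if status in (Ack.ACKED, Ack.SKIPPED):
            # Acked and Skipped both move the cursor ahead.
            self.position += 1
        elif self.mode is Mode.NON_DURABLE:
            # NonDurable: remember the unacked message and move on.
            self.unacked.append(self.position)
            self.position += 1
        # Durable + UnAcked: the position stays, so the same message is sent again.
        return self.position
```

With a Durable cursor an UnAcked status leaves the position unchanged, while a NonDurable cursor records the position for redelivery and advances.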

Add Weight on Storage

Implement StorageCalculatorService

As a user of Andy X, I want the Storages connected to one or more nodes as shards to have approximately the same storage size.

A service will check the size of each storage and send that size to the node. The node will keep sending messages to those with the smallest size.
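
A minimal sketch of this balancing rule, in Python for illustration only: the node always picks the target with the smallest reported size. The function names, and the assumption that sizes are reported in bytes, are hypothetical.

```python
def pick_storage(reported_sizes):
    """Return the name of the storage with the smallest reported size."""
    if not reported_sizes:
        raise ValueError("no storages connected")
    return min(reported_sizes, key=reported_sizes.get)


def distribute(message_sizes, reported_sizes):
    """Assign each message to the currently smallest storage and track the sizes."""
    sizes = dict(reported_sizes)  # work on a copy, leave the caller's dict untouched
    placement = []
    for size in message_sizes:
        target = pick_storage(sizes)
        sizes[target] += size
        placement.append(target)
    return placement, sizes
```

In a real deployment the sizes would be refreshed by the reporting service rather than tracked locally, so the local bookkeeping here is a simplification.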

Authentication and Authorization

Enable token validation for Producer/Consumer on Connection.
Enable token validation for Storage on Connection.

Expose Number of messages produced and consumed at consumers and producers

Add SentDate to Message

Implementation of Azure Storage for Topic Data Offload

//TODO.. Add Content

Implementation of Product Rest Endpoints

Implement GET, PUT and POST Endpoints for Product, Product Settings, Product Tokens and Product Retention

Enable in-memory message storage

Implement in-memory message storage.
Create In-memory message repository for topic

Create refresh token for expired tokens on AuthorizationMiddleware

Each token should come with an extra refresh token, used to generate a new token when the current one has expired.
This should be implemented in AuthorizationMiddleware.

Implementation of Retention Policy TTL

This issue is about implementing the background service for Retention, as we would like to have retention policies based on tenant, product and component.

The retention policy with the highest priority is the one nearest to the Topic.
Level 0 Component Retention Policies;
Level 1 Product Retention Policies;
Level 2 Tenant Retention Policies;

In this issue, the logic should be implemented at all three levels for HARD_TTL and SOFT_TTL.

SOFT_TTL: a message that has not been consumed by a subscription will not be deleted, even if its TTL has expired.
HARD_TTL: the message will be deleted even if it has not been consumed by a subscription; such messages will be skipped by the subscription.
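
The priority order and the two TTL kinds can be sketched as follows. This is an illustrative Python sketch, not the Andy X background service; the function names and the assumption that a message expires once its age reaches the TTL are hypothetical.

```python
def effective_policy(component=None, product=None, tenant=None):
    """Return the policy nearest to the topic: component, then product, then tenant."""
    for policy in (component, product, tenant):
        if policy is not None:
            return policy
    return None


def should_delete(kind, age_seconds, ttl_seconds, consumed):
    """Decide whether a message is deleted under SOFT_TTL or HARD_TTL."""
    if age_seconds < ttl_seconds:
        return False  # TTL not reached yet
    if kind == "HARD_TTL":
        return True  # deleted even if no subscription has consumed it
    if kind == "SOFT_TTL":
        return consumed  # unconsumed messages survive an expired TTL
    raise ValueError(f"unknown retention kind: {kind}")
```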

Add support for Bulk Producing

As a developer using Andy X, I want to produce a list of messages to the Andy X Node.

Implementation of Retention Policy: Acknowledged Messages

As a user, I would like to implement Retention Policies for a Component.

Using Andy X CLI, I would like to add a Retention Policy based on whether all subscriptions in a topic have acknowledged the messages.

TODO: Add content

Implementation of Aws S3 for Topic Data Offload

//TODO: Add Content

Implementation of Data Offload to Tiered Storage

As a user of Andy X, I would like to offload the data already acked from subscriptions to S3-like blob storage

TODO: Add content

Add Storage Configuration

Add Storage Metadata Configuration

Implement Message Lineage

Implement diagrams for Message Lineage across Producers and Consumers.

Redesign logging for Andy X Node

Add andyx cli into Andy X Node Docker Image

As a user of Andy X, I would like to be able to execute CLI commands against the Andy X node from inside the container.

Implementation of Cluster Rest Endpoints

As a user, I want to use Cluster Endpoints to read the cluster configuration from Andy X Portal and Andy X Cli.
Also, in this issue we will develop an in-memory cluster manager and temp directories for async communication.

Implement Retention Period for the messages

As a DevOps engineer, I would like to configure the retention period for each Component. I also want to be able to add and change it from andyx-cli.

Implementation of Andy X Replicated Async Cluster

As a User of Andy X I would like to deploy more than one instance of Andy X and connect them as a Cluster.

In Andy X there will be three different kinds of node links within an Andy X cluster.

Distributed Sync & Async Nodes,
Replicated Nodes
Production Cluster (connection between distributed and replicas)

Replicated Async Cluster

The nodes in a Replicated Async Cluster are configured following the Master/Slave logic; in Andy X terminology, these are Main and Worker Nodes.

As described in the diagram above, when a Producer is connected to Node 1, which is the MAIN NODE, messages are accepted and stored in that node and then mirrored asynchronously to the other WORKER NODES. If the consumer is connected to Node 3, as in the diagram, it will consume the messages already stored in Node 3.

Producers connected in different Nodes in the same topic

When, as shown in the diagram above, Producer_2 produces message 4 into Node 2 (a worker node), the message is stored in temp storage for the main node. Processing and storing are done by the Main Node; as soon as the message has been processed by Node 1 (the main node), it is replicated to the other worker nodes.

If the Main Node goes down, one of the worker nodes will act as the Main Node.

The switch between Main and Worker Nodes can be made via Andy X CLI and Andy X Portal, by promoting nodes.

    andyx cluster "default_01" promote "node_2"

The cluster is configured in cluster_config.json. Example:

{
	"Name": "default_01",
	"Shards": [
		{
			"replicas":[
				{
					"NodeId": "01"
				},
				{
					"NodeId": "02"
				}
			]
		},
		{
			"replicas":[
				{
					"NodeId": "03"
				},
				{
					"NodeId": "04"
				}
			]
		},
		{
			"replicas":[
				{
					"NodeId": "05"
				},
				{
					"NodeId": "06"
				}
			]
		}
	]
}
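
As a rough illustration of how such a file could be read, the Python sketch below lists the node ids of each shard. It relies only on the keys shown in the example above (`Name`, `Shards`, `replicas`, `NodeId`); the function name is hypothetical and this is not the loader used by Andy X.

```python
import json


def nodes_by_shard(config_text):
    """Return one list of NodeId values per shard, in file order."""
    config = json.loads(config_text)
    return [
        [replica["NodeId"] for replica in shard["replicas"]]
        for shard in config["Shards"]
    ]


example = """
{
  "Name": "default_01",
  "Shards": [
    {"replicas": [{"NodeId": "01"}, {"NodeId": "02"}]},
    {"replicas": [{"NodeId": "03"}, {"NodeId": "04"}]}
  ]
}
"""
shards = nodes_by_shard(example)
```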

Implementation of Andy X Distributed Async Cluster

As a User of Andy X I would like to deploy more than one instance of Andy X and connect them as a Cluster.

In Andy X there will be three different kinds of node links within an Andy X cluster.

Distributed Sync & Async Nodes,
Replicated Nodes
Production Cluster (connection between distributed and replicas)

Distributed Async Cluster

As described in the diagram above, a Producer is connected to Node 1 and a Consumer is connected to Node 3. In the async distributed cluster, if the Producer produces three messages, the first message is stored in Node 1 Storage, and the second is stored temporarily in Node 1 as a message dedicated to Node 2; the same happens with message 3. The messages are then sent asynchronously from temp storage to the specific nodes allocated by Node 1. As soon as a message has been accepted by its node, it is deleted from temp storage, as described in the diagram below.

The same applies when one of the nodes is down. The difference is that the messages dedicated to that node stay stored in the node to which the producer is connected; as soon as the node is working again, synchronization between the nodes in the cluster takes place. Consumers will not work while one of the nodes is inactive.
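
The temp-storage hand-off described above can be sketched in a few lines of Python. This is only an illustration: the `AsyncNode` class is hypothetical, and the round-robin allocation of messages to nodes is an assumption inferred from the three-message example, not a documented Andy X rule.

```python
from collections import defaultdict


class AsyncNode:
    """Hypothetical node in a distributed async cluster."""

    def __init__(self, node_id, cluster):
        self.node_id = node_id
        self.cluster = cluster           # ordered list of all node ids
        self.storage = []                # messages owned by this node
        self.temp = defaultdict(list)    # messages waiting for other nodes
        self._next = 0

    def produce(self, message):
        """Store locally or buffer for the allocated node (round-robin assumed)."""
        target = self.cluster[self._next % len(self.cluster)]
        self._next += 1
        if target == self.node_id:
            self.storage.append(message)
        else:
            self.temp[target].append(message)
        return target

    def flush(self, peers):
        """Send buffered messages to reachable peers; keep the rest in temp storage."""
        for target in list(self.temp):
            peer = peers.get(target)
            if peer is None:
                continue  # node is down, messages stay buffered
            peer.storage.extend(self.temp.pop(target))
```

Calling `flush` with only the reachable peers leaves messages for a down node in temp storage until that node is passed in again.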

Add docker-compose example

Add docker-compose examples for Andy X

Add PriorityQueues of Cortex to Node Consumers

Create an in-memory PriorityQueue for each consumer at node level.

Store Messages as Batch in Storage

The Node should send messages to the storage in bulk.

Implement Initial Cluster Infrastructure

This issue is being used to implement the core abstraction of clustering logic for Andy X Cluster.

Implementing

NodeConfiguration for NodeId, to help storing the messages into topics with id {node_id}:{entryId}
Rename TenantMemoryRepository to TenantMemoryService
Create Topic Temp Directories for Clustering (create temporary rocksdb)

Add cluster_config.json support

As Devops Engineer I want to configure the cluster of nodes and storages in my system.

Enable 'andyx-cli' inside Andy X Node to create manage running node or cluster

Create a console application that is part of the andyx solution. This console application should work only with arguments.

Also add this console app to the environment variables, so that it can be run as andyx-cli -parameters

Implementation of Subscription Rest Endpoints

Implementation of Andy X Distributed Sync Cluster

As a User of Andy X I would like to deploy more than one instance of Andy X and connect them as a Cluster.

In Andy X there will be three different kinds of node links within an Andy X cluster.

Distributed Sync & Async Nodes,
Replicated Nodes
Production Cluster (connection between distributed and replicas)

Distributed Sync Cluster

As described in the diagram above, if a producer is connected to Node 1, and Node 1 is connected to Node 2 and Node 3 in a Distributed Cluster, the data is synchronized as follows. If three messages are accepted by Node 1, Node 1 stores the first message and delegates it to the other nodes as already stored. When the second message arrives at Node 1, Node 1 delegates it to Node 2 to store, and also sends it to Node 3 as already stored.

In memory, nodes will use a PriorityQueue to store messages from different Nodes, as described in the diagram below.
The messages will be stored indexed by the Node in the cluster and by the entry of the message in that Node: {node_id}:{entry_id}.
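
A minimal sketch of such a queue, in Python for illustration. The issue does not say how the queue is ordered, so the ordering here (lower entry id first, node id as tie-breaker) is an assumption, and `NodeQueue` is a hypothetical name.

```python
import heapq


def parse_message_id(message_id):
    """Split '{node_id}:{entry_id}' into (node_id, entry_id as int)."""
    node_id, entry_id = message_id.split(":")
    return node_id, int(entry_id)


class NodeQueue:
    """Hypothetical in-memory priority queue for messages from several nodes."""

    def __init__(self):
        self._heap = []

    def push(self, message_id, payload):
        node_id, entry_id = parse_message_id(message_id)
        # Assumed ordering: entry id first, node id breaks ties.
        heapq.heappush(self._heap, (entry_id, node_id, payload))

    def pop(self):
        entry_id, node_id, payload = heapq.heappop(self._heap)
        return f"{node_id}:{entry_id}", payload
```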

In a distributed cluster, topics are also known as distributed topics.

If one of the nodes in a Distributed Cluster is down, message production continues to work properly, but message consumption will go down.

Message count in topic

As a user, I would like to see the number of messages currently in the topic.

Implementation of Component Rest Endpoints

Implement GET, PUT and POST Endpoints for Component, Component Settings, Component Tokens and Component Retention

Expose RevokeTenant/Component Token Endpoints

As a developer, I want to use REST endpoints to revoke a token.

Tenant configuration is not being updated

tenants_config.json is not being updated when Products, Components and Topics are created/updated

Add Header to Records/Message

As a developer, I want to send KeyValuePairs as a Header for each Record/Message.

InitialPosition.Latest is not working with unacked message

As a user, I see that unacked messages are not being stored for Consumers with InitialPosition.Latest.

Implement Storage Synchronizer

Storage Synchronizer is a standalone process that will consume the .bin files written by the Producer Node and store them into topics.

Storage will be organized into Ledgers; each ledger will hold around 5000 messages.

The payload of the message will be stored as binary.

Discussion: should Ledgers be created every hour, or based on the number of messages written into the Ledger?

For each Ledger, Storage Synchronizer will store the status in ledger_logs and take snapshots.
Should ledger_logs be stored as a .json configuration file, or should we use SQLite to store the configuration?
- The recommendation is to use a SQLite db.

ledger_id	ledger_location	entries	createdDate	status	size
10001	root/data/topic/msg_10001_date.andx	5000	2022-05-23	Closed	100MB
10002	root/data/topic/msg_10002_date.andx	6500	2022-05-23	Closed	100MB
10003	root/data/topic/msg_10003_date.andx	75	2022-05-23	Opened	100MB
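
The count-based option from the discussion above can be sketched as follows. This Python sketch is illustrative only: `LedgerWriter` is a hypothetical name, it keeps the ledger log in memory instead of SQLite, and it does not write the binary payload anywhere.

```python
class LedgerWriter:
    """Hypothetical writer that rolls over to a new ledger after max_entries messages."""

    def __init__(self, max_entries=5000, first_id=10001):
        self.max_entries = max_entries
        self.ledgers = [{"ledger_id": first_id, "entries": 0, "status": "Opened"}]

    def append(self, payload: bytes):
        """Record one message and return the id of the ledger that received it."""
        current = self.ledgers[-1]
        if current["entries"] >= self.max_entries:
            # Close the full ledger and open the next one.
            current["status"] = "Closed"
            current = {
                "ledger_id": current["ledger_id"] + 1,
                "entries": 0,
                "status": "Opened",
            }
            self.ledgers.append(current)
        current["entries"] += 1  # the payload itself is not stored in this sketch
        return current["ledger_id"]
```

A time-based rollover, the other option under discussion, would replace the entry-count check with a comparison against the ledger's creation time.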

On Consumer Disconnect, logging for subscription type is wrong for Shared and Failover

When Consumers with subscription type Shared or Failover are connected and consuming and then disconnect, the storage logs report the subscription type as 'exclusive'.

The Node should send the subscription_type when a consumer is disconnected.
