Distrise

TBD

Modules

  • Nostr-Core: Implements the Nostr protocol's Client and Relay event messages.
  • Nostr-Relay: Works as a Relay; it handles client events and subscriptions.
  • Nostr-Gateway: Works as a Relay event aggregator; it subscribes to Relays, handles events through RabbitMQ, and stores the data in a Cockroach database.

Required

  • JDK17

Questions

Phase1

// Connect to a relay, sign a text note with a private key, and send it.
final RelayClient client = RelayClient.builder().url(URL).build();
final TextContent content = TextContent.builder().content("hello world").build();
// Replace "PRIVATE KEY" with the hex-encoded private key.
final SecKey key = new SecKey(ByteString.decodeHex("PRIVATE KEY"));
final Event event = content.sign(key);
client.send(event);

Examples

See the example folder.

Phase2

Why did I choose this database?

I would use CockroachDB as the database for Nostr because it runs as a cluster (three replicas forming a quorum), which gives higher availability than a traditional single-node database.

When a replica fails, one of the remaining replicas acquires the range lease (with write capability) to keep the data consistent. This CP behavior suits a community platform like Nostr.
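The availability argument comes down to simple majority arithmetic. The sketch below is illustrative only: the class and method names are hypothetical, and it models nothing of CockroachDB beyond the "a majority of replicas must survive" rule.

```java
/** Majority arithmetic for a replicated range (illustrative, not CockroachDB code). */
final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of replicas that forms a majority. */
    static int quorumSize(int replicas) {
        return replicas / 2 + 1;
    }

    /** Number of replicas that can be lost while a majority still remains. */
    static int tolerableFailures(int replicas) {
        return replicas - quorumSize(replicas);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 5}) {
            System.out.printf("replicas=%d quorum=%d tolerates=%d%n",
                    n, quorumSize(n), tolerableFailures(n));
        }
    }
}
```

With three replicas the quorum is two, so one replica can fail without losing write availability; five replicas tolerate two failures.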

If the number of events to be stored will be huge, what would you do to scale the database?

The Event table is the one most likely to grow quickly, and queries against it may need filtering (WHERE conditionA AND ...).

I'm not sure about CockroachDB's partitioning and sharding capabilities, but I would first consider sharding the data by time slice, for example one table per month of events or one table per year. I'm not certain about this, since the design may also have to take the Relay's limitations into account.
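A minimal sketch of the time-slicing idea, assuming one table per calendar month in UTC. The class name and the `events_YYYY_MM` naming scheme are hypothetical; the only thing taken from Nostr is that an event's `created_at` is a Unix timestamp in seconds.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

/** Routes an event to a monthly table based on its created_at timestamp (hypothetical scheme). */
final class EventTableRouter {
    private EventTableRouter() {}

    /** Returns a table name such as "events_2023_02" for the given Unix time in seconds. */
    static String tableFor(long createdAtEpochSeconds) {
        OffsetDateTime utc = Instant.ofEpochSecond(createdAtEpochSeconds).atOffset(ZoneOffset.UTC);
        return String.format("events_%04d_%02d", utc.getYear(), utc.getMonthValue());
    }
}
```

A query that spans several months would have to read from several tables, which is one of the trade-offs of this approach.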

Phase3

Please provide a short writeup of why you chose a particular database for Phase 3, answering the following questions:

Why did you choose this database? Is it the same or different database as the one you used in Phase 2? Why is it the same or a different one?

I would still choose CockroachDB, as in Phase2. In terms of responsibility the Nostr Gateway is still a kind of Relay, but it needs to provide a special way to read Event data according to certain conditions, which may come from operations on a web screen (unlike a Relay's reply messages). Therefore, special attention needs to be paid to querying and indexing the data.

A Range looks like CockroachDB's sharding unit: tables and indexes are split into ranges according to the range size setting. These ranges can also be assigned to a specific Region or Zone to achieve better load balancing.

If the number of events to be stored will be huge, what would you do to scale the database?

The Gateway mostly writes and reads; it does little updating or locking of data. Reads do not need to be real-time, but they may be queries based on whatever information the client wants. Therefore, we should avoid full table scans as much as possible when querying.

  • If we store Event information, the table will be large.
  • Since CockroachDB is a distributed database, it can handle more load through horizontal scaling, simply by adding nodes.
  • Following CockroachDB's recommendation, use UUIDs as IDs to avoid hot spots.
  • Index design: I haven't thought it through yet, but users will probably want to query by tags, filters, or subscription ID, so those may be the focus of index design.
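One way to reason about index design is to write down the query a client filter would turn into. The sketch below builds a parameterized SQL statement from a few Nostr filter fields (`authors`, `kinds`, `since`, `until`, `limit`). The table name `events` and the columns `pubkey`, `kind`, and `created_at` are assumptions for illustration, not the project's actual schema.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Builds a parameterized query from a subset of Nostr filter fields (hypothetical schema). */
final class EventQuery {
    final String sql;
    final List<Object> params;

    private EventQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    static EventQuery from(List<String> authors, List<Integer> kinds,
                           Long since, Long until, int limit) {
        List<String> predicates = new ArrayList<>();
        List<Object> params = new ArrayList<>();
        if (!authors.isEmpty()) {
            predicates.add("pubkey IN (" + placeholders(authors.size()) + ")");
            params.addAll(authors);
        }
        if (!kinds.isEmpty()) {
            predicates.add("kind IN (" + placeholders(kinds.size()) + ")");
            params.addAll(kinds);
        }
        if (since != null) {
            predicates.add("created_at >= ?");
            params.add(since);
        }
        if (until != null) {
            predicates.add("created_at <= ?");
            params.add(until);
        }
        StringBuilder sql = new StringBuilder(
                "SELECT id, pubkey, kind, created_at, content FROM events");
        if (!predicates.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", predicates));
        }
        // Newest events first, capped by the filter's limit.
        sql.append(" ORDER BY created_at DESC LIMIT ?");
        params.add(limit);
        return new EventQuery(sql.toString(), params);
    }

    /** Returns "?, ?, ?" with n placeholders. */
    private static String placeholders(int n) {
        return String.join(", ", Collections.nCopies(n, "?"));
    }
}
```

Under these assumptions, every predicate touches `pubkey`, `kind`, or `created_at`, so a secondary index covering those columns would be a natural first candidate for avoiding full table scans. Tag filters would need their own design.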

Phase4

Please provide a short writeup of why you chose a particular Queue or Event Stream system for Phase 4, answering the following questions:

Why did you choose this solution?

I chose RabbitMQ and implemented a Direct Exchange and a Consumer. In the code I designed a RESTful API (RelayController) that lets the Event aggregator subscribe to Relays dynamically.

1 Relay -> 1 Direct Exchange -> 1 Consumer

When the API is called, the Exchange and Consumer are bound dynamically. This design is simpler, easier to maintain, and preserves message ordering, as Kafka does. The drawback is that when a Relay sends a large number of Events, RabbitMQ may fail, so some monitoring mechanism is probably needed.
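The one-exchange-per-Relay bookkeeping can be sketched without a broker. The class below is a hypothetical in-memory model, not the project's RelayController: it derives an exchange name and a queue name from the Relay URL and makes repeated subscriptions idempotent. The naming scheme is an assumption.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory model of "1 Relay -> 1 Direct Exchange -> 1 Consumer" (hypothetical names). */
final class RelayBindingRegistry {

    /** Names of the exchange and queue dedicated to a single Relay. */
    static final class Binding {
        final String exchange;
        final String queue;

        Binding(String exchange, String queue) {
            this.exchange = exchange;
            this.queue = queue;
        }
    }

    private final Map<String, Binding> bindings = new ConcurrentHashMap<>();

    /** Returns the binding for a Relay, creating it on the first call only. */
    Binding subscribe(String relayUrl) {
        return bindings.computeIfAbsent(relayUrl, url -> {
            // Sanitize the URL into a name; distinct URLs could still collide after sanitizing.
            String suffix = url.replaceAll("[^A-Za-z0-9]", "_");
            // A real implementation would declare the exchange and queue, bind them,
            // and start the consumer on the broker at this point.
            return new Binding("relay.exchange." + suffix, "relay.queue." + suffix);
        });
    }

    int size() {
        return bindings.size();
    }
}
```

Because each Relay has its own queue and a single consumer, events from one Relay are processed in the order they were queued, which is the ordering property the design relies on.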

If the number of events to be stored will be huge, what would you do to scale the database?

I don't think writing data to CockroachDB is much of a bottleneck, as there are no complicated update or delete transactions.

I think reading data is the biggest problem. I have tried several Nostr sites myself, and almost all of them are very slow at loading information.

It depends on where the actual bottleneck is; it may be table sharding or an index problem. I would consider these approaches:

  1. Use a UUID primary key to avoid hot spots.
  2. Limit the size of index entries to no more than 1 MB (see schema-design-indexes).
  3. Use AS OF SYSTEM TIME for time-interval queries to reduce transaction conflicts.
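As a rough sketch of the third point, the helper below builds a time-interval query that reads slightly stale data with CockroachDB's AS OF SYSTEM TIME clause. The table and column names are assumptions, and the two `?` placeholders stand for the interval bounds that a JDBC PreparedStatement would supply.

```java
/** Builds a stale-read query over a time interval (hypothetical schema). */
final class StaleReadQuery {
    private StaleReadQuery() {}

    /** Reads events as they were stalenessSeconds ago, which avoids contending with current writes. */
    static String eventsBetween(int stalenessSeconds) {
        if (stalenessSeconds <= 0) {
            throw new IllegalArgumentException("stalenessSeconds must be positive");
        }
        return "SELECT id, pubkey, kind, created_at, content FROM events"
                + " AS OF SYSTEM TIME '-" + stalenessSeconds + "s'"
                + " WHERE created_at >= ? AND created_at < ?"
                + " ORDER BY created_at DESC";
    }
}
```

This only fits reads that can tolerate being a few seconds behind, which matches the Gateway's non-real-time read pattern.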

Phase5

My goal was to package the project with Jib and deploy it to GKE; however, some problems came up that I haven't had time to deal with.

Why did you choose these 5 metrics?

  1. GKE Pod system metrics (CPU, memory, etc.), to know when to scale up.
  2. RabbitMQ Management shows Queue metrics (message count, consumer rate, delivery rate, etc.). I haven't handled dead letters yet, which can lead to an infinite loop.
  3. The CockroachDB console shows TTL, SQL, Runtime, and Statement metrics, among others, which help locate the bottleneck when reading Relay events.

What kind of errors or issues do you expect them to help you monitor?

Initially, I would like to see the number of messages per Relay: when they arrive, how many on average, and the average or median message size. Is this possible with the 1 Relay, 1 Exchange, 1 Consumer design I'm using in RabbitMQ? For a Relay with a particularly large number of Events, I think we need some processing to avoid overloading the single RabbitMQ instance, so we need some alerting and logging.
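A minimal sketch of those per-Relay statistics, assuming each consumer reports the size of every message it receives. The class and method names are hypothetical. The sketch keeps every size in memory to make the median easy to compute, which a real metrics pipeline would not do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Collects message sizes per Relay and reports count, average, and median (illustrative). */
final class RelayMessageStats {
    private final Map<String, List<Integer>> sizesByRelay = new HashMap<>();

    /** Records one received message of the given size in bytes. */
    void record(String relayUrl, int messageSizeBytes) {
        sizesByRelay.computeIfAbsent(relayUrl, k -> new ArrayList<>()).add(messageSizeBytes);
    }

    int count(String relayUrl) {
        return sizes(relayUrl).size();
    }

    double averageSize(String relayUrl) {
        return sizes(relayUrl).stream().mapToInt(Integer::intValue).average().orElse(0.0);
    }

    double medianSize(String relayUrl) {
        List<Integer> sorted = new ArrayList<>(sizes(relayUrl));
        if (sorted.isEmpty()) {
            return 0.0;
        }
        Collections.sort(sorted);
        int mid = sorted.size() / 2;
        // Odd count: the middle element; even count: the mean of the two middle elements.
        return sorted.size() % 2 == 1
                ? sorted.get(mid)
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0;
    }

    private List<Integer> sizes(String relayUrl) {
        return sizesByRelay.getOrDefault(relayUrl, Collections.emptyList());
    }
}
```

Since each Relay has its own consumer in this design, the consumer is a natural place to call `record`, and an alert could fire when a Relay's count grows unusually fast.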

If you had more time, what other tools would you use or metrics would you instrument and why?

These metrics are scattered across different systems, so I need a platform like Grafana to consolidate them and provide more insight.

Demo

CockroachDB

RabbitMQ Management

distrise's People

Contributors

jerry80409
