The solution contains 4 main components:
- DB getter - a service that queries the databases, combines the results, and publishes them to a message queue.
- Message queue (NATS) - decouples the DB getter and the worker, allowing multiple workers & getters to run. NATS also ensures that each published message is processed by only one worker.
- Worker - where the business logic lives. It listens for new messages from the queue and processes them one at a time. It can be easily scaled horizontally.
- Key-value store (Redis) - used to store the last checked BlobStorageID.
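The getter/queue/worker split can be sketched in plain Python. This is an illustrative stand-in only: an in-memory `queue.Queue` plays the role of NATS, and each `get()` hands a message to exactly one worker, mirroring NATS queue-group semantics described above. None of these names come from the project's actual code.

```python
import queue
import threading

# In-memory stand-in for NATS; illustrative only.
msg_queue: "queue.Queue" = queue.Queue()
processed = []
lock = threading.Lock()

def worker(worker_id: int) -> None:
    # Each .get() delivers a message to exactly one worker,
    # the same guarantee a NATS queue group gives.
    while True:
        msg = msg_queue.get()
        if msg is None:              # sentinel: shut this worker down
            msg_queue.task_done()
            break
        with lock:
            processed.append((worker_id, msg["blob_id"]))
        msg_queue.task_done()

# The "getter" publishes BlobRef-like messages.
for blob_id in range(5):
    msg_queue.put({"blob_id": blob_id})

# Two horizontally "scaled" workers compete for the same queue.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
msg_queue.join()
for _ in threads:
    msg_queue.put(None)              # one sentinel per worker
for t in threads:
    t.join()

# Every message is processed exactly once, regardless of which worker got it.
assert sorted(b for _, b in processed) == [0, 1, 2, 3, 4]
```

Adding more workers is just adding more threads here; in the real system it means starting more worker processes against the same NATS queue group.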
The SQL queries may not be very efficient, as I'm far from an expert in writing them; this is the biggest issue in this solution.
The solution is pretty resilient, as the getters dump the last checked index to Redis, so a restarted getter resumes where the previous one stopped.
You can scale both the getter & the worker horizontally. The only limitation is how well the DB tolerates such a heavy query load.
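The resume-from-Redis behaviour can be sketched as follows. This is a hedged illustration, not the project's code: a plain dict stands in for Redis, `fetch_rows_after` stands in for the real DB query, and the key name is an assumption.

```python
# Illustrative sketch: a dict stands in for Redis, and fetch_rows_after
# stands in for the real DB query; all names here are assumptions.
DB_ROWS = [{"BlobStorageID": i} for i in range(1, 11)]

kv_store = {}  # stand-in for Redis
LAST_KEY = "last_checked_blob_storage_id"

def fetch_rows_after(last_id: int, limit: int = 3):
    # Only rows newer than the last checked BlobStorageID are fetched.
    return [r for r in DB_ROWS if r["BlobStorageID"] > last_id][:limit]

def getter_pass(publish) -> None:
    last_id = kv_store.get(LAST_KEY, 0)
    for row in fetch_rows_after(last_id):
        publish(row)
        # Persist progress after each row, so a crashed or restarted
        # getter resumes exactly here instead of re-publishing everything.
        kv_store[LAST_KEY] = row["BlobStorageID"]

published = []
getter_pass(published.append)   # first pass publishes rows 1-3
getter_pass(published.append)   # a "restarted" getter continues with 4-6
assert [r["BlobStorageID"] for r in published] == [1, 2, 3, 4, 5, 6]
```

Because progress is stored externally in Redis rather than in process memory, any getter instance can pick up where another left off, which is what makes running several getters safe.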
You'll need docker-compose installed.

- Run `make install_deps` to install the required Python libraries.
- Run `make setup_env` to start the docker-compose services (NATS, both databases & Redis).
- Run `make a_mess` to introduce inconsistencies in the DB.
- Run `make look_for_incosistent` to start a single worker that waits for new messages in the queue and processes them. It prints inconsistent entries to stdout, so the output can easily be dumped to a file.
- Run `make fetch_from_db` to start the service that runs the DB queries and publishes BlobRefDTO objects to the message queue.

You can run the unit tests for the business logic via `make tests`.