Comments (24)
In theory yes, but there is some more into this, for example the TCP sockets. I don't know if there is any abstraction for stuff like that. I only ever saw this in relation to the Tokio or async-std API.
You can also see that in most other database related libs. Most of them explicitly reference Tokio and/or async-std.
I like the way sqlx solved that problem:
https://github.com/launchbadge/sqlx/blob/master/sqlx-rt/src/lib.rs
from aerospike-client-rust.
I started to work on the async version...
Porting the current client is a huge pain and causes a lot of overhead. I spent >25 hours, just trying to fix all the problems caused by references and lifetimes and I'm far from getting this to work.
Async rust really does not like the way variables are stored in the client. There are two ways to solve the problem. First one would be by moving the variables instead of referencing, the second one would be smartpointers.
I will go for the first one since this is the way cleaner and more usable way.
This will cause a lot of refactoring in the lower levels of the client and change the API a lot.
I'm going to do this changes anyway, so now the question is should i do that here or should i fork out?
The changes will partly abandon the low memory profile idea of the client. I will have an eye on good memory usage, but this wont be the priority anymore. The memory usage will probably increase, but i don't see the benefit compared to all the trade-offs in adding a lot of boilerplate to keep it as low as possible. Overall performance is probably the better target (the smartpointer way would probably affect this).
In addition, the sync client will also not be a fully independent one, it will basically just wrap the async one and block. This is a common strategy in rust. It will still require a runtime like Tokio, but that is no problem since there is one in place in most of the projects. I underestimated how many basic functions need to be changed to make them run async. There is nearly no function besides msgpack and buffer writing that works without changes.
I managed to get cluster and net more or less running on async, but as soon as i start with commands, it becomes ugly and hacky.
Short overview of what i will do:
- Crate aerospike-rt -> Runtime abstraction
- Crate aerospike-core -> Async library
- Crate aerospike-sync -> Sync wrapper for the core lib
- Crate aerospike -> The exported API, depending on feature flags
Flags:
- rt-tokio (default on)
- rt-async-std (will follow later)
- rt-actix (will follow later)
- serialization (default on)
Major changes:
- Abandon low memory profile focus
- Timeouts will be removed because they are not supported by async sockets and also not needed in an async env
- Operations and expressions need a lot of refactoring for async and will switch to moving variables instead of referencing
- The client API will have a lot of changes to make it more usable (focus on batch reading and operations)
- The client will probably have buffer caches in a later stage like the golang one has to improve performance when operations are executed multiple times
Since it is closely related to the async runtime, ill try to also implement encryption while building the updated core.
Sorry for skipping the discussion and bringing in facts, but i need progress on this. My last PR is marked ready for review for over 4 months now. If you are ok with this changes, i will do them here with a PR. If you have any comments on this, please share!
from aerospike-client-rust.
Sorry for responding so late. I postponed chipping in to study the rust ecosystem a bit, but it seems that it's been months and hasn't materialized!
- I believe we should have an async implementation, and then provide a thin compatibility layer for the old sync API on top of it, while deprecating it. I see no point to having sync data API in this day and age.
- Here is where I'm not sure. I'd rather have a design with pluggable async backends, but I don't know how viable that is. Any ideas?
from aerospike-client-rust.
I support deprecating the sync API. That makes sense.
I'm also not sure about point 2. Making the client async is already a lot of work, supporting 2 different async platforms means feature flagging the code or splitting it into different crates. Thats a lot of maintaining overhead.
There is probably not much benefit from building this fully modular since async-std and Tokio are the only two really spread runtimes.
I would be fine with both ways, but i would only take care of the base and the Tokio part.
When Tokio is done, we probably got most use-cases covered.
Having the sync and async code base mixed up will end in a mess, so how would you like to separate that?
I would think of either forking to build it clean and maintain both repos for some time or moving one of the APIs to a sub package. "aerospike::sync::Client or aerospike::async::Client" for example.
What do you think about that?
from aerospike-client-rust.
shouldn't the async client use the abstractions in futures
so it's runtime agnostic?
from aerospike-client-rust.
The sqlx wrapper for Tokio/async-std reminds me of the small abstraction layer we have in the Aerospike C client to support libuv, libev, and libevent for async operations. If it's not clear whether either Tokio or async-std will be the dominant API for async Rust, then it might be worthwhile to support both.
from aerospike-client-rust.
The way rust async design and ecosystem has evolved leads me to believe there may not even be just one dominant async library in the future, so it is important for us to design something in the spirit of the original philosophy. At the first glance and on the surface, sqlx implementation seems like a good direction to me as well.
from aerospike-client-rust.
I think we probably can take a lot of stuff from the sqlx abstraction and actix-rt. That means the abstraction will not be a big part of the work.
But the client will need a lot of restructuring.
I would go for this way:
Project structure:
- Crate aerospike-core-> Everything thats shared between the codebases, like the encoding functions.
- Crate aerospike-rt -> The internal async abstraction lib
- Crate aerospike-async -> The async codebase
- Crate aerospike-sync -> The sync codebase
- Crate aerospike -> Reexporter for the 2 above, depending on the feature flags
The async/sync crate could also be a single one, but i would split that for better overview and structure.
Essentially they will be the same, just that the sync one will be deprecated and not use the rt lib.
Feature flags for actix-rt, tokio and async-std. actix-rt is just another wrapper lib for async-std and Tokio, but its common to only have this one and no explicit Tokio runtime. So most projects also support that one. Its nearly no overhead since the imports are nearly identical to Tokio.
from aerospike-client-rust.
I've been working to support the new server v5.6 features into the client, and as part of that - since it would be a breaking change anyway - I was going to remove a few lifetimes. I thought I'd be done in couple of weeks, so it would make sense for you to wait for my changes to get in. In the mean time, maybe we should talk about the changes first so we are on the same page. My changes are focused on removing many of the &str
in the library and turning them into String. This will not have much, if any impact on the memory usage.
I did not know you had a PR open, I thought you just had a branch. If it passes the tests, we can get it merged ASAP.
from aerospike-client-rust.
Thanks for the response! Can you give a more exact list on what you are implementing?
While clicking through the base of the Java client, i found many missing parts.
My refactoring is currently mainly in the net and cluster crates. Im far from getting any kind of command to work.
Async rust requires all structs to be Send + Sync + Sized. Now we could do this like its done for the Client, but that unsafe block is a bad practice. The only way to resolve this is removing everything that is not thread safe.
from aerospike-client-rust.
I'm trying to implement the partition scans so that the tests pass. The old scan protocol is deprecated, so the client can not run any scan/queries. I also have to implement the new auth protocol.
I have nothing against changing the client to get it working with async, I'd just prefer to do it in phases, so that the async job becomes just about the async instead of a huge changeset that interlocks the whole span of changes.
from aerospike-client-rust.
Totally. I did not know that you started to work on the client again.
I wanted to do a load of side features in the same move, because i would otherwise have to port "deprecated" and outdated code to async, so i can remove it again in a later stage to update it.
New Auth protocol and Rack aware are probably the "deepst" ones in the client.
from aerospike-client-rust.
Please have a look at my #108 PR. If you see anything i missed, please tell me.
Its a stupid 1:1 port from old sync to ne async. No functionality changes.
The only part i dont really understand yet is the threadpool. I would like to get rid of that, since it makes not much sense in async. The only part that prevents me is this "scoped" pools. Can you explain more why the client needs them instead of simple run and die threads?
from aerospike-client-rust.
The only part that prevents me is this "scoped" pools. Can you explain more why the client needs them instead of simple run and die threads?
ca9940b: Does this help?
from aerospike-client-rust.
Unscoped threads are a good use case for executing queries accross
multiple nodes because the query threads continue running even after the
result set has been passed back to the caller.
Can you explain this further? Why does the client keep them? Is there any technical reason or is it just to skip the overhead of spawning new threads everytime?
However in the case of
the batch read operation, all threads used to fetch records from
different nodes need to complete before the results are passed back to
the caller.
So the point of the scoped ones is basically just to await until all of them are finished before returning anything?
from aerospike-client-rust.
The main difference between the scoped and unscoped threads is in the lifetimes of the data that you can pass to the thread to process. For unscoped threads, you either have to pass ownership to the thread or you can pass references to data with 'static
lifetimes. You can't pass references to any data with non-static lifetimes, since you don't know how long the thread is going to run, so it might outlive any non-static lifetimes. For scoped threads, you can pass references to data with non-static lifetimes, as long as its guaranteed that the lifetime of the data outlives the lifetime of the scope
.
We use that for batch reads. All the batch records are held in a single vector. All the batch jobs (one per cluster node) hold a shared reference to this structure, but each has a unique set of offsets its responsible for. (Since each record is owned by exactly one node.) All the batch jobs are executed in parallel. Once all jobs are complete, the same batch records vector is passed back to the caller.
Probably could be done without the scoped threads. You'd have to slice up the batch records into smaller vectors which can be passed to each batch job. Once a job is done it passes ownership back. Then you just have to reassemble all the individual pieces back into a single result vector in the correct order.
I haven't used async/await in Rust yet, so I don't know how this would work in that case.
from aerospike-client-rust.
Im trying to implement that splitting right now.
There is one point that brings in a little complexity. How important is it to keep the order in the return vec?
Right now, the order of batch records is the same that the user gave in. Does the returned one have to keep this order or can it be "unordered"?
Right now, the client will first get the nodes for each batch read and then add the index of the element to the node vec.
Instead, i would like to directly add the batch record instead of its index. That would do the slicing part but then i give up the original ordering.
from aerospike-client-rust.
I think all Aerospike clients have always kept the order of the results the same, so for consistency the Rust client should probably do the same.
If we decide to break compatibility with other clients, then the batch function should probably just take IntoIter
instead of a Vec
and return some collection that doesnโt imply order like a HashSet
.
from aerospike-client-rust.
We can extend the BatchRead struct with an index field. Then we can rebuild a new vec at the end. That would probably work.
Accepting IntoIter might be a problem because iters are mutable references. I have not yet managed to send such a reference over multiple threads without a lot of overhead and cloning.
However, this will add overhead and slow down the client, since i will have to move around mutable vectors to do that.
If that bloat is fine, ill implement it. Right now its more or less ordered by node.
from aerospike-client-rust.
I think we should keep the current interface that takes a Vec
and returns a Vec
with records in the same order. @khaf thoughts?
from aerospike-client-rust.
Did you already have time to look into the async PR?
Whats the status on partition scan and the new auth? We would like to update our infrastructure but the client is preventing that right now.
from aerospike-client-rust.
Sorry for pinging again, but i need at least an answer on the second question because otherwise i will have to do it myself.
I waited for over a month now. This is starting to destroy our time plan. @jhecking @khaf
from aerospike-client-rust.
@jonas32 Sorry I was distracted on other projects and @jhecking has very limited time to invest on this project. I'm back at the Rust client full time for the next few weeks. Let me finish a few minor fixes this week and I'll get back to you early next week.
from aerospike-client-rust.
Hello guys, can we expect async client to be released anytime soon? I can see there's a PR open for it, and it seems to be at testing stage
from aerospike-client-rust.
Related Issues (20)
- FilterExpression in Query policy HOT 6
- migrate `MapWriteMode` to `MapWriteFlags`
- `Parameter error` when filtering on large i64 values HOT 5
- possible improvements HOT 6
- New Features
- Issue with Float serialization for Values HOT 1
- WaitGroup explicitly poisoned on batch_get
- Aerospike Version 6 Functionality HOT 5
- Implement serde::Serialize and serde::Deserialize for aerospike::Value and aerospike::Record HOT 1
- Implement std::convert::TryFrom<aerospike::Value> for each variant HOT 1
- support bool data type HOT 5
- `impl Stream` for `RecordSet`
- add missing policy and option fields to query and scan
- Is this still supported? HOT 1
- make error type compatible with `anyhow::Error` HOT 6
- Transaction commit and rollback HOT 1
- derive `Clone` for `Client` HOT 2
- how to access client's `Cluster`? HOT 4
- add missing role/privilege api's to `AdminCommand`
- Batch Operate HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. ๐๐๐
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google โค๏ธ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from aerospike-client-rust.