LemmyNet / activitypub-federation-rust
High-level Rust library for the Activitypub protocol
License: GNU Affero General Public License v3.0
Some projects like Mastodon make webfinger mandatory, so it would definitely be useful to have support directly in the library. The code is also relatively abstract (at least the fetch part), so it wouldn't need many changes.
https://github.com/LemmyNet/lemmy/blob/main/crates/apub/src/fetcher/webfinger.rs
https://github.com/LemmyNet/lemmy/blob/main/crates/routes/src/webfinger.rs
When I try to call ObjectId::<ApUser>::parse(user.ap_id.into()), I get this:
ObjectId::<ApUser>::parse(user.ap_id.into())
| -------------------------------- ^^^^^^^^^^^^^^^ the trait `From<Infallible>` is not implemented for `url::ParseError`
| |
| required by a bound introduced by this call
user.ap_id has type String and contains the ActivityPub ID of the object.
The url::Url type is awkward for our use because it has the domain as an optional field, and when logging it prints individual url components instead of the full url as a string. We should add a wrapper type like the following to work around these issues.
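use std::fmt::{Display, Formatter};
use std::ops::Deref;

#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Url(url::Url);

impl Deref for Url {
    type Target = url::Url;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

impl Display for Url {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        // Delegate to the inner Display impl, which prints the full url as a string.
        Display::fmt(&self.0, f)
    }
}

impl Url {
    pub fn parse(input: &str) -> Result<Url, url::ParseError> {
        url::Url::parse(input).map(Url)
    }
    // Panics if the url has no domain (for example an IP address host).
    pub fn domain(&self) -> &str {
        self.0.domain().expect("has domain")
    }
    // Unwrap to the underlying url::Url.
    pub fn into_inner(self) -> url::Url {
        self.0
    }
}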
Posting this as an issue instead of PR because now is not a good time for breaking changes.
It would be nice if the library could automatically generate required HTTP routes, because these require a lot of boilerplate code. This would at minimum be a GET endpoint for each actor/object, and a POST inbox endpoint for actors. There are also other endpoints such as outbox, followers and shared inbox.
Examples:
This would require passing format strings to the library, so that it knows at which path the routes should be created.
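A minimal sketch of what passing such format strings could look like. Everything here is hypothetical: the `RouteTemplates` struct, its fields and the `{name}` placeholder syntax are assumptions for illustration and not part of the library.

```rust
/// Hypothetical configuration: path templates the library would use to
/// register routes. `{name}` is an assumed placeholder syntax.
pub struct RouteTemplates {
    pub actor: String,
    pub inbox: String,
}

impl RouteTemplates {
    /// Path of the GET endpoint for one actor.
    pub fn actor_path(&self, name: &str) -> String {
        fill(&self.actor, "name", name)
    }

    /// Path of the POST inbox endpoint for one actor.
    pub fn inbox_path(&self, name: &str) -> String {
        fill(&self.inbox, "name", name)
    }
}

/// Replace every `{key}` in the template with `value`.
fn fill(template: &str, key: &str, value: &str) -> String {
    let placeholder = format!("{{{}}}", key);
    template.replace(placeholder.as_str(), value)
}
```

With templates such as "/u/{name}" and "/u/{name}/inbox", the library could derive both the route patterns to register and the concrete paths for a given actor.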
I'm trying to use eyre with this crate and getting some really gnarly generic trait bound errors. It appears to be because functions like receive_activity are coupled to the anyhow::Error type through their trait bounds.
I propose we refactor this out, rely solely on thiserror for errors within this crate, and provide some way for library consumers to bring their own errors.
Let me know if you're open to this and I can have a crack at it.
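One possible shape for "bring your own errors", sketched with the standard library only. The names `LibError`, `Handler`, `receive_with`, `AppError` and `MyHandler` are all hypothetical; the real crate's traits and bounds differ. The idea is that the library defines its own error type, and the only requirement on the consumer's error type is a `From` conversion.

```rust
use std::fmt;

/// Hypothetical library error, standing in for the crate's own error enum.
#[derive(Debug)]
pub enum LibError {
    SignatureInvalid,
}

impl fmt::Display for LibError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LibError::SignatureInvalid => write!(f, "activity signature is invalid"),
        }
    }
}

impl std::error::Error for LibError {}

/// Hypothetical handler trait: the consumer picks `Error`; it only has to be
/// constructible from the library's error.
pub trait Handler {
    type Error: From<LibError>;
    fn receive(&self) -> Result<(), Self::Error>;
}

/// Library-side entry point, generic over the consumer's error type.
pub fn receive_with<H: Handler>(handler: &H, signature_ok: bool) -> Result<(), H::Error> {
    if !signature_ok {
        return Err(LibError::SignatureInvalid.into());
    }
    handler.receive()
}

/// Consumer-side error type; it could equally wrap eyre or anyhow.
#[derive(Debug)]
pub enum AppError {
    Federation(LibError),
}

impl From<LibError> for AppError {
    fn from(err: LibError) -> Self {
        AppError::Federation(err)
    }
}

pub struct MyHandler;

impl Handler for MyHandler {
    type Error = AppError;
    fn receive(&self) -> Result<(), AppError> {
        Ok(())
    }
}
```

The bound `From<LibError>` keeps the library free of any particular error-reporting crate while still letting it surface its own failures.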
The documentation does not mention the features, so I am left wondering what it is for exactly.
The queue for sending outgoing activities has an in-memory storage for activities that failed to be delivered and need to be retried later, when the target server is hopefully reachable again. Since it is only in memory, this storage is gone after a restart or crash. It would be good to provide a config option for storing it on disk, e.g. in a sled database.
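To illustrate the idea of surviving a restart, here is a simplified sketch that uses a plain append-only file instead of sled. The `PendingDelivery` type and the tab-separated format are assumptions; a real queue entry would also need the activity body and signing information.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Hypothetical record of a delivery that still has to be retried.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingDelivery {
    pub activity_id: String,
    pub inbox: String,
}

/// Append one failed delivery to the on-disk log (tab-separated, one per line).
pub fn persist(path: &Path, delivery: &PendingDelivery) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}\t{}", delivery.activity_id, delivery.inbox)
}

/// Reload all pending deliveries after a restart.
/// A missing file simply means nothing is pending.
pub fn load(path: &Path) -> io::Result<Vec<PendingDelivery>> {
    let file = match File::open(path) {
        Ok(file) => file,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        Err(err) => return Err(err),
    };
    let mut pending = Vec::new();
    for line in BufReader::new(file).lines() {
        let line = line?;
        // Lines without a tab separator are skipped rather than treated as fatal.
        if let Some((activity_id, inbox)) = line.split_once('\t') {
            pending.push(PendingDelivery {
                activity_id: activity_id.to_string(),
                inbox: inbox.to_string(),
            });
        }
    }
    Ok(pending)
}
```

On startup the queue would call `load` and re-enqueue every entry; removing entries after a successful delivery is left out of this sketch.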
There are already traits for ApubObject, ApubActor and ActivityHandler for ease of handling the respective Activitypub types. There should be a similar trait for collections, which helps with pagination in particular.
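A rough sketch of what such a collection trait might look like. The `Collection` trait, its method names and the `Followers` type are hypothetical and only illustrate how pagination could be expressed once and reused.

```rust
/// Hypothetical trait for paginated collections; not part of the library.
pub trait Collection {
    type Item;

    /// Total number of items, as reported in `totalItems`.
    fn total_items(&self) -> usize;

    /// Items on the given zero-based page.
    fn page(&self, page: usize, page_size: usize) -> Vec<Self::Item>;

    /// Number of pages needed for the whole collection (rounded up).
    fn page_count(&self, page_size: usize) -> usize {
        if page_size == 0 {
            0
        } else {
            (self.total_items() + page_size - 1) / page_size
        }
    }
}

/// Example implementation backed by an in-memory list of follower ids.
pub struct Followers(pub Vec<String>);

impl Collection for Followers {
    type Item = String;

    fn total_items(&self) -> usize {
        self.0.len()
    }

    fn page(&self, page: usize, page_size: usize) -> Vec<String> {
        self.0
            .iter()
            .skip(page.saturating_mul(page_size))
            .take(page_size)
            .cloned()
            .collect()
    }
}
```

A database-backed implementation would translate `page` and `page_size` into an offset and limit instead of iterating in memory.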
Hey! I'm trying out this crate to set up an activitypub server.
I am not too familiar with Rust and tried to implement the Object trait for my Sqlite implementation. However, I found it unclear how best to implement "DataType". My assumption from the example
type DataType = DbConnection;
is that it should be some connection to the SQL database akin to SqliteConnection in Diesel. However, SqliteConnection does not implement Clone+Share+Send, so it's not obvious to me what I should implement there for it to work. Any pointers? I looked at the examples and they seem to do it without a real database implementation, so they didn't help me too much.
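One common way to get a cloneable, thread-safe handle around a connection that is not `Clone` is to share it behind `Arc<Mutex<..>>` (or to use a connection pool). The sketch below uses a stand-in `FakeConnection` type rather than Diesel's `SqliteConnection`, and `DbHandle` is a hypothetical name; it only shows the wrapping pattern, assuming the data type needs to be `Clone`, `Send` and `Sync`.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a connection type that is not `Clone`,
/// e.g. a single SQLite connection.
pub struct FakeConnection {
    pub queries_run: u32,
}

impl FakeConnection {
    pub fn execute(&mut self, _sql: &str) {
        self.queries_run += 1;
    }
}

/// Cheaply cloneable handle that could serve as `DataType`.
/// Cloning copies the `Arc`, not the connection itself.
#[derive(Clone)]
pub struct DbHandle {
    conn: Arc<Mutex<FakeConnection>>,
}

impl DbHandle {
    pub fn new(conn: FakeConnection) -> Self {
        DbHandle {
            conn: Arc::new(Mutex::new(conn)),
        }
    }

    /// Run a closure with exclusive access to the connection.
    pub fn with_conn<R>(&self, f: impl FnOnce(&mut FakeConnection) -> R) -> R {
        let mut guard = self.conn.lock().expect("connection mutex poisoned");
        f(&mut *guard)
    }
}
```

In an async server a pool is usually the better fit, because a single mutex-guarded connection serializes all database access.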
As mentioned here, part of the difficulty of implementing AP with static languages is that a given JSON-LD property might be a reference (IRI) to an object, or the object itself. The crate docs say:
This means we don’t use json-ld which Activitypub is based on, but that doesn’t cause any problems in practice.
I'm curious how Lemmy sidesteps this issue. Following along the examples in the docs it seems to assume a given property will always return either an IRI or an object, which doesn't handle the full functionality of AP/JSON-LD according to the specs. Am I missing something here?
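For reference, the "IRI or embedded object" case can be modelled in a statically typed language with a two-variant enum. The sketch below is dependency-free and hypothetical (`IdOrObject`, `HasId` and `Note` are illustrative names); with serde, such an enum is typically deserialized using `#[serde(untagged)]`, which is left out here to keep the example self-contained.

```rust
/// A property that is either a reference (IRI) or the embedded object.
#[derive(Debug, Clone, PartialEq)]
pub enum IdOrObject<T> {
    Id(String),
    Object(T),
}

/// Anything that carries its own ActivityPub id.
pub trait HasId {
    fn id(&self) -> &str;
}

impl<T: HasId> IdOrObject<T> {
    /// The id is available in both cases, so callers can always
    /// fall back to fetching the object by id.
    pub fn id(&self) -> &str {
        match self {
            IdOrObject::Id(id) => id.as_str(),
            IdOrObject::Object(object) => object.id(),
        }
    }
}

/// Hypothetical object type used for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Note {
    pub id: String,
    pub content: String,
}

impl HasId for Note {
    fn id(&self) -> &str {
        &self.id
    }
}
```

Code that only needs the id works the same for both variants; code that needs the full object can use the embedded one when present and dereference the id otherwise.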
I see on this line that if the send fails due to a connection error, it does not get retried:
But there are cases where it should definitely retry. For example, today lemmy.ml is overloaded so many subscribe operations are not going through:
lemmy-lemmy-1 | 2023-06-12T14:51:09.410929Z DEBUG Worker{worker.id=867f1e83-24de-4363-a3ff-ee742f67c9b6 worker.queue=default worker.operation.id=9f4792c7-e83a-44f7-9536-236e48b82cd8 worker.operation.name=process}:Job{execution_id=b12ecf19-efec-4438-a352-9612c22d384b job.id=272231c7-803d-4f7f-a35d-d7800e0fe08b job.name=SendActivityTask}: activitypub_federation::core::activity_queue: Sending https://lemmy.vepta.org/activities/follow/f4dc629c-297a-47b2-bb75-cac9ece6d1f4 to https://lemmy.ml/inbox
lemmy-lemmy-1 | 2023-06-12T14:51:19.414603Z INFO Worker{worker.id=867f1e83-24de-4363-a3ff-ee742f67c9b6 worker.queue=default worker.operation.id=9f4792c7-e83a-44f7-9536-236e48b82cd8 worker.operation.name=process}:Job{execution_id=b12ecf19-efec-4438-a352-9612c22d384b job.id=272231c7-803d-4f7f-a35d-d7800e0fe08b job.name=SendActivityTask}: activitypub_federation::core::activity_queue: Unable to connect to https://lemmy.ml/inbox, aborting task https://lemmy.vepta.org/activities/follow/f4dc629c-297a-47b2-bb75-cac9ece6d1f4: Request error: error sending request for url (https://lemmy.ml/inbox): operation timed out
If the operation is not retried, the user will never be subscribed unless they manually re-subscribe.
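A minimal sketch of the requested behaviour: retry on connection errors with exponential backoff, but give up immediately on other failures. It is synchronous and standard-library only; `SendError`, `send_with_retry` and the parameters are hypothetical and do not mirror the library's queue implementation.

```rust
use std::time::Duration;

/// Hypothetical error classification for a delivery attempt.
#[derive(Debug, PartialEq)]
pub enum SendError {
    /// Timeout or connection failure: the target may be overloaded, so retry.
    Connection,
    /// The target answered but refused the activity: retrying will not help.
    Rejected,
}

/// Call `send` until it succeeds, a non-retryable error occurs, or
/// `max_attempts` is reached. Returns the number of attempts used on success.
pub fn send_with_retry<F>(
    mut send: F,
    max_attempts: u32,
    base_delay: Duration,
) -> Result<u32, SendError>
where
    F: FnMut() -> Result<(), SendError>,
{
    let mut attempt = 0;
    loop {
        attempt += 1;
        match send() {
            Ok(()) => return Ok(attempt),
            Err(SendError::Connection) if attempt < max_attempts => {
                // Exponential backoff: base, 2 * base, 4 * base, ...
                let factor = 2u32.saturating_pow(attempt - 1);
                std::thread::sleep(base_delay.saturating_mul(factor));
            }
            Err(err) => return Err(err),
        }
    }
}
```

In an async queue the sleep would be replaced by rescheduling the job, but the classification of connection errors as retryable is the relevant part.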
Using &'static str for errors is not a good idea. Fixing this requires a breaking change.
Right now this lambda:
activitypub-federation-rust/src/http_signatures.rs
Lines 105 to 113 in af92e0d
is passed to here: https://git.asonix.dog/asonix/http-signature-normalization/src/commit/85bbcb0bae2f976d08dfddad3d5050ffae149732/reqwest/src/lib.rs#L249
which seems to run it in the main async runtime. But signing is a potentially expensive operation and should probably run inside tokio::spawn. I've messaged asonix, the author of the library, to confirm. If this is true, it's probably not fixable here, since the lambda is synchronous.
These result in a lot of breaking changes which aren't easy to resolve.
https://github.com/hyperium/hyper/blob/master/CHANGELOG.md#v100-2023-11-15
https://github.com/hyperium/http/blob/master/CHANGELOG.md#100-november-15-2023
Some fediverse software runs in "secure mode", requiring HTTP signatures for most endpoints.
Mastodon has it as an option: https://docs.joinmastodon.org/admin/config/#authorized_fetch
In GoToSocial it is mandatory: https://github.com/superseriousbusiness/gotosocial#safety--security-features
I took the live_federation example and added some code to dereference an account on my GoToSocial instance:
let user_id = ObjectId::<DbUser>::parse("https://gts.djs.to/@darrin")?;
let data = config.to_request_data();
let user = user_id.dereference(&data).await?;
This fails with the log:
[INFO activitypub_federation::fetch] Fetching remote object https://gts.djs.to/@darrin
[WARN background_jobs_actix] Dropping manager, tearing down workers
[WARN background_jobs_actix] Stopping and joining arbiter
[WARN background_jobs_actix] Stopping and joining arbiter
[WARN background_jobs_actix::worker] Not restarting worker, Arbiter is dead
[WARN background_jobs_actix::worker] Not restarting worker, Arbiter is dead
.. repeated many times ..
Error: Error(Other errors which are not explicitly handled)
My server logs: http request wasn't signed or http signature was invalid.
I think this will involve an API change, as the private key of the requesting actor is required to sign the request.
This will likely improve performance
activitypub-federation-rust/src/activity_queue.rs
Lines 150 to 162 in 6ac6e2d
If you create an application with tokio runtime which depends on this library, it crashes at startup with the following error:
thread 'main' panicked at 'System is not running', /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.8.0/src/system.rs:120:211
stack backtrace:
0: std::panicking::begin_panic
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:611:12
1: actix_rt::system::System::current::{{closure}}
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.8.0/src/system.rs:120:21
2: std::thread::local::LocalKey<T>::try_with
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/thread/local.rs:446:16
3: std::thread::local::LocalKey<T>::with
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/thread/local.rs:422:9
4: actix_rt::system::System::current
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.8.0/src/system.rs:118:9
5: actix_rt::arbiter::Arbiter::with_tokio_rt
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.8.0/src/arbiter.rs:115:19
6: actix_rt::arbiter::Arbiter::new
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.8.0/src/arbiter.rs:101:9
7: background_jobs_actix::Manager::new
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/background-jobs-actix-0.13.1/src/lib.rs:178:31
8: background_jobs_actix::WorkerConfig<State,background_jobs_actix::Managed>::start
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/background-jobs-actix-0.13.1/src/lib.rs:357:9
9: activitypub_federation::activity_queue::create_activity_queue
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/activitypub_federation-0.4.0/src/activity_queue.rs:229:5
10: activitypub_federation::config::FederationConfigBuilder<T>::build
at /home/felix/.cargo/registry/src/github.com-1ecc6299db9ec823/activitypub_federation-0.4.0/src/config.rs:179:21
...
This is caused by the background-jobs crate. It would be good to get rid of this requirement, so that any async runtime can be used.
Hi,
When trying to implement some basic activitypub functionality, I found it very hard to know where the errors came from.
For example, when implementing a basic actor and using webfinger_resolve_actor, I just get a WebfingerResolveFailed error, while my code generated more error information than that.
I suggest replacing the current Error value with:
pub enum ErrorKind {
    NotFound,
    RequestLimit,
    ResponseBodyLimit,
    ObjectDeleted,
    UrlVerificationError(&'static str),
    ActivityBodyDigestInvalid,
    ActivitySignatureInvalid,
    WebfingerResolveFailed,
    /// see Error::source
    Other, // Here we remove the current anyhow::Error
}

pub struct Error {
    kind: ErrorKind,
    source: Option<anyhow::Error>,
}
And implement the Error trait on top of the struct Error. That way, the full error information can be found using std::error::Error::source.
This line uses generic object fetching to resolve webfinger:
https://github.com/LemmyNet/activitypub-federation-rust/blob/af92e0d53204a2ccd13bc0db3c58de24bff646bf/src/fetch/webfinger.rs#L39C26-L39C43
The generic object fetcher only accepts activity+json, which is contrary to the Webfinger standard, which allows jrd+json or xrd+xml:
https://datatracker.ietf.org/doc/html/rfc7033#page-11
This leads to issues when federating with Akkoma and any other software which serves standard webfinger responses with another MIME type:
LemmyNet/lemmy#3222
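A small sketch of a content-type check that a dedicated webfinger fetcher could apply instead of the activity+json check. The function name is hypothetical, and the accepted list simply follows the MIME types named in this issue.

```rust
/// Hypothetical helper: decide whether a response Content-Type header is
/// acceptable for a webfinger response. The accepted types are the ones
/// named in the issue text.
pub fn is_webfinger_content_type(header: &str) -> bool {
    // Strip parameters such as "; charset=utf-8" and normalize case.
    let mime = header
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase();
    matches!(
        mime.as_str(),
        "application/jrd+json" | "application/xrd+xml"
    )
}
```

Note that parsing an xrd+xml body would need a separate code path; this sketch only covers the header check.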
I'd like to raise some discussion about moving from OpenSSL to Rustls.
Moving away from OpenSSL would make the project more portable, since you don't have to fiddle with OpenSSL's installation. Additionally, Rustls makes smart use of the type state pattern and Rust's move semantics to improve security.
Hi, I played a bit with the example code and replaced axum with actix-web in the live example. The user can be fetched but no matter what fediverse software I try federating with (mastodon, glitch, firefish, akkoma) it all fails at verification of the incoming activity. The culprit code is here
Hi, I am trying to use this with axum instead of actix-web.
However, it seems that into_apub from ApubObject causes:
note: future is not `Send` as it awaits another future which is not `Send`
--> src/instance.rs:133:18
|
133 | let person = user.into_apub(&data).await.unwrap();
| ^^^^^^^^^^^^^^^^^^^^^ await occurs here on type `std::pin::Pin<std::boxed::Box<dyn std::future::Future<Output = std::result::Result<objects::person::Person, error::Error>>>>`, which is not `Send`
I assume this is because the function takes a reference to the Arc instead of taking the Arc itself, which somewhat defeats the Arc's purpose.
Is there any way this could be made axum compatible?
My (broken) code for this is available here (it also won't compile for some other reasons, as I am in the middle of adding basic functions but got stuck on this issue): https://gitlab.com/MTRNord/pixelrust/-/blob/dc50d4bd729eecc23d972d6d3d92426edb280911/src/instance.rs#L132
Sorry for doing this issue in here as I have no other way to reach you :)
It seems like synapse fails to fetch the alias, and it's probably due to the error mentioned in https://federationtester.matrix.org/#asonix.dog Any chance this could be fixed, or another alias be added? :)
Would make it slightly easier to use the library.
https://github.com/LemmyNet/activitypub-federation-rust/blob/main/src/core/inbox.rs#L19
At the same time, receive_activity() should take web::Bytes directly, and handle the conversion + log as string.
From what I have learned, Lemmy creates HTTP connections that are short-lived (only 10 minutes?) before they expire, or uses some similar scheme related to signing/encryption?
On real-world servers, I'm seeing logs like:
WARN Error encountered while processing the incoming HTTP request: lemmy_server::root_span_builder: Header is expired
0: lemmy_server::root_span_builder::HTTP request
with http.method=POST http.scheme="http" http.host=mylemmyinstance.com http.target=/inbox otel.kind="server" request_id=453c8a92-7bb5-4b7e-a4ad-212e91167d4e http.status_code=400 otel.status_code="OK"
at src/root_span_builder.rs:16
LemmyError { message: None, inner: Header is expired, context: "SpanTrace" }
Because of proxying by nginx, this message does not give any hint of who the sender is. If the clocks are not set correctly on one of the peer servers, this could be a major problem, and you would have to start logging on the firewall or something similar to find out who the sender is.
I did tinker with the code and was at least able to get Lemmy to log the IP address of the remote server by adding realip_remote_addr to the tracing:
+++ b/src/root_span_builder.rs
@@ -18,6 +18,7 @@ impl RootSpanBuilder for QuieterRootSpanBuilder {
http.method= %request.method(),
http.scheme = request.connection_info().scheme(),
http.host = %request.connection_info().host(),
+ http.realip_remote_addr = request.connection_info().realip_remote_addr(),
MOST IMPORTANT TO ME: we need to get the word out to the major Lemmy instances to look for this message in their error logs. Is this why we are seeing significant failures to replicate data between servers? Related issues: LemmyNet/lemmy#3101 and LemmyNet/lemmy#3203
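To show how a wrong clock on either peer can produce an expiry error, here is a hypothetical expiry check with a skew tolerance. The function, its parameters and the idea of a fixed validity window are assumptions for illustration; this is not how Lemmy or its signature library necessarily implements the check.

```rust
/// Outcome of the hypothetical expiry check.
#[derive(Debug, PartialEq)]
pub enum HeaderCheck {
    Valid,
    Expired { seconds_late: u64 },
}

/// All times are Unix timestamps in seconds.
/// `created` is taken from the sender's clock, `now` from the receiver's,
/// so a skewed clock on either side shifts the result.
pub fn check_expiry(created: u64, now: u64, validity: u64, skew_tolerance: u64) -> HeaderCheck {
    let deadline = created
        .saturating_add(validity)
        .saturating_add(skew_tolerance);
    if now <= deadline {
        HeaderCheck::Valid
    } else {
        HeaderCheck::Expired {
            seconds_late: now - deadline,
        }
    }
}
```

Logging `seconds_late` together with the sender's address would make it much easier to tell a slow delivery apart from a misconfigured clock.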
http-signature-normalization-0.6.0.crate is licensed under CSL, which is incompatible with GPL licenses.
See:
https://lynnesbian.space/csl/
This issue prevents packaging Lemmy.
None
None
0.18.0
No response
It seemed logical to have separate methods to verify() and receive() data, but in practice it doesn't really help, and even leads to some duplicate code when you need to fetch an object multiple times. So maybe the methods ActivityHandler::verify() and ApubObject::verify() should just be removed.
Version 0.14.1 removes the QueueHandle.get_stats() function, which is necessary to know how many jobs are currently running and how many are pending, in order to tell when the number of workers needs to be increased. The function is replaced with some metrics. I am not sure how these new metrics work, and the new version doesn't have important changes, so I am leaving it for later (or for someone else).