The Fediverse is Inefficient (but that's a good trade-off)

Danterious@lemmy.dbzer0.com · 2 years ago

kopper [they/them]@lemmy.blahaj.zone · edit-2 2 years ago

Eh, I’d argue the fediverse is overly inefficient, way more than it has to be. (But that doesn’t seem to be the actual point of the post, which instead rehashes the same “distribution = good” thing without bringing anything new to the table.)

Here are just a few things that could be fixed without needing to centralize fedi:

A vast majority of instance software will store all old remote non-media data (that could easily be re-fetched when needed) permanently, even if nobody has seen it in years.
If you’re lucky enough to be on instance software that backfills replies (GoToSocial, the Iceshrimp rewrite as of a few days ago, Mastodon in an extremely limited capacity), it will be done slowly and recursively, when much better alternatives that don’t need to deal with easy-to-get-wrong recursion handling are possible. (There is work going on to improve this, but it may take a while for it to land in enough instance software to make a difference.)
The obvious thing everyone harps on: Abysmal media caching defaults.
No batching of activities. And relatedly, all sent activities are individually re-signed for each instance on each delivery. (To be fair, handling this in a privacy-preserving way is hard.)
No batching of fetches.
RSA, just to make the above signature situation even worse
Mastodon. Just in general. It’s by far the most heavyweight fedi software I know of, running on a synchronous and poorly threaded tech stack that is not well suited to the fairly IO-bound (when not using authorized fetch) and very concurrent AP use case. Running Mastodon for any instance with fewer than ~500 active users is extreme overkill, and you’d likely be better served by other, lighter instance software if you don’t need any of the Mastodon-exclusive features (of which there aren’t many).
Pleroma database rot, an exemplar of why the C2S advocates’ model of “store the raw JSON for everything” is a terrible idea (thankfully the C2S model hasn’t taken off enough to be important)
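
To make the "no batching of activities" point concrete, here is a minimal sketch of what sender-side batching could look like: outgoing activities are grouped per destination inbox so that each group could, in principle, be signed and sent once. This is an assumption-heavy illustration: ActivityPub as specified has no batch delivery endpoint, so the receiving side would need to support something like this, and `batch_by_inbox` and `max_batch` are hypothetical names, not anything an existing implementation exposes.

```python
from collections import defaultdict
from typing import Iterable


def batch_by_inbox(
    deliveries: Iterable[tuple[str, dict]],
    max_batch: int = 50,
) -> dict[str, list[list[dict]]]:
    """Group (inbox_url, activity) pairs into per-inbox batches.

    Hypothetical helper: each inner list is one batch that could be
    signed and delivered in a single request instead of one per activity.
    """
    batches: dict[str, list[list[dict]]] = defaultdict(list)
    for inbox, activity in deliveries:
        groups = batches[inbox]
        # Start a new batch when there is none yet or the last one is full.
        if not groups or len(groups[-1]) >= max_batch:
            groups.append([])
        groups[-1].append(activity)
    return dict(batches)


if __name__ == "__main__":
    queue = [
        ("https://a.example/inbox", {"type": "Like", "id": 1}),
        ("https://b.example/inbox", {"type": "Like", "id": 2}),
        ("https://a.example/inbox", {"type": "Announce", "id": 3}),
    ]
    for inbox, groups in batch_by_inbox(queue).items():
        print(inbox, [len(g) for g in groups])
```

With three queued activities for two inboxes, this yields two requests (and two signatures) instead of three. It does nothing about the privacy problem mentioned above, which is the actually hard part.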

schizo@forum.uncomfortable.business · 2 years ago

A vast majority of instance software will store all old remote non-media data (that could easily be re-fetched when needed) permanently, even if nobody has seen it in years.

Seriously, this is the most befuddling design decision. There’s no reason to cache that data more than like, maybe a week.
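
As a rough sketch of the "maybe a week" policy, assuming the instance records when each cached object was last seen: pick out remote objects older than the retention window and leave local ones alone. The `CachedObject` record, its fields, and the seven-day default are all illustrative assumptions, not how any particular instance software stores things.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CachedObject:
    # Hypothetical cache record; real schemas will differ.
    uri: str
    is_local: bool
    last_seen: datetime


def expired_remote_objects(
    objects: list[CachedObject],
    now: datetime,
    max_age: timedelta = timedelta(days=7),
) -> list[str]:
    """Return URIs of remote objects nobody has seen within max_age."""
    cutoff = now - max_age
    return [o.uri for o in objects if not o.is_local and o.last_seen < cutoff]


if __name__ == "__main__":
    now = datetime(2024, 6, 15, tzinfo=timezone.utc)
    cache = [
        CachedObject("https://remote.example/notes/1", False, now - timedelta(days=30)),
        CachedObject("https://remote.example/notes/2", False, now - timedelta(days=1)),
        CachedObject("https://local.example/notes/3", True, now - timedelta(days=400)),
    ]
    print(expired_remote_objects(cache, now))
```

Local posts are never selected, since the instance is their origin and they cannot be re-fetched from anywhere else.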

Maybe it’s because I come from a sysadmin background rather than a programming one, but the endless obsession that fedi software has with caching everything at every stop along the route from the poster to the person reading the post is just the weirdest thing to me.

kopper [they/them]@lemmy.blahaj.zone · edit-2 2 years ago

A lot of it boils down to most fedi software not being “native” and only having federation designed more-or-less as an afterthought addition on top of a traditional centralized-ish system (even for ones that have had federation from the get-go). Meaning you make assumptions like “it’s fine if I delete the replies of a post if the post gets deleted”.

This, combined with how much data you can’t re-load and have to track as it comes in (e.g. nobody implements the necessary collections to backfill who liked or boosted what from its source, so you have to track that implicitly through Like and Announce activities), makes aggressive purging extremely infeasible to implement while keeping the same user experience. Hell, even the reply collections needed to backfill missing replies are a rarity (though a lot more common than the others, given Mastodon implements them).
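
A minimal sketch of that implicit tracking, assuming activities arrive as already-parsed dicts with `object` given as a plain URI for Like and Announce, and as an embedded activity for Undo. The `InteractionTracker` class is hypothetical; the point is only that this state is built up from the incoming stream, so anything purged or missed cannot be rebuilt by re-fetching.

```python
from collections import defaultdict


class InteractionTracker:
    """Hypothetical in-memory tally of who liked or boosted which object."""

    def __init__(self) -> None:
        self.likes: dict[str, set[str]] = defaultdict(set)
        self.boosts: dict[str, set[str]] = defaultdict(set)

    def handle(self, activity: dict) -> None:
        kind = activity.get("type")
        actor = activity.get("actor")
        obj = activity.get("object")
        if kind == "Like" and isinstance(obj, str):
            self.likes[obj].add(actor)
        elif kind == "Announce" and isinstance(obj, str):
            self.boosts[obj].add(actor)
        elif kind == "Undo" and isinstance(obj, dict):
            # An Undo wraps the activity being reverted.
            target = obj.get("object")
            if obj.get("type") == "Like":
                self.likes[target].discard(actor)
            elif obj.get("type") == "Announce":
                self.boosts[target].discard(actor)


if __name__ == "__main__":
    note = "https://remote.example/notes/1"
    alice = "https://a.example/users/alice"
    tracker = InteractionTracker()
    tracker.handle({"type": "Like", "actor": alice, "object": note})
    tracker.handle({"type": "Announce", "actor": alice, "object": note})
    tracker.handle({"type": "Undo", "actor": alice,
                    "object": {"type": "Like", "object": note}})
    print(len(tracker.likes[note]), len(tracker.boosts[note]))
```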

Additionally, people want the same user experience they’re used to in centralized systems, like search actually searching through everyone, globally. This is something I believe AP simply isn’t “intended” for. ATProto, for example, is much better in this specific regard (but comes at its own hefty costs, as an implementor).

I don’t blame the implementors for doing things this way. IMO it’s better to partially implement something like AP as an extension, as opposed to diving head first into being AP-native. The standards are extremely vague and incomplete once you start looking below the shallow surface, and this way, at least, if a better protocol comes by, migration (or multi-protocol federation) won’t be too difficult compared to if your source of truth were the same AS2 data you federated, the way AP intended.
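
For illustration, the "AP as an extension" approach can be sketched as an internal, protocol-agnostic record plus a small mapping function that produces AS2 only at the federation boundary. The `Post` schema, `to_as2_note`, and the `/notes/` URL layout are hypothetical; only the AS2 property names (`type`, `attributedTo`, `content`, `published`) come from the vocabulary itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"


@dataclass
class Post:
    # Internal source of truth (hypothetical schema), independent of AS2.
    id: int
    author_uri: str
    html: str
    created_at: datetime


def to_as2_note(post: Post, base_url: str) -> dict:
    """Serialize an internal post to an AS2 Note at federation time."""
    published = post.created_at.astimezone(timezone.utc)
    return {
        "@context": AS2_CONTEXT,
        "type": "Note",
        "id": f"{base_url}/notes/{post.id}",  # assumed URL layout
        "attributedTo": post.author_uri,
        "content": post.html,
        "published": published.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


if __name__ == "__main__":
    post = Post(1, "https://example.social/users/alice", "<p>hi</p>",
                datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
    print(to_as2_note(post, "https://example.social"))
```

Supporting another protocol would then mean adding a second mapping function next to `to_as2_note`, rather than migrating stored AS2 JSON.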