Hi Thib, welcome!
I stumbled on your blog post in my Mastodon feed, and I think you’ve hit the nail on the head, especially on high availability and the bus factor. Thank you for sharing your thoughts.
The first issue that CHATONS addresses pretty well, in my opinion, is scaling: having dozens of hosters serving different services sounds effective to me when it comes to splitting the burden of maintaining services across multiple entities, each with its own governance.
Hosting all the services required to satisfy one’s needs (let’s say email, PeerTube, Mastodon, Nextcloud and various other tools) sounds very hard when there are only a few people working on it.
With CHATONS, our organization (as a CHATONS member) can host some services depending on our capabilities and our available time (email, Nextcloud…), and when someone asks us for a service we don’t offer, we’re happy to redirect them to another hoster we trust (another CHATONS member).
The main caveat: end users need to create a separate account with each CHATONS member they use. This could be solved with inter-CHATONS SSO, but we’re not there yet.
The second issue CHATONS may address in the long term is the sharing and pooling of resources: off-site monitoring, a shared cache between Mastodon instances, off-site (encrypted) backups. Pooling these lowers the technical and economic requirements for setting up a “production-grade” environment with redundancy and backups. Software solutions and protocols (Garage, S3, or even NFS) help with this, but we still lack the global cooperation between CHATONS members needed to make it happen.
Finally, it seems like you are emphasizing high availability and downtime during updates, which is indeed a big issue for small hosters. Google and co. do have great uptime because they have the means to, which sets very high expectations for service quality.
We often tell our users that they have to accept we can’t give them the same level of availability as Google: we’ll have downtime from time to time, and if it happens unexpectedly during the night, the service may not be restored until noon the next day. And actually, they’re pretty much fine with it. They know we’re small and trying our best, and most of them seem willing to lower their expectations in order to regain control over their data.
If we were hosting critical infrastructure such as medical records, we would certainly have enough money to pay someone to cover night shifts. But that isn’t the case, and that’s okay.
I’d add another point: you wrote about the sustainability of the hoster’s software dependencies, but the hoster’s own sustainability is a big deal too. Every year, some self-hosters in CHATONS throw in the towel for lack of time and/or motivation, and ask their users to switch to another host before they shut down their services. Fortunately, some members of the collective can take over some of those services by migrating the data, as was the case when Framasoft closed half of their services.