(Disclaimer: I'm one of the authors of TernFS and while we evaluated Ceph I am n...

mgrandl · 2025-09-18T15:47:24 1758210444

There are definitely insanely large Ceph deployments. I have seen hundreds of PBs in production myself. Also your usecase sounds like something that should be quite manageable for Ceph to handle due to limited metadata activity, which tends to be the main painpoint with CephFS.

rostayob · 2025-09-18T16:11:29 1758211889

I'm not fully up to date since we looked into this a few years ago, at the time the CERN deployments of Ceph were cited as particularly large examples and they topped out at ~30PB.

Also note that when I say "single deployment" I mean that the full storage capacity is not subdivided in any way (i.e. there are no "zones" or "realms" or similar concepts). We wanted this to be the case after experiencing situations where we had significant overhead due to having to rebalance different storage buckets (albeit with a different piece of software, not Ceph).

If there are EB-scale Ceph deployments I'd love to hear more about them.

mrngm · 2025-09-19T11:54:52 1758282892

Ceph has opt-in telemetry since a couple of years. This dashboard[0] panel suggests there are about 4-5 clusters (that send telemetry) within the 32-64 PiB range.

It would be really interesting to see larger clusters join in on their telemetry as well.

[0] https://telemetry-public.ceph.com/d/ZFYuv1qWz/telemetry?orgI...

mdaniel · 2025-09-19T14:37:09 1758292629

That was interesting, thank you. Other folks may also enjoy its sibling dashboard showing the used capacity https://telemetry-public.ceph.com/d/ZFYuv1qWz/telemetry?orgI...

mgrandl · 2025-09-18T16:41:59 1758213719

There are much larger Ceph clusters, but they are enterprise owned and not really publicly talked about. Sadly I can’t share what I personally worked on.

rostayob · 2025-09-18T17:05:41 1758215141

The question is whether there are single Ceph deployments are that large. I believe Hetzner uses Ceph for its cloud offering, and that's probably very large, but I'd imagine that no single tenant is storing hundreds of PBs in it. So it's very easy to shard across many Ceph instances. In our use-case we have a single tenant which stores 100s of PBs (and soon EBs).

ttfvjktesd · 2025-09-18T19:18:35 1758223115

Digital Ocean is also using Ceph[1]. I think these cloud providers could easily have 100s of PBs Clusters at their size, but it's not public information.

Even smaller company's (< 500 employees) in today's big data collection age often have more than 1 PB of total data in their enterprise pool. Hosters like Digital Ocean hosts thousands of these companies.

I do think that Ceph will hit performance issues at that size and going into the EB range will likely require code changes.

My best guess would be that Hetzner, Digital Ocean and similar, maintain their own internal fork of Ceph and have customizations that tightly addresses their particular needs.

[1]: https://www.digitalocean.com/blog/why-we-chose-ceph-to-build...

kachapopopow · 2025-09-18T16:00:44 1758211244

Ceph is more of: here's a raw block of data, do whatever the hell you want with it, not really good for immutable data.

mgrandl · 2025-09-18T16:40:15 1758213615

Well sure you would have to enforce immutability at the client side.

kachapopopow · 2025-09-18T18:57:32 1758221852

It's more that it has all the systems to allow mutability which add a lot of overhead when used as an immutable system.

eps · 2025-09-18T18:00:57 1758218457

Last point is an extremely important advantage that is often overlooked and denigrated. But having a complex system that you know inside-out because you made it from scratch pays in gold in the long term.

_jsmh · 2025-09-18T18:23:21 1758219801

Any compression at the filesystem level?

rostayob · 2025-09-18T18:32:59 1758220379

No, we have our custom compressor as well but it's outside the filesystem.