I had to read your earlier comment a couple of times, and I think I understand! When you said object store, I was thinking something like S3, which wouldn't be sufficient for the semantics you outline.
I don't know the internals of Kafka too well, so I'm surprised it doesn't already do what you suggest. An LSM-tree-based system would be a more natural fit in my mind than fixed-size segments. I wasn't thinking of compaction at all — I had to read a bit about why Kafka compaction isn't just about deleting log entries beyond the retention horizon.
I suspect that in any Kafka-type application, one of the main challenges is balancing the storage, access performance, and consistency of "old" and "new" objects. The head of the stream tends to be where the contention is; everything else is read-only and "baked", with indexes and other mechanisms that make range searches quick, and it can be heavily cached with long TTLs.
The reason I mentioned external data storage is that if the entry data itself never mutates and only the log structure is mutable, then your log is just a sequence of metadata:
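To make that concrete, here's a rough Go sketch of what a single entry's metadata could look like (the package and field names are mine, not taken from Kafka or any existing system):

```go
package minilog

import "time"

// LogEntryMeta is all the mutable log itself would store: metadata
// pointing at an immutable payload held in an external object store.
// The fields are illustrative placeholders.
type LogEntryMeta struct {
	Offset    uint64    // position in the log, assigned at append time
	Timestamp time.Time // append time, useful for time-based range queries
	BlobKey   string    // key of the payload blob in the object store
	Size      int64     // payload size in bytes
	Checksum  uint32    // CRC32 of the payload, verified on fetch
}
```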
Then you can optimize for the structural aspect of the database — you can have a very fast data store handling the sequencing of the data and the querying of ranges and so on — and simply offload the physical storage of the log entry data to something hyperoptimized to take care of large blobs efficiently.
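As a sketch of that split, you could hide the two concerns behind two interfaces (the names are hypothetical, and LogEntryMeta is the type sketched above):

```go
package minilog

import "context"

// IndexStore is the "structural" side: it sequences entries and answers
// range queries over metadata. An embedded LSM store would fit here.
type IndexStore interface {
	Append(ctx context.Context, meta LogEntryMeta) (uint64, error)
	Range(ctx context.Context, from, to uint64) ([]LogEntryMeta, error)
}

// BlobStore is the "physical" side: it holds the immutable payloads,
// e.g. an object store like S3.
type BlobStore interface {
	Put(ctx context.Context, key string, data []byte) error
	Get(ctx context.Context, key string) ([]byte, error)
}
```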
I've not thought about what the overhead of storing many very small objects in something like S3 would be. Like if you asked for a range of log entries from one hour back and it's 1 million objects, you can certainly parallelize the fetching efficiently, but you'd also be incurring 1 million S3 requests and doing quite a lot of HTTP traffic (nothing that a local LRU cache couldn't help you with, but still).
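A bounded-parallelism fetch over the hypothetical BlobStore above might look something like this (no cache here; a local LRU in front of Get would be the obvious next step):

```go
package minilog

import (
	"context"
	"sync"
)

// fetchRange pulls the payloads for a batch of metadata entries with at
// most `parallelism` object-store requests in flight at once.
func fetchRange(ctx context.Context, blobs BlobStore, metas []LogEntryMeta, parallelism int) (map[string][]byte, error) {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		firstErr error
		out      = make(map[string][]byte, len(metas))
		sem      = make(chan struct{}, parallelism) // caps concurrent requests
	)
	for _, m := range metas {
		wg.Add(1)
		go func(m LogEntryMeta) {
			defer wg.Done()
			sem <- struct{}{} // acquire a slot
			defer func() { <-sem }()
			data, err := blobs.Get(ctx, m.BlobKey)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			out[m.BlobKey] = data
		}(m)
	}
	wg.Wait()
	return out, firstErr
}
```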
So it's probably worth merging them: you could have a two-level system where newly inserted objects live in a "fresh" store and then get slowly merged into bigger chunks that are offloaded to a "frozen" store.
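The merge step could be as simple as packing a contiguous run of fresh entries into one chunk object. Here's a sketch against the hypothetical interfaces above; atomicity and re-indexing of the merged entries are the genuinely hard parts and are skipped:

```go
package minilog

import (
	"bytes"
	"context"
	"fmt"
)

// compactOnce moves one contiguous range of entries from the "fresh"
// tier to the "frozen" tier by concatenating their payloads into a
// single chunk object. Recording each entry's offset inside the chunk
// and deleting the fresh copies are left out for brevity.
func compactOnce(ctx context.Context, fresh IndexStore, freshBlobs, frozen BlobStore, from, to uint64) error {
	metas, err := fresh.Range(ctx, from, to)
	if err != nil {
		return err
	}
	var chunk bytes.Buffer
	for _, m := range metas {
		data, err := freshBlobs.Get(ctx, m.BlobKey)
		if err != nil {
			return err
		}
		chunk.Write(data)
	}
	key := fmt.Sprintf("chunk-%020d-%020d", from, to) // zero-padded so keys sort lexically
	return frozen.Put(ctx, key, chunk.Bytes())
}
```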
I wouldn't mind sitting down and trying to build something like this in Go using Badger.
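Badger would make the "fresh" index side pretty small. Something like this (assuming the v4 import path; the key layout and metadata encoding are my own placeholders):

```go
package main

import (
	"encoding/binary"
	"log"

	badger "github.com/dgraph-io/badger/v4"
)

func main() {
	// Open (or create) the embedded store for the "fresh" tier.
	db, err := badger.Open(badger.DefaultOptions("/tmp/freshlog"))
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Append: use the big-endian offset as the key so keys sort in log order.
	offset := uint64(42)
	key := make([]byte, 8)
	binary.BigEndian.PutUint64(key, offset)
	meta := []byte(`{"blobKey":"entries/42","size":1024}`) // placeholder encoding
	if err := db.Update(func(txn *badger.Txn) error {
		return txn.Set(key, meta)
	}); err != nil {
		log.Fatal(err)
	}

	// Range scan: iterate from a starting offset in key order.
	start := make([]byte, 8) // offset 0
	err = db.View(func(txn *badger.Txn) error {
		it := txn.NewIterator(badger.DefaultIteratorOptions)
		defer it.Close()
		for it.Seek(start); it.Valid(); it.Next() {
			item := it.Item()
			if err := item.Value(func(v []byte) error {
				log.Printf("offset=%d meta=%s", binary.BigEndian.Uint64(item.Key()), v)
				return nil
			}); err != nil {
				return err
			}
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```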