
That one is called scalar evolution; LLVM abbreviates it as SCEV. The implementation is relatively complicated.
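
For a feel of what it computes (a toy Python sketch of the add-recurrence idea, nothing like LLVM's actual implementation): SCEV models an induction variable as {start,+,step} and can then answer questions in closed form instead of simulating the loop.

    # Toy add-recurrence, the core object of scalar evolution.
    # Hypothetical sketch only, not LLVM's representation.
    class AddRec:
        def __init__(self, start, step):
            self.start = start
            self.step = step

        def value_at(self, n):
            # Closed form: the value on iteration n without running the loop.
            return self.start + self.step * n

    # for (i = 0; i < n; i++) p += 4;   =>   p is {0,+,4}
    p = AddRec(start=0, step=4)
    assert p.value_at(10) == 40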

I think this works. A subset of S3's API does look like a CRDT. Metadata can go in sqlite. Compiles to a static binary easily.
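
To sketch the intuition (hypothetical Python, not anything from the project): a PUT/GET on a single key with last-writer-wins resolution behaves like an LWW-register, one of the simplest CRDTs, because the merge is commutative and idempotent.

    # Minimal last-writer-wins register: replicas converge no matter
    # what order writes are delivered in. Illustrative sketch only.
    class LWWRegister:
        def __init__(self):
            self.timestamp, self.value = 0, None

        def put(self, timestamp, value):
            # Highest timestamp wins; merging twice changes nothing.
            if timestamp > self.timestamp:
                self.timestamp, self.value = timestamp, value

    a, b = LWWRegister(), LWWRegister()
    a.put(1, "v1"); a.put(2, "v2")   # writes in order
    b.put(2, "v2"); b.put(1, "v1")   # same writes, reversed
    assert a.value == b.value == "v2"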

I've spent a mostly pleasant day seeing whether I can reasonably use garage + rclone as a replacement for NFS, and the answer appears to be yes. Not really a recommended thing to do. Garage setup was trivial, somewhat reminiscent of wireguard. Rclone setup was a nuisance: it accumulated a lot of arguments to get latency down, and I think the 1.6 in trixie is buggy.

Each node has rclone's fuse mount layer on it with garage as the backing store. Writes are slow and a bit async; debugging shows that to be wholly my fault for putting rclone in front of it. Reads are fast, whether pretending to be a filesystem or not.
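
Direct reads are just boto3 pointed at a node, since garage speaks a subset of the S3 API (endpoint and credentials below are placeholders; 3900 is garage's default S3 port, adjust to your config):

    import boto3

    # Plain S3 client against a garage node, no FUSE layer involved.
    s3 = boto3.client(
        "s3",
        endpoint_url="http://localhost:3900",   # garage S3 API endpoint
        aws_access_key_id="GK...",              # placeholder key
        aws_secret_access_key="...",            # placeholder secret
    )
    body = s3.get_object(Bucket="mybucket", Key="some/file")["Body"].read()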

Yep, I think I'm sold. There will be better use cases for this than replacing NFS. Thanks for sharing :)


A deleted comment pointed out that LICENSE.pdf is a screenshot from HashiCorp. That's pretty weird; raised an issue for it.

Issue closed! Thanks! I modified the license type to be AGPL vs. BSL.

Corrupts data on power loss according to their own docs. Like what you get outside of data centers. Not reliable then.

Losing a node is a regular occurrence, and a scenario for which Garage has been designed.

The assumption Garage makes, which is well-documented, is that of 3 replica nodes, only 1 will be in a crash-like situation at any time. With 1 crashed node, the cluster is still fully functional. With 2 crashed nodes, the cluster is unavailable until at least one additional node is recovered, but no data is lost.

In other words, Garage makes a very precise promise to its users, which is fully respected. Database corruption upon power loss falls under the definition of a "crash state", just like a node being offline due to an internet connection loss. We recommend making metadata snapshots so that recovery of a crashed node is faster and simpler, but it's not required per se: Garage can always start over from an empty database and recover data from the remaining copies in the cluster.

To talk more about concrete scenarios: if you have 3 replicas in 3 different physical locations, the assumption of at most one crashed node is pretty reasonable; it's quite unlikely that 2 of the 3 locations will be offline at the same time. Concerning data corruption on power loss, the probability of losing power at 3 distant sites at the exact same time, with the same data in the write buffers, is extremely low, so I'd say in practice it's not a problem.

Of course, this all implies a Garage cluster running with 3-way replication, which everyone should do.
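
A sketch of why one crashed replica loses nothing (my toy Python illustration of the standard 2-of-3 quorum argument, not Garage's actual code): every write must land on at least two replicas, so any two live replicas between them always hold the latest value.

    # Toy 2-of-3 quorum. Illustrative only.
    class Replica:
        def __init__(self):
            self.alive = True
            self.data = {}                     # key -> (timestamp, value)

    def write(replicas, key, ts, value):
        live = [r for r in replicas if r.alive]
        if len(live) < 2:
            raise IOError("write quorum not reached")
        for r in live:
            if ts > r.data.get(key, (0, None))[0]:
                r.data[key] = (ts, value)

    def read(replicas, key):
        live = [r for r in replicas if r.alive]
        if len(live) < 2:
            raise IOError("read quorum not reached")
        # Any two live replicas overlap with the two that stored the
        # write, so the newest timestamp among them is the latest value.
        return max(r.data.get(key, (0, None)) for r in live)[1]

    nodes = [Replica(), Replica(), Replica()]
    write(nodes, "k", ts=1, value="v1")
    nodes[0].alive = False                     # one node crashes...
    nodes[0].data.clear()                      # ...and its state is gone
    assert read(nodes, "k") == "v1"            # still served by the other two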


That is a much stronger guarantee than your documentation currently claims. One site falling over and being rebuilt without loss is great. One site losing power, corrupting the local state, then propagating that corruption to the rest of the cluster would not be fine. Different behaviours.

Fair enough, we will work on making the documentation clearer.

I think this is one where the behaviour is obvious to you but not to people first running across the project. In particular, whether power loss could do any of:

- you lose whatever writes to s3 haven't finished yet, if any

- the local node will need to repair itself a bit after rebooting

- the local node is now trashed and will have to copy all data back over

- all the nodes are now trashed and it's restore from backup time

I've been kicking the tyres for a bit and I think it's the happy case in the above, but lots of software out there completely falls apart on crashes, so it's not generally a safe assumption. The reference point for behaviour here is sqlite on zfs, which doesn't care about the power cable being pulled out; lmdb is a bit further down the list.
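
For reference, the sqlite knobs behind that kind of crash tolerance (an illustrative snippet; the file name is made up):

    import sqlite3

    # WAL journaling plus synchronous=FULL means every commit is
    # fsync'd: a power cut rolls back to the last committed
    # transaction rather than corrupting the database, assuming the
    # filesystem honours fsync (zfs does).
    conn = sqlite3.connect("meta.db")
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=FULL")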


So if you put a 3-way cluster in the same building and they lose power together, then what? Is your data toast?

If I make certain assumptions and you respect them, I will give you certain guarantees. If you don't respect them, I won't guarantee anything. I won't guarantee that your data will be toast either.

If you can't guarantee anything for all the nodes losing power at the same time, that's really bad.

If it's just the write buffer at risk, that's fine. But the chance of overlapping power loss across multiple sites isn't low enough to risk all the existing data.


I disagree that it's bad; it's a choice. You can't protect against everything. The team did the calculations and decided that the cost of protecting against this very low-probability event is not worth it. If all the nodes lose power, you may have a bigger problem than that.

Power outages across big areas are common enough.

It's downright stupid if you build a system that loses all existing data when all nodes go down uncleanly, not even simultaneously but just overlapping. What if you just happen to input a shutdown command the wrong way?

I really hope they meant to just say the write buffer gets lost.


That's why you need to go to other regions, not remain in the same area. Putting all your eggs in one basket (single area) _is_ stupid. Having a single shutdown command for the whole cluster _is_ stupid. Still accepting writes when the system is in a degraded state _is_ stupid. Don't make it sound worse than it actually is just to prove your point.

> Still accepting writes when the system is in a degraded state _is_ stupid.

Again, I'm not concerned for new writes, I'm concerned for all existing data from the previous months and years.

And getting into this situation only takes one wide outage or one bad push that takes down the cluster. Even if that's stupid, it's a common enough kind of stupid that you should never risk your data on the certainty you won't make that mistake.

You can't protect against everything, but you should definitely protect against unclean shutdown.


If it's a common enough occurrence to have _all_ your nodes down at the same time, maybe you should re-evaluate your deployment choices. The whole point of multi-node clustering is that _some_ of the nodes will always be up and running; otherwise what you're doing is useless.

Also, Garage gives you the option of automatically snapshotting the metadata, plus advice on how to do the snapshotting at the filesystem level and how to restore it.


All nodes going down doesn't have to be common to make that much data loss a terrible design. It just has to be reasonably possible. And it is. Thinking your nodes will never go down together is hubris. Admitting the risk is being realistic, not something that makes the system useless.

How do filesystem-level snapshots work if nodes might get corrupted by power loss? Booting from a snapshot looks exactly the same to a node as booting from a power loss event. Are you implying that it does always recover from power loss, and you're defending a flaw it doesn't even have?


It sounds like that's a possibility, but why on earth would you take the time to set up a 3-node cluster of object storage for reliability and ignore one of the key tenets of what makes it reliable?

Per machine. Definitely more than one machine here.

Microsoft pleading poverty doesn't really fly.

Nobody's pleading poverty here. It's a reasonable business decision to charge for value, just like the rest of the economy does.

> The docs I upload are ones I'd be OK getting leaked. That also includes code.

That's fortunate, as uploading them to an LLM was you leaking them.


"Leaking" is an unauthorised third party getting data; for any cloud data processor, data that is sent to that provider by me (OpenAI, everything stored on Google Docs, all of it), is just a counterparty, not a third party.

And it has to be unauthorised: the New York Times getting to see my ChatGPT history isn't itself a leak, because that's court-ordered and hence authorised; the >1200 "trusted partners" in GDPR popups are authorised if you give consent; etc.


Mozilla spend a lot of time telling me I trust them. I don't think that's having the effect they expect.

The paper is annoyingly difficult to locate but the author's implementation is at https://github.com/oliver-giersch/looqueue-rs

Check out the Nim loony repo; the PDF is in the papers folder.

Ah right, in the nim repo, not the author's one. Contains https://github.com/nim-works/loony/blob/main/papers/GierschE... indeed, thank you

I'm sure there was a period where one couldn't find a 3kW kettle in the UK on power efficiency grounds; one was supposed to run a 2kW one instead to save the planet. But now when I search I find 3kW models again. So either that was a nightmare of some sort or sanity has prevailed.

ChatGPT thinks this was threatened in 2010, then postponed in 2016, then cancelled, which vaguely aligns with my timeline of interest in tea.


I highly doubt that. Electric kettles are just about 100% efficient, and the only difference between a 3kW kettle and a 2kW one is how long it'll take to boil. The total energy consumed will be more-or-less the same.
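
The arithmetic, for anyone who wants it (standard specific-heat numbers, worked in Python):

    # Energy to boil a litre of water is fixed by physics; wattage
    # only changes how long it takes.
    mass_kg = 1.0                        # one litre of water
    c = 4186                             # specific heat, J/(kg*K)
    delta_t = 80                         # 20 C tap water -> 100 C
    energy_j = mass_kg * c * delta_t     # ~335 kJ regardless of power

    for power_w in (2000, 3000):
        print(power_w, "W:", round(energy_j / power_w), "s,",
              round(energy_j / 3.6e6, 3), "kWh")
    # 2000 W: 167 s, 0.093 kWh
    # 3000 W: 112 s, 0.093 kWh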

Are you perhaps conflating it with the EU regulations on vacuum cleaners coming in around 2017? As with all EU regulations, this of course resulted in a decent bunch of EU-bashing in UK media by the usual suspects - despite less-power-hungry vacuum cleaners being just as effective as the more power-hungry ones, and power consumption being inflated by manufacturers to market their vacuums, as plenty of people believed that "bigger number = more suck = more better".


Boiling at lower power uses _more_ energy to reach the same temperature. This makes a 2kW kettle on eco grounds especially dumb, yes, but that doesn't preclude people pushing for it.

The UK power grid has the EastEnders effect: the ending credits of the EastEnders soap signal a large increase in power draw from the grid, as people put the kettle on at the end of the show. The grid operators have to dispatch enough power to cover this.

While the amount of energy used to boil water at 2kW is not significantly different from 3kW (2kW has slightly more atmospheric losses, I think), there is a difference in the impact on the grid: same energy, but more generating and transmission-line capacity needed.
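
Back-of-envelope for that capacity point (round numbers of my own, not grid data):

    # Same energy per boil, very different instantaneous demand.
    kettles = 1_000_000                  # kettles switched on together
    for kw in (2, 3):
        print(kw, "kW kettles:", kettles * kw / 1e6, "GW of peak demand")
    # 2 kW: 2.0 GW peak, 3 kW: 3.0 GW peak; the energy per boil is
    # ~0.1 kWh either way, but the grid has to be sized for the peak.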


Here's a video showing an engineer at the National Grid bringing hydro-electric plants online at the closing credits of a popular soap opera, in anticipation of the millions of kettles that are about to be switched on!

https://www.youtube.com/watch?v=slDAvewWfrA


But the more powerful kettle should be slightly more efficient[0] because there is less time for heat to escape from the kettle while the water is being heated.

[0] Energy efficiency at boiling the water. A kettle is always 100% effective at making heat.


You could use a vacuum, like in a vacuum flask. In fact, to my surprise, the product already seems to exist, although its selling point is how long it keeps water hot after boiling, not its efficiency.


People are downvoting you because your story seems crazy, but you’re right (and wrong).

In the early 2010s there were reports in the anti-EU tabloid press that the EU was set to ban 3kW kettles.

The ‘plans’ were only discussions, were general (about ‘high energy appliances’, not specifically kettles), and never got beyond the initial discussion stage - according to the same press, because of fears they would drive Britons to vote for Brexit, although I'm not sure I believe that. As other commenters say, unlike other appliances that could be made more efficient, kettles are almost 100% efficient already, so the power draw doesn't really matter. I still have some faith that the authorities looking into home-appliance energy efficiency would know that.

https://hoaxes.org/weblog/comments/eu_not_banning_kettles

https://www.the-independent.com/news/uk/politics/eu-pauses-p...


A larger heating element is very slightly more efficient due to less heat escaping as the liquid is heated more quickly. Resistive electric heating is always 100% efficient no matter the size of the heating element.

Keep in mind that heat is constantly being transferred between things that are different temperatures, the faster something reaches the set point temperature, the less time there is to lose heat.

https://en.wikipedia.org/wiki/Heat_equation
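
A toy model of the effect (Newtonian cooling with a made-up loss coefficient; only the direction of the result matters):

    # Heat 1 L of water with constant power while it loses heat to the
    # room in proportion to the temperature difference.
    def energy_to_boil(power_w, k=5.0, t_amb=20.0, dt=0.1):
        temp, used = 20.0, 0.0
        heat_capacity = 4186.0               # J/K for ~1 kg of water
        while temp < 100.0:
            loss_w = k * (temp - t_amb)      # Newtonian cooling, k is invented
            temp += (power_w - loss_w) * dt / heat_capacity
            used += power_w * dt
        return used

    for power_w in (3000, 2000):
        print(power_w, "W:", round(energy_to_boil(power_w) / 1000), "kJ")
    # the 3 kW kettle comes out slightly ahead of the 2 kW one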


If we really cared about the efficiency of these devices, they'd be insulated.


There was no such thing in the UK. ChatGPT is trained to produce text that fulfills the user's expectations. If you put a prejudiced prompt in, expect a corresponding result.

