I swear the only people who care about Python types are in Hacker News comments. I've never actually worked with or met someone who cared that much about it, and the ones who care at all seem just fine with type hints.
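For anyone unfamiliar: type hints are just optional annotations that an external checker (mypy, pyright, etc.) can verify, while the runtime stays fully dynamic. A minimal sketch:

    # Annotations are ignored at runtime; a checker like mypy or pyright
    # flags mismatches before the code ever runs.
    def greet(name: str, times: int = 1) -> list[str]:
        return [f"Hello, {name}!"] * times

    greetings: list[str] = greet("Ada", 2)
    # greet(42)  # a type checker would reject this; Python itself would not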
The people we happen to work with are an incredibly biased sample of all software engineers.
As an example, almost everyone I’ve worked with in my career likes using macOS and Linux. But there are entire software-engineering subcommunities that stick to Windows. For them, macOS is a quaint toy.
If you’ve never met or worked with people who care about typing, I think that says more about your workplace and coworkers than anything. I’ve worked with plenty of engineers who consider dynamic typing to be abhorrent. Especially at places like FAANG.
Long before TypeScript, before Node.js, before even "JavaScript: The Good Parts", Google wrote their own JavaScript compiler called Closure. The compiler is written in Java. It could do many things - but as far as I can tell, the main purpose of the compiler was to add types to JavaScript. Why? Because Googlers would rather write a compiler from scratch than use a dynamically typed language. I know it was used to make the early versions of Gmail. It may still be in use to this day.
The article wasn't about the outage happening; it was about the amount of time it took to even discover what the problem was. Seems logical to assume that could be because there aren't many people left who know how all the systems connect.
> Seems logical to assume that could be because there aren't many people left who know how all the systems connect.
It's only logical presupposing a lot of other conditions, each of which is worthy of healthy skepticism. And even then, it's only a hypothesis. You need evidence to go from "this could have contributed to the problem" to "this caused the problem."
What little is given in the article seems to go strongly against this hypothesis. For example, it links to multiple past findings, going back to 2017, that Amazon's notification times need improvement. If something has been a problem for nearly a decade, it's hard to imagine it is the result of any recent personnel changes.
TFA does not establish how many AWS workers have left or been laid off, let alone how many of those were actually undesirable losses of highly skilled individuals. Even if we take it on faith that a large number of such individuals were lost, it is another bridge further to claim that no redundancy in that skillset remained, or that the resulting vacancies have been left unfilled since.
No evidence is given that a more experienced team working on the problem would have identified and resolved it faster. The article even states something to the opposite effect:
> AWS is very, very good at infrastructure. You can tell this is a true statement by the fact that a single one of their 38 regions going down (albeit a very important region!) causes this kind of attention, as opposed to it being "just another Monday outage." At AWS's scale, all of their issues are complex; this isn't going to be a simple issue that someone should have caught, just because they've already hit similar issues years ago and ironed out the kinks in their resilience story.
Indeed, the article doesn't even provide evidence that the response was unreasonably slow: no comparison to similar outages, either from AWS in the past, before the hypothesized brain drain, or from competitors. Note that the author has no idea what the problem actually was, or what AWS had to do to diagnose the issue.
It's the most plausible, fact-based guess, beating other competing theories.
Understaffing and absences would clearly lead to delayed incident response, but such obvious negligence and breach of contract is something a responsible cloud provider would have avoided by keeping what it considers adequate staff on duty.
An exceptionally challenging problem is unlikely to be enough to cause so much fumbling because, regardless of the complex mistakes behind it, a DNS misunderstanding doesn't have a particularly large "surface area" for diagnostic purposes, and it is supposed to be expeditiously resolvable by standard means (ordering clients to switch to a good DNS server and immediately use it to obtain good addresses) that AWS should have in place.
Absent organizational issues, AWS engineers going from formerly competent to currently stupid might be explained by brain damage. "RTO" might have caused collective chronic poisoning, e.g. lead in the drinking water, but I doubt Amazon is that cheap.
> An exceptionally challenging problem is unlikely to be enough to cause so much fumbling because, regardless of the complex mistakes behind it, a DNS misunderstanding doesn't have a particularly large "surface area" for diagnostic purposes, and it is supposed to be expeditiously resolvable by standard means (ordering clients to switch to a good DNS server and immediately use it to obtain good addresses) that AWS should have in place
You seem to be misunderstanding the nature of the issue.
The DNS records for DynamoDB's API disappeared. They resolve to a dynamic pool of IPs that constantly changes.
A ton of AWS services that use DynamoDB could no longer do so. Hardcoding IPs wasn't an option. Nor could clients do anything on their side.
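To make that concrete, here's a minimal sketch (Python standard library only; the endpoint name is the public regional one, nothing internal) of why hardcoding wasn't a workaround: the endpoint maps to a pool of addresses that shifts between lookups.

    # Resolve the public DynamoDB regional endpoint and print the current
    # set of IPs; run it a few times and the pool shifts as AWS rotates capacity.
    import socket

    ENDPOINT = "dynamodb.us-east-1.amazonaws.com"

    addresses = {
        info[4][0]
        for info in socket.getaddrinfo(ENDPOINT, 443, proto=socket.IPPROTO_TCP)
    }
    print(sorted(addresses))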
> a DNS misunderstanding doesn't have a particularly large "surface area" for diagnostic purposes, and it is supposed to be expeditiously resolvable by standard means (ordering clients to switch to a good DNS server and immediately use it to obtain good addresses)
Did you consider that DNS might’ve been a symptom? If the DynamoDB DNS records use a health-check, switching DNS servers will not resolve the issue and might make it worse by directing an unusually high volume of traffic at static IPs without autoscaling or fault recovery.
The article describes evidence for a concrete, straightforward organizational decay pattern that can explain a large part of this miserable failure. What's "self-serving" about such a theory?
My personal "guess" is that failing to retain knowledge and talent is only one of many components of a well-rounded crisis of bad management and bad company culture that has been eroding Amazon on more fronts than AWS reliability.
What's your theory? Conspiracy within Amazon? Formidable hostile hackers? Epic bad luck? Something even more movie-plot-like? Do you care about making sense of events in general?
We watched someone repeatedly shoot themselves in the foot a few months ago. It is indeed a guess that this may be the cause of their current foot pain, but it is a rather safe one.
Twice I've had to deal with outages where the root cause took a long time to find because there were several distinct root causes interacting in ways that made it difficult or impossible to reproduce the problem in an isolated way, or to even reason about the problem until we started figuring out that there were multiple unrelated root causes. All other outages I've dealt with were the sort where experienced engineers and institutional knowledge were sufficient to quickly find the cause and fix it.
Which is to say: it's entirely possible that the inferences drawn by TFA are just wrong. And it's also possible that TFA is wrong about this outage but still right to express concern with how Amazon manages talent.
It's about the time between the announcements about finding the cause. I find that to be thin evidence. There are far too many alternate explanations. It's not even that I find the idea to be implausible, but I don't think the article's doom-saying confidence level is warranted.
This is what I always do. Rather than go directly from the card reader or camera into Photos or Lightroom, I copy the files onto an SSD, and then bring them in from the SSD. The entire process goes faster.
I also want to point out that I've seen similar corruption in the past, only in Lightroom. The culprit ended up being hardware, not software: specifically, the card reader's USB cable. I've actually had two of these cables fail on different readers. After the most recent failure, I replaced the cable with a nicer Micro-B to USB-C cable and haven't had an issue since.
I haven't had actual corruption, but I have had imports take an excessively long time or fail to complete in Lightroom because of bad USB cables or (I think) a bad USB jack.
Generally I'm frustrated with the state of USB. Bad cables are all over the place, and I'm inclined to throw cables out if I have the slightest problem with them. My take is that the import process with Lightroom is fast and reliable if I am using good readers and good cables: it is fine importing photos from my Sony a7iv off a CFExpress card, but my Sony a7ii has always been problematic and benefits greatly from taking the memory card out and putting it in a dedicated reader; sometimes I use the second slot in the a7iv.
I use Lightroom, but always with this workflow (copy files from memory card to disk, then use LR to do the import / move / build previews).
If nothing else, it lets you get your card back much more quickly, as a file-system copy runs at ~1500 MB/s; at that rate, 50-100 GB of photos copies over in roughly a minute or less.
I also don't delete the images off the memory card until they've been backed up from the disk to some additional medium.
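If it helps, here's a rough sketch of that staging step; the paths and the checksum-before-anything-gets-deleted policy are my own assumptions, not any app's official workflow:

    # Rough sketch of the "copy to SSD first, then import" staging step.
    # CARD and STAGING are assumptions; point them at your own volumes.
    import hashlib
    import shutil
    from pathlib import Path

    CARD = Path("/Volumes/CARD/DCIM")        # assumed card mount point
    STAGING = Path("/Volumes/SSD/incoming")  # assumed staging folder on the SSD

    def sha256(path: Path) -> str:
        digest = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    STAGING.mkdir(parents=True, exist_ok=True)
    for src in sorted(p for p in CARD.rglob("*") if p.is_file()):
        dst = STAGING / src.relative_to(CARD)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy, preserving timestamps
        # Verify the copy; only wipe the card once everything checks out
        # and the staging folder has been backed up elsewhere.
        if sha256(src) != sha256(dst):
            raise RuntimeError(f"checksum mismatch: {src}")
    print("Copied and verified; now import the staging folder into Lightroom/Photos.")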
I haven't tried super fast memory cards, but with what I have, importing by copying from the card maxes out the card (~100 MB/s). Bonus points for the preview generation starting while the copy is ongoing.
I see extreme irony in being lectured about reality by a person who believes I'm shilling LLMs. It doesn't make any sense; are you sure you're not an LLM?
That defeats the whole point of this issue. These uses are fair use; they shouldn't have to license anything. You can't teach music without playing it, and YouTube is just allowing rights holders to make claims without any evidence or punishment for being wrong.
Fair use is the problem. It's too ambiguous and as a result lawyers can play the games they're playing. My solution is dirt simple, keeps everybody happy, and quits wasting time pretending we're living in 1998.
Have to agree. I've tried multiple times to replace my Verizon FiOS router with different EdgeRouters, and none of them have been able to match the gigabit speeds I get with the Verizon router. I'm not even using wifi; I just want a simple router with a firewall and port forwarding that can compare to my $12/mo one from Verizon. I troubleshot each for a week, tweaking hardware acceleration and other knobs, but they couldn't keep up. I think people don't compare and test and just assume it's just as good, but it isn't.
Weird. I got sustained symmetric gigabit speed out of an EdgeRouter Lite when it was loaded with a basic firewall and some port forwarding. At the time I purchased it, the thing cost about ten months of your ISP-provided equipment rental.
Maybe the later EdgeRouters are total trash, but the ERL could (and did) totally handle what you're describing.