OK, so the "Storing data in the network ... " title made me remember something.
If you transmit a message to Mars, say a rover command sequence, and the outgoing buffer is deleted on the sending side (the original code is preserved, but the transmission-encoded sequence doesn't stick around), then that data, for roughly 3-22 minutes (the one-way light time), exists nowhere _except_ space. It's just random-looking electrical fluctuations that are propagating through whatever is out there until it hits a conducting piece of metal millions of miles away and energizes a cap bank enough to be measured by a digital circuit and reconstructed into data.
So, if you calculate the data rate (9600 baud, even), and set up a loopback/echo transmitter on Mars, you could store ~4 MB "in space". If you're using lasers, it's >100x as much.
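For anyone who wants to sanity-check that figure, here's a minimal back-of-envelope sketch (my numbers: a 9600 bit/s link and ~22 minutes of one-way light time near maximum Earth-Mars distance; with a continuous echo, the "storage capacity" is data rate times round-trip time):

    // Back-of-envelope: how many bits are "in flight" between Earth and Mars
    // if we transmit continuously and Mars echoes everything back.
    public class InFlightStorage {
        public static void main(String[] args) {
            double bitsPerSecond = 9600;              // the 9600 baud link above
            double oneWaySeconds = 22 * 60;           // ~22 min one-way light time
            double roundTripSeconds = 2 * oneWaySeconds;
            double bitsInFlight = bitsPerSecond * roundTripSeconds;
            System.out.printf("~%.1f MB in flight%n", bitsInFlight / 8 / 1_000_000);
            // Prints ~3.2 MB -- the same ballpark as the ~4 MB figure above.
        }
    }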
To my mind the big issue with existing examples of checked exceptions is that the language to talk about the exceptions is woefully inadequate, so it stops you from writing things that would be useful while dealing correctly with exceptions. The go-to example is a map function, which we should be able to declare as throwing anything that might be thrown by its argument. Without that we need to either say that map might throw anything and then handle cases that actually can't happen, suppress/collect exceptions inside map, or suppress/collect errors inside the functions we're passing to map, all of which add boilerplate and some of which add imprecision or incorrectness. It would also be good to be able to state that a function handles some exceptions if they are thrown by its argument. And all of this should be able to be composed arbitrarily. And... somehow not be too complicated. For usability, it should probably also be possible to infer what's thrown for functions that are not part of an external API.
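For a concrete sketch of what better "throws polymorphism" buys you: Java's generics can already express a limited version of it when there's a single exception type parameter. Something like the following (the interface and method names are mine, purely for illustration) lets a map-style method declare that it throws exactly what its argument throws:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    // A function type whose checked-exception type is a type parameter.
    interface ThrowingFunction<T, R, E extends Exception> {
        R apply(T t) throws E;
    }

    class MapDemo {
        // map declares that it throws exactly what its argument throws.
        static <T, R, E extends Exception> List<R> map(
                List<T> xs, ThrowingFunction<T, R, E> f) throws E {
            List<R> out = new ArrayList<>();
            for (T x : xs) out.add(f.apply(x));
            return out;
        }

        public static void main(String[] args) throws IOException {
            // The lambda's IOException propagates through map's signature,
            // with no catch-all handling and no wrapping.
            List<Integer> lengths = map(List.of("a", "bb"), s -> {
                if (s.isEmpty()) throw new IOException("empty");
                return s.length();
            });
            System.out.println(lengths); // [1, 2]
        }
    }

It breaks down as soon as the argument can throw two unrelated checked exceptions (there's no way to say "throws whatever the argument throws" in general), which is roughly the inadequacy described above.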
It is easy to make a torrent client, but very hard to make a good torrent client. A very good, or let's say "perfect", one needs to support multiple transport protocols (TCP, "uTP" aka UDP, "WebTorrent" aka WebRTC), multiple discovery mechanisms (DHT, PEX, HTTP trackers, WebSocket trackers), multiple torrent formats (v1, v2, hybrid), should use the network optimally (max the speed without overloading the network - IIRC some clients measure average packet latency and, if it starts going up, apply some backpressure), resolve magnet URLs, set up port forwarding, reconfigure firewalls, offer an API for the *arr stack, be a good netizen (report stats correctly, send packets within the specs, do not spam - otherwise other clients will blocklist you in their code or config), implement many BEPs (mutable torrents are cool), be able to recover from interrupted state based on only the data that's on disk, have configurable downloading order (people want to start playing videos before they finish downloading, so you may want to e.g. download the header and footer of each file first), and ideally detect duplicates between torrents (cross-seeding). And then there will be people throwing 2TB+ torrents at it (e.g. TLMC) to benchmark it, and saying your client is "literally unusable" if it doesn't handle it.
So, building a "perfect" torrent client from the ground up is a daunting task. But the "good" news is that nobody built such a "perfect" client just yet, so if you have some spare months of your time, you can take a shot at it. Or even better yet, open the issue tracker for one of popular clients or libraries, and add one of the missing features from the list above.
> They require either segmented stacks, a precise GC
Which is _exactly_ what stackless coroutines and promises do in a very roundabout manner. Some languages are moving toward annotations to automate chaining, but the problem with having to explicitly annotate coroutines is that you no longer have (or can have) first-class functions; at best you now have multiple classes of functions that only interoperate seamlessly with their own kind, which is the opposite of first-class functions. Plus it's much slower than just using a traditional call stack.
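To make the "multiple classes of functions" point concrete, here's a minimal sketch in Java (which has no async/await, but whose future-returning style exhibits the same split; the names are mine):

    import java.util.concurrent.CompletableFuture;

    // Two "colors" of function. The plain one composes with ordinary code;
    // the future-returning one forces every caller to either block (join)
    // or become future-returning itself, and that split propagates upward.
    class Coloring {
        static int twice(int x) { return 2 * x; }                  // "sync" color

        static CompletableFuture<Integer> twiceAsync(int x) {      // "async" color
            return CompletableFuture.supplyAsync(() -> 2 * x);
        }

        // A caller that wants to stay non-blocking must itself change color:
        static CompletableFuture<Integer> plusOneAsync(int x) {
            return twiceAsync(x).thenApply(y -> y + 1);
        }

        public static void main(String[] args) {
            System.out.println(twice(21));               // 42
            System.out.println(plusOneAsync(21).join()); // 43, blocking at the edge
        }
    }

A stackful coroutine or a thread doesn't have this split: suspendable code looks exactly like plain code.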
Implementing transparently growable or moveable stacks can be difficult, yes. Solving C ABI FFI issues is a headache. And languages that stack-allocate variables but cannot easily move them are in a real pickle. Though, they can do as Go does and only stack-allocate variables that don't have their address taken, and in any event that only applies to C++ and Rust. There's no excuse for all the other modern languages. Languages like Python and JavaScript don't have stackful coroutines because of short-sighted implementation decisions that are now too costly to revisit. Similarly, Perl 6 doesn't officially have them because they prematurely optimized their semantics for targets like the JVM where efficient implementations were thought to be difficult. (MoarVM implements stackful coroutines to support gather/take, which shows that it was simply easier to implement the more powerful construct in order to support the less powerful gather/take construct.)
If it were easy we wouldn't have stackless coroutines at all: they're objectively inferior in every respect, while stackful coroutines, absent external constraints (being beholden to the C stack), can result in less memory usage and fewer wasted CPU cycles in both the common and edge cases. But both PUC Lua and LuaJIT do it properly and are among the fastest interpreted and JIT'd implementations, respectively, so I think the difficulty is exaggerated.
I understand why these other constructs exist, but I still think it's perverse. At some point we should just revisit and revise the underlying platform ABI so we can get to a place where implementing stackful coroutines is easier for all languages.
For example, the very same debugging information you might add to improve stack traces can be used by implementations to help, say, move objects. Make that mandatory as part of the ABI and a lot of cool things become possible, including easy reflection in compiled languages. DWARF is heavyweight and complex, but Solaris (and now FreeBSD and OpenBSD) support something called Compact C Type Format (CTF) for light-weight type descriptions, which shows how system ABIs could usefully evolve.
Newer languages shouldn't be tying themselves to incidental semantics from 40 years ago. Rather, they should be doing what hardware and software engineers were doing 40 years ago when they defined the primary semantics--properly abstract call stacks/call state into a first-class construct (i.e. thread of control), while simultaneously pushing the implementation details down so they can be performant.
There's nothing about stackless coroutines that means you can't have good stack traces. For example, C# already does it, at least to some degree.
Stackful coroutines are clearly a viable tool, but they don't work in all use cases. They require either segmented stacks, a precise GC, or memory usage comparable to kernel threads. They are tricky to implement correctly on Windows. Etc.
In my ideal world, we'd have stackless coroutines with great debugger support everywhere, with languages free to experiment with the syntax: explicit suspension points, implicit suspension points, effect polymorphism to make it look like you're yielding across nested function calls, etc...
If you can be on the ground in the path of totality, I highly recommend it. Especially go someplace with more nature. The birds and bugs will change their behavior as totality happens. Very cool experience that really tickles basic human instincts.
Believe it or not, back in the 90s we thought (on the whole) that web browsers were for browsing hypertext documents. Not for replacing the operating system. There's a reason JS started out limited to basic scripting functionality for wiring up e.g. on-click handlers and form validation. That it grew into something else is not indicative of any design fault in JS (tho it has plenty), but of the use it was shoehorned into. The browser as delivery mechanism for the types of things you're talking about is... not what Tim Berners-Lee or even Marc Andreessen had in mind?
I have very mixed feelings about WASM. There is a large... hype-and-novelty screen held up in front of it right now.
There are many Bad Things about treating the web browser as nothing more than a viewport for whatever UI designer and SWE language-of-the-week fantasy is going around. Especially when we get into things like accessibility, screen readers, etc.
As for the people treating WASM as the universal VM system outside the browser... Yeah, been down that road 30 years ago, that's what the JVM was supposed to be? But I understand that's not "cool" now, so...
I had 3rd-degree arthritis in my knee that I developed after 4 surgeries due to a football (soccer) injury.
4 years ago I could only walk with crutches due to pain.
Last month I competed in 2 half-marathons.
I was able to get much, much better by doing a lot of PRP (platelet-rich plasma) injections over the past years and a single SVF (stromal vascular fraction) injection 1.5 years ago.
My MRI shows that cartilage has grown back significantly (not fully, but I no longer have a gaping hole in it).
Ten years ago was two whole technological generations ago in the implementation of OpenJDK's GCs. OpenJDK now has a maximum pause time of under 1ms for heaps up to 16TB.
> That's a weird way to phrase "in the cooperative style, scheduling points are explicit, and in the preemptive style, the burden is on the developer to get it right".
Because it's the opposite. In the preemptive style, the mutual exclusion assumption -- i.e. the relevant correctness assumption -- is explicit, while in the cooperative style the correctness assumption is implicit and does not compose well. I.e. any addition of a scheduling point requires re-examining the implicit assumptions in all transitive callers.
> Preemptive multitasking gives the runtime more capabilities and makes it easier to implement because you can just assume the user has to deal with anything that your implicit rescheduling causes.
Preemptive multitasking is far, far harder to implement (that's why virtual threads took so long), and it provides superior composability. Cooperative scheduling is very easy to implement (often all it requires is a change to the frontend compiler, which is why languages that have no control over the backend have to choose it), but it places the burden of examining implicit assumptions on the user.
> After decades of legacy, none of that is feasible in Java of course, but let's not pretend that preemptive scheduling is objectively "preferable".
Java is no different in this case from C# or Rust. They all support threading. But after significant research we've decided to spend years on offering the better approach. It looks like C# has now realised that, while hard, it may be possible for them, too, so they're giving it a go.
If we look at true clean-slate choices, the two languages that are most identified with concurrency as their core -- Erlang and Go -- have also chosen preemptive scheduling. Languages that didn't choose it have done so either due to legacy constraints (JS), lack of control over the backend (Kotlin), complex/expensive allocation in low-level languages (C++ and Rust), or possibly because it's simply much easier (C#, I guess).
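A minimal sketch of the explicit-vs-implicit point (my example, not from either comment): with preemptive scheduling the mutual-exclusion assumption is written down as a lock; a cooperative codebase often leaves the same code bare because "nothing in here yields", an invariant that silently breaks when any transitive callee later gains a suspension point.

    import java.util.HashMap;
    import java.util.Map;

    class Withdraw {
        static final Map<String, Integer> accounts = new HashMap<>(Map.of("a", 100));

        // Preemptive style: the correctness assumption is explicit and local.
        static boolean withdraw(String id, int amount) {
            synchronized (accounts) {               // the assumption, written down
                int balance = accounts.get(id);
                if (balance < amount) return false; // check...
                accounts.put(id, balance - amount); // ...then act, atomically
                return true;
            }
        }
        // A cooperative-style codebase often drops the lock and relies on "there
        // is no scheduling point between the check and the act" -- correct today,
        // silently broken the day a transitive callee gains an await/yield.

        public static void main(String[] args) throws InterruptedException {
            Thread t1 = new Thread(() -> withdraw("a", 60));
            Thread t2 = new Thread(() -> withdraw("a", 60));
            t1.start(); t2.start();
            t1.join(); t2.join();
            System.out.println(accounts.get("a")); // 40: only one withdrawal succeeds
        }
    }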
I’m still bummed that Python took this direction. Maybe introducing new keywords into the language for event loop concurrency was Python’s way of satisfying “explicit is better than implicit”, but I can’t shake the feeling that callback passing and generator coroutines are a fad that is complex enough to occupy the imagination of a generation of programmers while offering little benefit compared to green threads.
I don't understand why modern languages use "async" to do cooperative multitasking. Maybe someone can enlighten me.
My (probably incorrect) understanding is that "async" arose from Javascript. It arose because pure event driven code is error prone and hard to get right compared to linear (stack based) code. The usual solution is threads, be they real or lightweight (aka cooperative multitasking) - but Javascript doesn't have threads and never will. Compared to pure event driven code using zillions of objects to save state, the pseudo stack based async solution is indeed a blessing.
async is in effect a poor man's emulation of lightweight threads. It comes at the cost of needing language syntax to support it ("async" and "await") and it creates different colours of code, i.e. code that can't be mixed. The end result is parallel implementations of lots of libraries, leading to the situation the article and above comment both moan about.
Lightweight threads / green threads achieve the same outcome as async, but without the downsides. No language extensions, no coloured code, all existing APIs remain backward compatible. Javascript didn't have a choice, but why any language that does have a choice would use the async solution has me completely baffled. It's not like we didn't have numerous examples such as Elixir or Go, yet Rust went with async anyway.
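For what the green-thread side looks like in practice, here's a small sketch assuming a runtime with lightweight threads (Java 21 virtual threads in this case): the code stays ordinary blocking code, existing blocking APIs keep working, and there is no second colour of function.

    import java.util.concurrent.Executors;

    class GreenThreads {
        public static void main(String[] args) {
            // 100,000 concurrent tasks, each written as plain blocking code.
            try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
                for (int i = 0; i < 100_000; i++) {
                    int id = i;
                    executor.submit(() -> {
                        Thread.sleep(1_000);   // stands in for a blocking I/O call
                        return id;
                    });
                }
            } // close() waits for the tasks; ~1 s of wall clock, not 100,000 s
            System.out.println("done");
        }
    }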
If Go with its green threads is a step down from Rust in performance, Erlang is two or three steps down from Go. If you step down your performance needs, a lot of these problems melt away.
Most programmers should indeed do that. There's no need to bring these problems on yourself if you don't actually need them. Personally I harbor a deep suspicion that a non-trivial amount of the stress in the Rust ecosystem over async and its details is coming from people who don't actually need the performance they are sacrificing for. (Obviously, there are absolutely people who do need that performance, and I am 100% not talking about them.) But it's hard to tell, because they don't exactly admit that's what they're doing if you ask, or, at least, not until many years and grey hairs later.
But in the meantime, some language needs to actually solve these problems (better than C++), and since Rust has volunteered for that role, that means that at the limit, the fact that other languages that chose to just take a performance hit don't seem to have these problems doesn't have very many applicable lessons for Rust, at least when it is being used at these maximum performance levels.
Most of the projects that dislike GitHub's review UI want the functional equivalent of `git range-diff`. Code review systems like Phabricator and Gerrit basically revolve around this as basis of thinking about diffs, their evolution, and how code review progresses.
Say you want to write 3 patches to a project, committed in series, based off of `X`:
       A ---> B ---> C
  X---/
Let's say A cleans up some code, getting it ready for B; it is not mandatory but was just naturally done as part of the change; maybe it's 50 lines. B then adds 500 lines of new code to be reviewed. Finally, C integrates the new code from B; maybe you migrate something to use it, changing a few internal API calls. So C might be a diff of only 5 or 10 lines.
The code reviewer will want to start by reading A, and leaving comments on A. And so on for B, and so on for C. Let's say they leave code review comments on each commit. So now you have a set of things to do.
GitHub encourages you to add a new set of patches on top of the previous 3, so you might publish additional commits on top:
       A ---> B ---> C ---> D ---> E ---> ...
  X---/
Where each change after C incrementally addresses review comments.
In contrast, tools like Gerrit and Phabricator say instead that you should publish new versions of A, B, and C, wholesale. So now we have a new series of 3 patches:
       A' ---> B' ---> C'
  X---/
This branch might exist in parallel to your old one. So your full commit graph might look like this, now:
       A ---> B ---> C
      /
  X---
      \
       A' ---> B' ---> C'
So there's the original series of changes A,B,C, and the new series of changes that respond to all the comments. Think of that as "version 1" of the series and then "version 2"
Now here's the question: how does the code reviewer know that you addressed their comments? Answer: they need to do a range diff between the original version 1 series and the version 2 series:
  A ------> B ------> C
  |         |         |
  d(A,A')   d(B,B')   d(C,C')
  |         |         |
  A' -----> B' -----> C'
Where d(x,y) = diff(x,y). You're looking at the diff between the two versions of one patch: instead of re-reading A' from scratch, you view the changes between version 1 of A and version 2 of A. And so on and so forth for all 3 patches.
This is very useful because, for example, B might be 500 lines, but responding to review comments may only take 50 lines of fixes. It would be very annoying to re-read the entire 500-line patch, as opposed to just the 50-line incremental diff. This has a very big effect as the review cycle goes on.
People mostly like this review style because it keeps needless "fix review comment" commits out of the history and it "localizes" the unit of code review to each individual patch rather than the whole aggregate. Note that the final version of the series A,B,C will just contain those 3 logical changes, not 3 logical changes + 1 dozen fixup changes.
This not only makes changes more "dense", it improves the ability to navigate the history, and do things like `git blame`; and it means you don't commit things like "fix failing test" which would break bisection.
Note that most of the systems that implement the above review style do not literally use `git range-diff` in their implementation; rather, git range-diff is simply an implementation of this idea that you should review each version of a patch as a diff from the previous version. The tools themselves have their own lifecycle, patch management, APIs, etc that are wholly different from Git's.
Finally, there are lots of things GitHub's UX is just slacking on, functionality aside. You can't comment "anywhere" in a code review, just on changed lines. An annoying one I hate is that you can batch review comments but not batch review resolutions; if you leave 5 comments on a diff, you submit them as a batch. But if you resolve 5 comments, you do it one at a time, which is annoying and easy to lose track of. The UX has too many tabs you have to swap between, which is pointless when you could just make the page a little longer and things like Ctrl+F would work better. And so on, and so forth.
Ah yes—as the saying goes: “keep your friends at the Bayes-optimal distance corresponding to your level of confidence in their out-of-distribution behavior, and your enemies closer”
They use Rust (I like Rust), so they have to somehow convince themselves that their mess of overly abstracted code strewn across 2000 crates is, in fact, very very fast and it's all worth it.
I think people use a language like Rust because they hear it's got potential for performance, but it's told to them like so: "Rust is blazing fast!".
So they perpetuate that idea, and just hope to god that their program, even if it's horribly cache-inefficient, allocates aggressively, context switches constantly, etc. is still fast enough that it feels fast.
Also, of course, on dev machines, which often have high-end hardware, it probably does feel very fast. "I can't even see any time between input and result, it's so fast!", said the programmer running his script on an overclocked i9-9900k with nothing else open.
To be clear, nothing about Rust or C or C++ is fast. They allow you to write faster code than most other languages, some more easily than others. All of those you can easily write the slowest code imaginable in. Try doing a bunch of very expensive copies in C# or Java - you'd have to go out of your way. In C++ it's the default behavior.
I love Rust but it is accidentally in the position of trying to serve two disparate camps of people. Developers who just want a modern ML-ish language with good tooling and some actual lessons learned from PL theory, and developers who need near-total control over the hardware but are tired of working with C & C++ and manually solving decades-old problems with memory safety. The former are a very large audience, but have to deal with requirements imposed by the latter which are irrelevant for their use case.
There are projects that try to do this, but you do have to use them.
LLVM is one. That's pretty successful.
The JVM is another. That's very successful, there are lots of languages that build on the JVM to reduce their tooling costs. Java, Kotlin and Clojure are examples of that.
GraalVM is another. If you use their Truffle framework then you get JIT compilation, debugging, tracing, instrumentation, different kinds of interpreters, advanced strings, some LSP support, language interop and FFIs and more.
But generally, what you see is that language authors tend to start by hacking together a compiler and standard library from a C hello-world program. Not that many survey the landscape and look for ways to reduce the costs.
There is no right answer on this front, and it's all about different use cases.
It comes down to I/O-bound vs CPU-bound workloads, and to how negatively things like cache evictions and lock contention might affect you. If your thing is an HTTP server talking to an external database with some mild business logic in between, and hosted on a shared virtual server, then, yeah, work-stealing and re-using threads at least intuitively makes sense (tho you should always benchmark).
If you're building a database or similar type of system where high concurrency under load with lots of context switches is going to lead to cache evictions and contention all over the place -- you're going to have a bad time. Thread per core makes immense sense. An async framework itself may not make any sense at all.
But there is no right, dogmatic answer on what "is better." Profile your application.
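For reference, both shapes are easy to spell out in, say, Java; a hedged sketch (not a recommendation, and note that standard Java can't actually pin threads to CPUs the way dedicated thread-per-core runtimes do):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    class SchedulingShapes {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();

            // Work-stealing: one shared pool, tasks migrate to idle workers.
            // A reasonable default for mixed, mildly I/O-bound request handling.
            ExecutorService shared = Executors.newWorkStealingPool(cores);

            // Thread-per-core, share-nothing: one single-threaded executor per
            // core, each owning its shard of state, avoiding cross-core
            // contention and cache-line ping-pong.
            ExecutorService[] perCore = new ExecutorService[cores];
            for (int i = 0; i < cores; i++)
                perCore[i] = Executors.newSingleThreadExecutor();

            // Route work by key so the same shard always touches the same state.
            int key = "some-partition-key".hashCode();
            perCore[Math.floorMod(key, cores)].submit(() -> { /* shard-local work */ });
            shared.submit(() -> { /* any worker may run this */ });

            shared.shutdown();
            for (ExecutorService e : perCore) e.shutdown();
        }
    }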
I've said it before, but I feel like the focus of Rust as a whole is being distorted by a massive influx of web-service type development. I remain unconvinced that Rust is the right language for that kind of work, but it seems to do ok for them, so whatever. But the kind of emphasis it puts on popular discussion of the language, and the kind of crates that get pushed to the forefront right now on the whole reflect this bias. (Which is also the bias of most people employed as SWEs on this forum, etc.)
I bet it was an architecture problem more than a language problem.
I’d be expecting to leave most of that in place and fix the hotspots that are dragging performance down. Maybe some compiled code in certain places.
Almost certainly a detailed analysis would have yielded 20 things that could be tuned.
Posts like this are almost something to be ashamed of: “we rewrote an entire subsystem because we wanted to use a language we like”. That’s really failing in your responsibility to the company.
Why does the current incarnation of Silicon Valley always have to be shady? Why can't it say 'our service is X' and then deliver service X, not some who-knows-what other agenda? A car parts store just sells car parts; it doesn't pivot to refrigerator parts just because it already has a distribution system for parts, and then tell its customers 'we have to do this in order to keep selling car parts'. Also: 'we are shifting our required-by-law in-store oil recycling to "contractors", and if they happen to just be dumping it into the river, not our problem, wasn't us, we tried' (narrator: they didn't try).
Silicon Valley deserves the hate and bad reputation they now have. They've acted in bad faith for too long while fostering an air of 'trying to make the world better'. I'm so glad they've lost their sales tax exemption and I hope they get kneecapped on every other 'advantage' they have (i.e. 'we don't have employees, we are empowering independent contractors who happen to look just like employees').
America had so much optimism for tech and now they hate it, maybe reflect for a bit on why.
There is one major problem with -- or, rather, cost to -- zero-cost abstractions: they almost always introduce accidental complexity into the programming language originating in the technical inner workings of the compiler and how it generates code (we can say they do a bad job abstracting the compiler), and more than that, they almost always make this accidental complexity viral.
There is a theoretical reason for that. The optimal performance offered by those constructs is almost always a form of specialization, AKA partial evaluation: something that is known statically by the programmer to be true is communicated to the compiler so that it can exploit that knowledge to generate optimal machine code. But that static knowledge percolates through the call stack, especially if the compiler wants to verify — usually through some form of type-checking — that the assertion about that knowledge is, indeed, true. If it is not verified, the compiler can generate incorrect code.
Here is an example from C++ (a contrived one):
Suppose we want to write a subroutine that computes a value based on two arguments:
    int do_left(int x); int do_right(int x); // helpers assumed for this contrived example
    enum kind { left, right };
    int foo(kind k, int x) { return k == left ? do_left(x) : do_right(x); }
And here are some use-sites:
    int random_int(); kind random_kind(); // more assumed helpers
    int bar(kind k) { return foo(k, random_int()); }
    int baz() { return foo(left, random_int()); }
    int qux() { return foo(random_kind(), random_int()); }
The branch on the kind in foo will represent some runtime cost that we deem to be too expensive. To make that “zero cost”, we require the kind to be known statically (and we assume that, indeed, this will be known statically in many callsites). In this contrived example, the compiler will likely inline foo into the caller and eliminate the branch when the caller is baz, and maybe in bar, too, if it is inlined into its caller, but let’s assume the case is more complicated, or we don’t trust the compiler, or that foo is in a shared library, or that foo is a virtual method, so we specialize with a "zero-cost abstraction":
    template<kind k> int foo(int x) { return k == left ? do_left(x) : do_right(x); }
This would immediately require us to change all callsites. In the case of baz we will call foo<left>; in qux we will need to introduce the runtime branch ourselves; and in bar, we will need to propagate the zero-cost abstraction up the stack by changing the signature to template<kind k> int bar(), which will employ the type system to enforce the zero-costness.
You see this pattern appear everywhere with these zero cost abstractions (e.g. async/await, although in that case it’s not strictly necessary; after all, all subroutines are compiled to state machines, as that is essential to the operation of the callstack — otherwise returning to a caller would not be possible, but this requires the compiler to know exactly how the callstack is implemented on a particular platform, and that increases implementation costs).
So a technical decision related to machine-code optimization now becomes part of the code, and in a very intrusive manner, even though the abstraction — the algorithm in foo — has not changed. This is the very definition of accidental complexity. Doing that change at all use sites in a large codebase, all to support a local change, is especially painful; it's impossible when foo, or bar, is part of a public API, as it's a breaking change -- all due to some local optimization. Even APIs become infected with accidental complexity, all thanks to zero-cost abstractions!
What is the alternative? JIT compilation! But it has its own tradeoffs... A JIT can perform much more aggressive specialization for several reasons: 1. it can specialize speculatively and deoptimize if it was wrong; 2. it can specialize across shared-library calls, as shared libraries are compiled only to the intermediate representation, prior to JITting; and 3. it relies on a size-limited dynamic code cache, which prevents the code-size explosion we'd get if we tried to specialize aggressively AOT; when the code cache fills, it can decide to deoptimize low-priority routines. The speculative optimizations performed by a JIT address the theoretical issue with specialization: a JIT can perform a specialization even if it cannot decisively prove that the information is, indeed, known statically (this is automatic partial evaluation).
A JIT will, therefore, automatically specialize on a per-use-site basis; where possible, it will elide the branch; if not, it will keep it. It will even speculate: if at one use site (after inlining) it has so far only encountered `left`, it will elide the branch, and will deoptimize if later proven wrong (it may need to introduce a guard, which, in this contrived example, will negate the cost of the branch, but in more complex cases it would be a win; also there are ways to introduce cost-free guards -- e.g. by introducing reads from special addresses that will cause segmentation faults if the guard trips, a fault which is caught; OpenJDK's HotSpot does this for some kinds of guards).
For this reason, JITs also solve the trait problem on a per-use-site basis. A callsite that in practice only ever encounters a particular implementation — a monomorphic callsite — would become cost-free (by devirtualization and inlining), and those that don’t — megamorphic callsites — won’t.
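To make that concrete, here's roughly what the earlier contrived example looks like in a JIT-compiled language (Java here), with no specialization in the source: the call stays virtual in the code, and the JIT decides per call site whether it can be devirtualized and inlined.

    // The "kind" example without source-level specialization: foo keeps a plain
    // virtual call, and the JIT specializes it per call site at runtime.
    interface Kind { int apply(int x); }
    class Left implements Kind { public int apply(int x) { return x + 1; } }
    class Right implements Kind { public int apply(int x) { return x - 1; } }

    class JitDemo {
        static int foo(Kind k, int x) { return k.apply(x); } // virtual call in source

        public static void main(String[] args) {
            Kind k = new Left();          // this call site only ever sees Left...
            long sum = 0;
            for (int i = 0; i < 5_000_000; i++) sum += foo(k, i);
            // ...so after inlining foo, HotSpot can devirtualize k.apply to
            // Left.apply behind a cheap type check (or with no check at all when
            // class hierarchy analysis sees only one loaded implementation).
            System.out.println(sum);
        }
    }

The point is the one above: neither foo's signature nor its callers had to change to get the specialized code.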
So a JIT can give us the same “cost-freedom” without changing the abstraction or introducing accidental complexity. It, therefore, allows for more general abstractions that hide, rather than expose, accidental complexity. JITs have many other benefits, such as allowing runtime debugging/tracing "at full speed", but those are for a separate discussion.
Of course, a JIT comes with its own costs. For one, those automatic optimizations, while more effective than those possible with AOT compilation, are not deterministic — we cannot be sure that the JIT would actually perform them. It adds a warmup time, which can be significant for short-lived programs. It adds RAM overhead by making the runtime more complicated. Finally, it consumes more energy.
There's a similar tradeoff for tracing GC vs. precise monitoring of ownership and lifetime (Rust uses reference-counting GC, which is generally less effective than tracing, in cases where ownership and lifetime are not statically determined), but this comment is already too long.
All of these make JITs less suitable for domains that require absolute control and better determinism (you won't get perfect determinism with Rust on most platforms due to kernel/CPU effects, and not if you rely on its refcounting GC, which is, indeed, more deterministic than tracing GC, but not entirely), or are designed to run in RAM- and/or energy-constrained environments — the precise domains that Rust targets. But in all other domains, I think that cost-free abstractions are a pretty big disadvantage compared to a good JIT -- maybe not every cost-free abstraction (some tracking of ownership/lifetime can be helpful even when there is a good tracing GC), but certainly the philosophy of always striving for them. A JIT replaces the zero-cost-abstraction philosophy with a zero-cost-use philosophy -- where the use of a very general abstraction can be made cost-free, it (usually) will be (or close to it); when it isn't -- it won't, but the programmer doesn't need to select a different abstraction for some technical, "accidental", reason. That JITs efficiently compile a much wider variety of languages than AOT compilers can also demonstrates how well they abstract the compiler (or, rather, how well the languages that employ them do).
So we have two radically different philosophies, both very well suited to their respective problem domains, and neither generally superior to the other. Moreover, it's usually easy to tell to which of these domains an application belongs. That's why I like both C++/Rust and Java.
Of course it is -- when you have continuations, which each and every imperative language has, although usually without explicit control over them. With explicit access to continuations, everything expressible with async is expressible without it. Or, more precisely, whether something is "async" or not is not a feature of a certain piece of code but of a certain execution. Essentially, it means whether some computation may want to suspend itself and resume execution later. That you must express that capability as a type-system enforced property of some syntactic element -- a subroutine -- is accidental complexity.
To get an intuitive feel for why that is so, try to imagine that the OS were able to give you threads with virtually no overhead, and you'll see how anything that's expressible with async would be expressible without it. Over the years we've so internalized the fact that threads are expensive that we forgot it's an implementation detail that could actually be fixed.
Computations in Rust can suspend themselves without declaring themselves to be async: that's exactly what they do when they perform blocking IO. They only need to declare themselves async when they want a particular treatment from the Rust compiler and not use the continuation service offered by the runtime, which in Rust's case is always the OS.
I've worked for both Facebook and Google so can make informed comments on this, with two exceptions: Buck2 came after I left, and I'm honestly not sure what Sapling is. Is it some Mercurial-like re-implementation, a bit like how Google's Piper is a re-implementation of Perforce?
The tl;dr is that Google's developer tooling and infrastructure is superior in almost every way. Examples:
- When I started at FB we used Nuclide, an internal fork of the Atom editor. While I was there it was replaced by VS Code. It's better, but honestly they should've built their tooling off of JetBrains products. JetBrains makes IDEs. VS Code is a text editor like vim or emacs. There's a massive difference;
- Buck should've been killed and replaced by Bazel. I can't speak to Buck2 but this seems like a pointless investment;
- Thrift should be killed and replaced with gRPC/Protobuf. Same deal;
- FB's code search is just grep. It's literally called BigGrep. Grep can get you pretty far but it's just not the same as something with semantic understanding. Google has codesearch, which does understand code, and it's miles ahead. This has all sorts of weird side effects too, like Hack code at FB can't use namespaces or type aliasing because then grep wouldn't be able to find it. When there were name conflicts you'd sometimes be forced to rename something to get something to compile;
- Tupperware (FB's container system) is a pale shadow of Borg;
- Pushing www code at FB is a very good experience overall. You commit something and it'll get pushed to production possibly within an hour or two or, at busier times, it might take until the next day. This requires no release process or manual build. It's basically automatic; Google's build and release process tends to be way more onerous;
- The big Achilles heel in FB's www code is that it is one giant binary. There's no dependency declaration at all. This means there's an automatic system to detect if your change affects other things, and that process often fails. This leads to trunk getting broken. A lot.
- Because of the above problem there is a system to determine what tests to run for a given commit. This is partially about what the affected components are, but also because longer-running tests aren't run on commit, and often those tests would've found the problem. There is no way to say "if this file is modified, run this test". That's a huge problem;
- FB has a consistent system for running experiments and having features behind flags (ie gatekeeper). This wasn't the case when I was at Google. It may well have changed;
- Creating a UI for an internal tool or a new page is incredibly easy at FB. There are standard components with the correct styling for everything. If you want to write an internal tool, you can start at 9am and have it in production by noon if it's not terribly complicated;
- The build system for C++ at FB is, well, trash. For Buck (and Bazel), the build system creates a DAG of the build artifacts to decide what to build. FB C++ might take 2 minutes just to load the DAG before it builds anything. This is essentially instant at Google because a lot of infrastructure has been built to solve this problem. This is a combination of SrcFS and ObjFS. Incremental builds at FB to run tests don't really work as a workflow;
- All non-www builds at FB are local builds. Nothing at Google (on Google3 at least) is built locally, including mobile apps. This is way faster because of build artifact caching and beefier build machines.
- There tend to be fewer choices as to what to use for FB code (eg storage systems). I consider this largely a good thing. You will typically find 5 different ways of doing anything at Google and then need to consider why. You will often find different teams solving the same problem in slightly different ways or even the exact same way.
- There are people at FB who work on system-wide refactors (eg Web security, storage). These people can often only commit their diffs, which might touch thousands of files, on weekends.
- A lot of generated code is committed at FB that isn't at Google. This exacerbates the previous problem. FB has a ton of partially and completely generated files, which means a change to the generating code has a massive effect. At Google, for example, the protobuf generated code is generated at build time and isn't in the repo.
There's probably more but that's what comes to mind.
> Yup. A lot of the consumer protection regulations are being totally ignored with impunity. It’s nice to see one retailer held to account. But I don’t think it foretells a larger trend.
I'm Australian. I saw the headline, and my first thought was "I wonder if this is some US retailer playing fast and loose with the truth in Australia". And yup, it was.
Australia is not the USA. You don't ignore consumer laws in Australia with impunity like you can in the USA.
The best example I've seen was HP not honouring their warranties on laptops. The ACCC (Australian Competition and Consumer Commission - a government agency) forced hp.com to put a 1/2-page banner at the top of every web page saying they didn't honour warranties, and spelling out the details of what they did. It was in your face, it was ugly, and it was a startling sight for any hp.com visitor.
I've only seen this happen twice; the other time was a local retailer called kogan.com. That's probably because, after that demonstration of what the government will do to companies that don't comply with the law (I'm sure it hurt more than any fine), there is now near 100% compliance between what companies doing business in Australia say they will do on their web sites and what they actually do.
Another classic example: Australian ISPs are forced to advertise what your minimum download speed at peak hour will be. And it's enforced. Can you believe that? There is a place in the world where download speeds are what they say on the box, and "unlimited downloads" really means "download as much as you like".
Australia is a living demonstration that you don't have to live in a post-truth society. It is possible to pass laws that say you have to spell out, in straightforward, non-misleading terms on the box, what is in the box. And it's possible to enforce them in imaginative ways that are remarkably effective, yet don't involve fines, so you don't get companies evaluating the tradeoff of paying the fine versus lying and keeping the loot. And yes, the end result makes life simpler and fairer for everyone - including the retailers.
I'm going to do a very poor attempt at explaining my feelings over this. For a long time I've idolized Teddy Roosevelt. Parts of his story just speak to me:
* He was sickly but just worked very hard to compensate and overcome it to live an outdoors-centric life. I suffer from some undiagnosable(word?) arthritis, I work hard to still do my favorite activities (backpacking, hunting and fishing).
* He was extremely well learned, thoughtful, and articulate. He kept a detailed journal. I admire those traits and try to emulate them. I pursued a PhD, I keep a journal, I frequently write to relatives and friends. I try to be thoughtful and articulate.
* He was a legendary hunter (as I mentioned I love hunting), and conservationist. He is partly responsible for the National Parks, which I enjoy every free day (I live near Yellowstone and routinely pass under his arch).
* He was empathetic and progressive for his time [0], despite his unshakeable nationalism. He didn't seem to fall for toxic tropes of conservatism (although those political terms don't translate to today, Teddy is universally celebrated by the right). I believe strongly in what I consider to be core unshakeable tenets of American national identity, but I also try hard to be empathetic and make sure no one is left behind. I don't believe the answer is exclusion of outgroups, which I think is a common conservative trope today. I have a lot of respect for Teddy (with some important caveats) [1].
* He held himself, and those in government to a higher standard [2]. He believed he was a public servant, and fought tirelessly against corruption. There is nothing that makes me angrier than corruption and entitlement in political office today.
I also learned, a couple years ago, that he probably would have despised me, had our paths crossed. I'm mostly Italian-American, a group he openly viewed as mostly criminals [3]; he even went so far as to applaud a lynching of Italian Americans in New Orleans [4].
I think it's important that I learned this. I'm glad it wasn't erased or ignored or left out of the record. I learned a few things from it. I learned (well, I already knew, but it was reinforced) that no one is perfect. No one. And that perfection is the enemy of self-improvement, it is the enemy of progress, and it is the enemy of good. There's no part of me that I'm aware of that's worse for having held TR in such high esteem and tried to emulate the great things I found about him. In fact, seeing him as an example has improved my life in a variety of ways, particularly my mental health and fortitude. Sure, it made me sad to learn that fact about him, but in terms of material effect? It changed nothing. Words, thoughts, opinions, ideas, are necessarily flawed and incomplete as a direct result of the human condition. Pretending they are not creates two things:
1) An impossible standard
2) A false sense of security and satisfaction
I'm better at some things than TR. I don't hate Italian Americans (or any minority group, for that matter; my understanding is that he had similar feelings about natives). That alone is a major victory. By not having to measure myself against perfect examples, I can foster some feeling of accomplishment, but I can also learn that my idols were flawed and that even if I am flawed in many ways, I can hopefully still do some good. That's an important motivator.
A perfect example is one I will never live up to, and in that case, why even try?
There are two major Indian classical music traditions, Hindustani (North Indian) and Carnatic (South Indian). The link is focused on the former. I'd like to add some (general, not just rhythmic) notes on the latter.
1. Both derive from the Sāmaveda, and took their modern forms beginning from the 15th century CE. What would eventually become Hindustani music took on Persian and Arabic influences, whereas Carnatic music—as the name suggests—is mostly limited to South India and therefore relatively insulated from the cultural changes in the North, and is considered closer to ancient Indian music.
2. Perhaps the most important difference between Indian and Western classical music is that the former is relative, not absolute. The main artiste picks a comfortable śruti, or tonic: males typically choose B to D, solo instrumentalists typically tune their instruments to D - F, and female artistes choose F# to A (and an octave higher). This is just a rule of thumb; some instrumentalists (flautists, for instance) prefer C. Note that I refer to only the note; the precise frequency/octave is irrelevant insofar as defining śruti is concerned. Once the main artiste has selected their śruti, all other accompanying instruments are tuned to the same. This is why ICM notation is solfège-based, staff notation does not work for ICM, and why Indian classical musicians have a taṃbūrā (could be electronic, too) on stage.
3. The ICM solfège is called swara. There are seven of these, called sa, ri, ga, ma, pa, dha, ni, just like in the West (do-re-mi-fa-so-la-ti). Similarly, they expand to twelve when accounting for semitones.
4. The basic melodic framework for ICM is the rāga; loosely put, this defines a certain scale that a composition, or parts of a composition, or some improvisation is set to. rāgas also define the gamakas, or ornaments for their scales; this is fundamental to ICM. There exist notationally equal scales with and without gamaka, but they are considered completely different rāgas.
5. There are loose analogues between ICM rāgas and Western modes. For instance, the Ionian mode is equivalent to Śankarābharaṇaṃ in Carnatic music, or Bilāval in Hindustani music.
6. Carnatic music rāga classification has an interesting tree structure. There are 72 'roots', called mēḷakartā, which have a complete scale of 7 notes ascending, and 7 notes descending. Delete some notes, or reorder them, or introduce one or more notes from another scale, and you get a veritable sea of janya rāgas.
7. Carnatic music has a few classifications for its rhythmic patterns. The suladi sapta tāla system is most commonly taught to beginners. In this, there are three angas (or parts): laghu, dhrutam, and anudhrutam. laghus can have 3, 4, 5, 7, or 9 beats (or akṣara), called jātis. There are seven arrangements (ergo, sapta) of angas under this system, and with five jātis, there emerge 35 tālas with different akṣara sums for each. The typical 8-akṣara rhythm (again, similar to Western music) is called ādi tāla, or catusra jāti triputa tāla.
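Since the arithmetic is easy to get lost in, here's a small sketch of the 7 × 5 = 35 counts (the anga structures are the standard ones, but treat the spellings and the code itself as mine): laghu lasts jāti beats, dhrutam 2, anudhrutam 1, and triputa with catusra jāti gives 4 + 2 + 2 = 8, i.e. ādi tāla.

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Sketch of the suladi sapta tala arithmetic described above.
    // Angas: L = laghu (jati beats), O = dhrutam (2), U = anudhrutam (1).
    class SaptaTala {
        public static void main(String[] args) {
            Map<String, String> talas = new LinkedHashMap<>();
            talas.put("dhruva",  "LOLL");
            talas.put("matya",   "LOL");
            talas.put("rupaka",  "OL");
            talas.put("jhampa",  "LUO");
            talas.put("triputa", "LOO");
            talas.put("ata",     "LLOO");
            talas.put("eka",     "L");
            int[] jatis = {3, 4, 5, 7, 9};  // tisra, catusra, khanda, misra, sankirna

            for (Map.Entry<String, String> t : talas.entrySet()) {
                StringBuilder row = new StringBuilder(t.getKey() + ":");
                for (int jati : jatis) row.append(" ").append(aksharas(t.getValue(), jati));
                System.out.println(row);  // 7 talas x 5 jatis = 35 akshara counts
            }
            // triputa with catusra jati (laghu = 4): 4 + 2 + 2 = 8 = adi tala
        }

        static int aksharas(String angas, int jati) {
            int total = 0;
            for (char a : angas.toCharArray())
                total += (a == 'L') ? jati : (a == 'O') ? 2 : 1;
            return total;
        }
    }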
My wife and I have a beautiful daughter about to turn 2. But she wouldn't have existed if it weren't for IVF (in vitro fertilization). Had to stab her stomach with needles for a while to deliver the hormones that make the body mass-produce eggs, and then they took my sperm to put into her eggs and see which ones took. Five viable pairings happened. Three were put into her and out came the one kid. The other two are in cryo for future use, and hopefully we can try them too one day soon. I was hesitant about IVF when I was younger, but now I recommend it to anyone who's having difficulty conceiving. The doctor showed us my sperm in a petri dish. They were tiny in number and sluggish compared to a video the doctor showed of what healthier sperm look like. Whatever lifestyle changes I'd need to make to improve my sperm, I'm not sure they could be done overnight, and my wife was 38 at the time of conception.
Kid is beautiful, stubborn, independently-minded, cute, all of it. Would not trade her for the world.
If you're having difficulty conceiving, please consider IVF. We tried for 3 years the natural way with no dice.
My partner and I recently embarked on a journey to replace as many products in our lives that contain endocrine-disrupting chemicals (EDCs) as we could. It's _really_ hard.
You have to carefully read product labels; 99.9% of products don't contain any info about EDCs, so you have to specially seek out ones that are labeled as phthalate-free, BPA-free, etc.
The biggest offenders are in the kitchen and bathroom. Shampoo, conditioner, deodorant, and perfume are the big ones. There are special versions of these products you can find, but they don't usually work as well. Most things that are scented have phthalates. Pretty much any food that comes in a flexible plastic container has them, and most dairy and eggs do as well (dairy because of the flexible plastic tubing used when pumping milk).