What is the difference between code that blocks waiting for I/O and code that performs a lengthy computation? To the runtime or scheduler, these are very different. But to the caller, maybe it does not matter why the code takes a long time to return, only that it does.
Async only solves one of these two cases.
I’d like to draw an analogy here to ⊥ “bottom” in Haskell. It’s used to represent a computation that does not return a value. Why doesn’t it return a value? Maybe because it throws an exception (and bubbles up the stack), or maybe because it’s in an infinite loop, or maybe it’s just in a very long computation that doesn’t terminate by the time the user gets frustrated and interrupts the program. From a certain perspective, sometimes you don’t care why ⊥ doesn’t return, you just care that it doesn’t return.
Same is often true for blocking calls. You often don’t care whether a call is slow because of I/O or whether it is slow because of a long-running computation. Often, you just care whether it is slow or how slow it is.
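A minimal Python sketch of this point (both functions are hypothetical stand-ins): one is slow because it waits, the other because it computes, but the caller can observe only the result and the elapsed time, not the reason.

```python
import time

def slow_io() -> int:
    """Stands in for a blocking I/O call (hypothetical)."""
    time.sleep(0.2)           # waiting on the scheduler, not burning CPU
    return 42

def slow_compute() -> int:
    """Stands in for a lengthy computation (hypothetical)."""
    total = 0
    for i in range(200_000):  # busy loop: burning CPU the whole time
        total += i
    return 42

def timed(fn):
    """Everything the caller can observe: the result and the elapsed time."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

# From the caller's seat both are simply "slow"; the *why* is invisible.
print(timed(slow_io), timed(slow_compute))
```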
(And obviously, sometimes you do care about the difference. I just think that the "blocking code is a leaky abstraction" argument is irreparably faulty.)
> To the caller of that specific function, nothing.
And that's what makes async code, not the blocking code, a leaky abstraction. Because abstraction, after all, is about distracting oneself from the irrelevant details.
That isn't my understanding of a leaky abstraction. An abstraction leaks when it's supposed to always behave in some certain, predictable way, but in practice, sometimes it doesn't. When does an async function not behave the way it's supposed to?
My understanding of a leaky abstraction is that the abstraction itself leaks out details of its design or underlying implementation and requires you to understand them. What you seem to describe is a bug, edge case, or maybe undefined behaviour?
For example, an ORM is a leaky abstraction over an RDBMS and SQL because you inevitably have to know details about your RDBMS and specific SQL dialect to work around shortcomings in the ORM, and also to understand how a query might perform (e.g. will it be a join or an N+1?).
I don't really think that the async or blockingness is the leak, but that the time taken to process is not defined in either case, and you can leak failure criteria either way by not holding to that same time.
People can build against your async process finishing in 10 ms, but if it suddenly takes 1 s, things fail.
It's better to think of "async" as indicating that some code will do something that blocks, and that we're letting our process manage the blocking (via Futures) instead of the operating system (via a context switch mid-thread).
I would argue a few things:
First: You need to be aware, in your program, of when you need to get data outside of your process. This is, fundamentally, a blocking operation. If your minor refactor / bugfix means that you need to add "async" a long way up the stack, does this mean that you goofed on assuming that some kind of routine could work only with data in RAM?
Instead: A non-async function should be something that you are confident will only work with the data that you have in RAM, or only perform CPU-bound operations. Any time you're writing a function that could get data from out of process, make it async.
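A sketch of that rule in Python (function names invented for illustration): parsing stays synchronous because it only touches data in RAM, while loading is async because it reaches outside the process.

```python
import asyncio

def parse_config(text: str) -> dict:
    """In-RAM, CPU-only work: stays synchronous."""
    return dict(line.split("=", 1) for line in text.splitlines() if "=" in line)

async def load_config(path: str) -> dict:
    """Reaches outside the process (disk), so it is async; the blocking
    read is pushed off the event loop with asyncio.to_thread."""
    def read() -> str:
        with open(path) as f:
            return f.read()
    return parse_config(await asyncio.to_thread(read))
```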
I want to restate what you’ve said from a different perspective:
If you write a lot of pure functions, in a Functional Core, Imperative Shell manner, then only the imperative part has to deal with any of the async parts. Yes it’s the topmost part of the code, but there is nothing to “infect” with it which isn’t already destined to be.
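Sketching this in Python, with a hypothetical fetch_orders standing in for a real network call: the pure core never sees a Future, and only the thin shell at the top is async.

```python
import asyncio

# Functional core: pure, synchronous, trivially testable.
def summarize(orders: list) -> dict:
    total = sum(o["amount"] for o in orders)
    return {"count": len(orders), "total": total}

# Imperative shell: the only async layer, because it is the only layer
# that touches the outside world. fetch_orders is a made-up stand-in.
async def fetch_orders() -> list:
    await asyncio.sleep(0)            # pretend this awaited real I/O
    return [{"amount": 10}, {"amount": 32}]

async def report() -> dict:
    orders = await fetch_orders()     # async stops here, at the shell
    return summarize(orders)          # the core never sees a Future
```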
It’s when you try to write imperative code like there are no consequences for doing so that the consequences show up en masse and can be confused as symptoms of entirely different problems.
In many application use cases, that is an implementation detail that should not be a concern of higher levels. The ramifications may not even be known by the people at higher levels.
Take something very common like cryptographic hashing: if you use something like Node.js, you really don't want to block the main thread calculating an expensive bcrypt hash. It also meets all of your requirements: the data doesn't come from outside RAM, and it is very CPU-bound.
Obviously, if you are directly calling this hashing algorithm you should know; however, the introduction of a need to hash is completely unpredictable.
> Take something very common like cryptographic hashing: if you use something like Node.js, you really don't want to block the main thread calculating an expensive bcrypt hash. It also meets all of your requirements: the data doesn't come from outside RAM, and it is very CPU-bound.
I guess I implied quick operations when I said "CPU bound" (i.e., calculating a square root, string manipulation, (de)serialization...).
I haven't done hashing in Node.js. I assume that the bcrypt API is async and calls into a native library?
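For illustration, a Python analogue of the bcrypt situation (using the stdlib's PBKDF2 as a stand-in for bcrypt; names are made up): the CPU-bound hash is pushed off the event loop's thread, which is roughly what an async native bcrypt binding does with its thread pool.

```python
import asyncio
import hashlib
import os

def hash_password(password: bytes, salt: bytes) -> bytes:
    """CPU-bound key stretching; stdlib PBKDF2 stands in for bcrypt."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

async def register(password: bytes) -> bytes:
    """Run the heavy hash in a worker thread so the event loop stays
    responsive (subject to the GIL; a process pool is the safer route
    for pure-Python CPU work)."""
    salt = os.urandom(16)
    return await asyncio.to_thread(hash_password, password, salt)
```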
You could make the proposition that sequential code is inherently asynchronous in modern operating systems, because the kernel inherently abstracts the handling of blocking/unblocking your process.
Yes and no. From a practical point of view, considering sync code as async is useless: sync and async code have very different usability characteristics, and it is useful to have different names for the two domains.
On the other hand, from a more abstract, theoretical level, it is important to know that there is almost always an event loop somewhere deep in the stack; sync code is just sugar over async, and you can always transform one into the other.
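A small Python illustration of that transformation running in both directions (fetch and compute are placeholders): an async call can be hidden behind a blocking wrapper, and a blocking call can be wrapped into something awaitable.

```python
import asyncio

async def fetch() -> str:
    await asyncio.sleep(0)       # stand-in for real async I/O
    return "payload"

def fetch_sync() -> str:
    """Async -> sync: hide the event loop behind an ordinary blocking call."""
    return asyncio.run(fetch())

def compute() -> int:
    return 21 * 2                # stand-in for blocking/CPU-bound work

async def compute_async() -> int:
    """Sync -> async: push the blocking work onto a thread the loop can await."""
    return await asyncio.to_thread(compute)
```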
Yes and no. The concept being communicated is that your program inherently is asynchronous; however, the context switching and blocking/unblocking are abstracted away from you so you can focus on what is important to you, which is solving your problem. Once you communicate that concept, it becomes a lot easier for people to form new mental models that better reflect reality. Then they can structure their programs to take advantage of it.
I would say that's a leaky abstraction. (The OS hiding the details of async by blocking your threads and context switching, although this is how many programs operate.)
The problem comes when you have a critical section where you need to hold a mutex (lock): if you're working with "proper" async code, you can safely call any non-async code from within the lock. (C# enforces this, because a lock statement cannot contain the await keyword.)
When any method you call can block on I/O, synchronous programming (i.e., using the OS to provide the asynchrony) can leak and make you hold your mutex longer than you should.
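A Python sketch of that hazard (names invented; Python, unlike C#, enforces nothing here): a synchronous call that secretly blocks stretches the critical section and stalls the whole event loop for its duration.

```python
import asyncio
import time

log: list = []

def quick_update() -> None:
    log.append("updated")          # in-RAM work: fine under the lock

def sneaky_blocking_call() -> None:
    time.sleep(0.05)               # looks synchronous, secretly blocks on "I/O"
    log.append("slow update")

async def good(lock: asyncio.Lock) -> None:
    async with lock:
        quick_update()             # lock held only for the CPU work

async def leaky(lock: asyncio.Lock) -> None:
    async with lock:
        sneaky_blocking_call()     # lock *and* the whole event loop are
                                   # held for the full blocking call

async def main() -> None:
    lock = asyncio.Lock()
    await good(lock)
    await leaky(lock)

asyncio.run(main())
```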
---
That being said, many of us build our careers programming threaded code that relies on the OS to block. So at this point we're splitting hairs.
When most people talk about async code, it is in the context of single-threaded applications (web browsers, Node.js, etc.), hence the need not to block the event executor. On the other end of the spectrum, when people talk about mutexes (locks), they are talking about multiple threads working in parallel, where you inherently need to consider whether the code you are calling will directly or indirectly acquire the same lock, or, as you pointed out, call an API that blocks. In a single-threaded environment, the only cognitive load you need to worry about is one or two yield points; in a parallel system, your main concerns are mutexes, race conditions, and deadlocks.