
> I am curious as to what a principled definition of a "blocking" function would be.

It's one where the OS puts your (kernel) thread/task to sleep and then (probably) context switches to another thread/task (possibly of another process), before eventually resuming execution of yours after some condition has been satisfied (could be I/O of some kind, could be a deliberate wait, could be several other things).
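As a rough illustration of that definition, here is a minimal sketch in Python, assuming a pipe as the "condition" and a helper thread that writes to it after a delay. The name `timed_blocking_read` and the 0.2 s delay are made up for the example. The reading thread is put to sleep inside `os.read` until the byte arrives.

```python
import os
import threading
import time


def timed_blocking_read(delay_s: float = 0.2) -> tuple[bytes, float]:
    """Read one byte from an empty pipe and report how long the read took."""
    read_fd, write_fd = os.pipe()

    def writer() -> None:
        # Satisfy the reader's condition only after a delay.
        time.sleep(delay_s)
        os.write(write_fd, b"x")

    helper = threading.Thread(target=writer)
    start = time.monotonic()
    helper.start()
    # The pipe is empty, so this call sleeps until the writer delivers data.
    data = os.read(read_fd, 1)
    elapsed = time.monotonic() - start
    helper.join()
    os.close(read_fd)
    os.close(write_fd)
    return data, elapsed


data, elapsed = timed_blocking_read()
print(f"read {data!r} after roughly {elapsed:.2f}s spent waiting")
```

The time spent in `os.read` is almost entirely sleep, not CPU work, which is the property the definition is trying to capture.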



OS threads can be put to sleep for many reasons, or they can be preempted for no explicit reason at all.

One such reason could be accessing memory that is currently not paged in. This could be memory in the text section, i.e. memory mapped from the executable you are running.

I doubt that you would want to include such context switches in your "blocking" definition, as it makes every function blocking, rendering the taxonomy useless.


That seems a necessary but not sufficient condition, since a pre-emptively multitasking OS may do this after almost any instruction.

Not only that, but any OS with virtual memory will almost certainly context-switch on a hard page fault (and perhaps even on a soft page fault, I don't know). So it would seem that reading memory is sufficient to be "blocking", by your criterion.
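For anyone who wants to see page faults happen from plain memory access, here is a small sketch using Python's `resource` module (Unix only). The 64 MiB buffer size is an arbitrary assumption, and the counts you get depend entirely on the OS and allocator, so treat the numbers as indicative only.

```python
import resource


def fault_counts() -> tuple[int, int]:
    """Return (minor, major) page-fault counts for this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt, usage.ru_majflt


PAGE = resource.getpagesize()

minor_before, major_before = fault_counts()

# Allocate a large buffer and write one byte per page. No syscall is made
# explicitly here, yet the kernel may have to step in to back each page.
buf = bytearray(64 * 1024 * 1024)
for offset in range(0, len(buf), PAGE):
    buf[offset] = 1

minor_after, major_after = fault_counts()
minor_delta = minor_after - minor_before
major_delta = major_after - major_before
print(f"minor faults: +{minor_delta}, major faults: +{major_delta}")
```

Minor (soft) faults are resolved without disk I/O. Major (hard) faults are the ones that wait on storage, and those are the case where a memory access behaves most like a blocking call.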


1) I deliberately left it open to including page faults. If you do not understand that reading memory might lead to your process blocking, you have much to learn as a programmer. However, I would concede that this is not really part of the normal meaning of "blocking", despite the result being very similar.

2) I did not intend to include preemption. I should reword it to make it clear that the "condition" must be something of the process' own making (e.g. reading from a file or a socket) rather than simply an arbitrary decision by the kernel.


Would signaling, say, a semaphore count? That could cause another higher-priority realtime process to wake up and block the signaler forever.

I think the canonical meaning of "blocking code" is more of the "I know it when I see it" variety than a useful and rigorous definition.


This feels mostly right to me. I think that you get into interesting things in the margins (is a memory read blocking? No, except when it is because it's reading from a memory-mapped file!) that make this definition not 100% production ready.
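To make the memory-mapped case concrete, here is a minimal sketch, assuming a throwaway temp file and a made-up helper name `first_byte_via_mmap`. The interesting part is that `m[0]` looks like an ordinary index operation, but if the page is not resident the kernel has to fetch it from the file before the expression can finish.

```python
import mmap
import os
import tempfile


def first_byte_via_mmap(path: str) -> int:
    """Map a file read-only and return its first byte via a plain index."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # No read() call here: this is a memory access that may
            # trigger file I/O through a page fault.
            return m[0]


# Create a small file to map (hypothetical test data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"A" * 4096)
    path = tmp.name

value = first_byte_via_mmap(path)
os.remove(path)
print(value)
```

In this sketch the file was just written, so the page is almost certainly still in the page cache and nothing waits on disk. Whether the access stalls depends on state the caller cannot see, which is what makes it hard to annotate.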

But ultimately if everyone in the stack agrees enough on a definition of blocking, then you can apply annotations and have those propagate.


> except when it is because it's reading from a memory-mapped file

Where "memory mapped file" includes your program executable. Or any memory if you have swap space available.

And any operation can be "blocking" if your thread is preempted, which can happen at basically any point.

So yes, everything is blocking. It is just shades of grey.


But this isn’t hewing to the definition brought up by GP. “I will now, of my own accord, sleep so the OS scheduler takes over” is fairly precise. And it’s different both from just doing an operation that takes time and from the program doing an operation that, incidentally, the OS sees and then forces a sleep for due to some OS-internal abstraction.

But if you think about this too much, you can easily get to “well now none of my code is blocking”, because of how stuff is implemented. Or, more precisely, “exactly one function blocks, and that function is select()” or something.
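A sketch of that "only select() blocks" style, assuming a local socket pair stands in for real network peers: the sockets are set non-blocking, so `recv` on an empty socket fails immediately instead of sleeping, and the only place the thread is allowed to wait is the `select.select` call.

```python
import select
import socket

a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

# With nothing to read, a non-blocking recv raises instead of sleeping.
try:
    a.recv(1)
    got_would_block = False
except BlockingIOError:
    got_would_block = True

b.send(b"hi")

# The one deliberate wait: sleep until `a` is readable or 1 second passes.
readable, _, _ = select.select([a], [], [], 1.0)
msg = a.recv(2) if readable else b""

a.close()
b.close()
print(got_would_block, msg)
```

Event loops are built around this shape: every I/O call returns immediately, and a single multiplexing call decides when the thread sleeps.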


For reference, there are 2 lists of such functions in signal(7).

The first list is for syscalls that obey `SA_RESTART` - generally, these are operations on single objects - read, lock, etc.

The second list is for syscalls that don't restart - generally, these look for an event on any of a set of objects - pause, select, etc.



