This is really part of the intro, but I wonder if someone knows anyway: My very ...

Diggsey · on March 10, 2022

Nobody's mentioned what is IMO the main benefit of an intrusive data-structure: items within the data-structure can belong to multiple containers.

This allows you to locate an item using more than one criteria: for example, you could store blocks of free memory in two intrusive lists, one sorted by size and one by location. This would allow you to first locate eg. the largest memory block, and then quickly locate blocks adjacent to that one in memory.

With non-intrusive data-structures this is not possible, because while you can go from an iterator -> the item pointed to by the iterator, there is no way to efficiently do the reverse. This is possible in an intrusive data-structure because the iterator and the item itself are the same thing.

shadowofneptune · on March 11, 2022

So when you remove an element from the list, how do you make sure both lists are updated? Are elements even likely to be removed in this use-case?

adwn · on March 11, 2022

> So when you remove an element from the list, how do you make sure both lists are updated?

The remove_from_list() function has to be either payload-specific, or it needs a function pointer parameter which is called to handle the payload-specific unlinking. Assuming the lists are doubly-linked, which they typically are in the Linux kernel.

jjnoakes · on March 11, 2022

This is one of the purposes of doubly-linking the nodes.

Say your struct is in two lists. To remove it from both, first follow the next/prev pointers of the first list embedded struct to remove it from the first list, and then do the same thing with the second list embedded struct to remove it from the second list.

loeg · on March 10, 2022

Yes, a very important point. In the kernel, it is pretty common for objects to need to be on multiple different lists.

re · on March 10, 2022

This is called "internal" vs. "external" storage, or sometimes an "intrusive linked list". Fewer allocations and better locality of reference for faster access are some of the biggest benefits. There's more discussion here: https://en.wikipedia.org/wiki/Linked_list#Internal_and_exter...

loeg · on March 10, 2022

> Are there other benefits, though?

Yes; you're avoiding an extra pointer indirection between list navigation and element access, which is usually going to be a cache miss. Also if you have a pointer to an element of an intrusive list, you can cheaply navigate to the next element; this wouldn't work with your external list scheme (you'd need to maintain a separate list pointer instead).

Matthias247 · on March 10, 2022

Not only avoiding the cache miss. It's also avoiding an allocation. The non-intrusive list that the parent described means the list node requires one allocation, and the content another. Intrusive lists mean there's just one allocation required for the actual element - and hooking it into a list comes for free.

If the struct we want to insert is on the stack we don't even need a single allocation. That is for example be useful on wait nodes, where we put some waiter object on the stack, register it in a shared wait queue, and then wait until the object gets signalled => No allocation is required for any waiter.

loeg · on March 10, 2022

Yes. I would caution on inserting stack objects into shared lists, though, because it's such an easy footgun if the reference outlives the stack frame.

Matthias247 · on March 11, 2022

It's not really worse than heap allocated objects. For them you also need to make sure the pointer in the list does not outlive the actual object.

The only safe structures are heap allocated plus reference counted ones, where inserting something into a list also increments the refcount.

loeg · on March 11, 2022

I think you're completely correct, in a pedantic sense. However, heap object lifetimes aren't bound to control flow in quite the same way.

robotresearcher · on March 11, 2022

You could allocate a single novel struct containing the structure of interest and any number of list elements that will contain it in a single malloc().

If the lists don’t need to know about each other, each novel struct could be different.

This would work, but it’s a bit exotic, so I’m not recommending it. But it addresses the locality and multiple allocation issues.

Hello71 · on March 11, 2022

isn't that exactly equivalent to an intrusive linked list in memory except harder to implement?

adwn · on March 11, 2022

Not quite. With traditional intrusive linked lists, you'll have to modify the payload struct whenever you want to add it to another or different lists; with OP's approach, maybe not (but it depends on the implementation details).

pthread_t · on March 10, 2022

This was one of my first questions in my OS class in college. The answer was that the kernel implementation takes advantage of the cache.

dottedmag · on March 10, 2022

Cache locality and memory consumption.