Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> the Rust language, for all of its merits, does not make life easy for anybody wanting to implement a linked list, a switch to that language would not automatically solve the problem

There's a lot of truth to this, but there's also more to the story:

1) Writing a doubly-linked list in safe Rust is famously difficult and cumbersome, yes. But writing a singly-linked list in safe Rust isn't so bad. Once you're familiar with Box and Option and how references work, you can make a singly-linked list out of those the way you'd expect. They're like unique_ptr<T> and optional<T> in C++. It's too much for a day-one exercise, which is part of why we often warn beginners away from it, but with some guidance it can definitely be a week-one exercise.

2) On the other hand, writing a singly- or doubly-linked list in unsafe Rust is hardly any different than writing it in C. That's the whole idea of unsafe Rust. You get your hands on some raw pointers and you go to town. Now, doing a good, idiomatic job of that means exposing a safe interface, and understanding the interactions between safe and unsafe Rust well enough to do that correctly takes a lot more experience. But getting an unsafe doubly-linked list to work in Rust as a proof of concept is arguably easy. It would be malpractice to teach it to beginners, but you could. [Edit: I was wrong about this. The simple version I was imagining failed Miri.]

3) As the article points out, "C lends itself to the creation of ad hoc linked lists in every situation where they are needed, resulting in boilerplate code and duplicated definitions." Rust does not lend itself to this, however, because like C++ it supports generic data structures. So even if implementing a linked list in Rust was the most difficult programming challenge the world had ever seen, that wouldn't be much of a problem in practice, because we could get some genius to implement it once, and then the rest of us could use it through a safe, generic API.



Writing unsafe Rust is actually much harder than writing C or C++. C and C++ compilers apply certain optimizations quite conservatively (like anything involving aliasing) because the language doesn't assume a bunch of things like two mutable references never alias. Rust is a lot more aggressive.

If I was asked to rank them in terms of number of things you need to know to write fully correct (no UB) code, I would have: safe Rust < C++ < unsafe Rust

Here's a nice blog post that really resonated with me: https://lucumr.pocoo.org/2022/1/30/unsafe-rust/


Ok I was going to try to be pedantic and say that if you could only use unsafe pointer operations, the aliasing issues don't apply. But the first example I tried to put together failed Miri, cause I was creating temporary &mut references to cast to *mut. So yeah...egg on face. Edited.


Does Rust apply aliasing optimisations now? Last I heard of that, they kept having to turn it back off because of LLVM bugs concerning aliasing.


Rust (and, for that matter, C++ compilers) have always applied optimizations that rely on alias analysis. What I assume this thread is referring to is the noalias LLVM attribute on &mut function parameters. Rust now applies this.


In any case, it's still UB under the language guidelines, it's just not exploited yet.


I believe since the last time they were turned on they have stayed on, yes.


Rust does not use type-based alias analysis (the "strict aliasing" problem), however.


Writing a doubly linked list is trivial if you use a backing vector and view into it. Which is how you should do it.

Linked lists are awful data structures and just as difficult to implement soundly in a thread safe context everywhere else. Rust just yells at you for doing it wrong.

I reiterate, unless you have a damn good reason, a linked list is an awful data structure. They're slow and cumbersome and have no advantages at small or large scales. The cases where you want them are few and far between and not in any undergraduate DSA course or application.


Yes, linked lists are overused in userspace, especially by beginners, but the "linked list bad" meme has gone too far. In particular, intrusive linked lists are commonly used in game engines and, oh yeah, operating system kernels for reasons ranging from memory efficiency to the ability to cheaply rearrange nodes.

> Linked lists are awful data structures and just as difficult to implement soundly in a thread safe context everywhere else. Rust just yells at you for doing it wrong.

This is a non-sequitur. Vectors and vector-based data structures are just as thread unsafe as basic linked lists. Furthermore, Rust isn't yelling at you for "doing it wrong". Safe Rust is yelling at you because the type system (by design) does not like cyclic ownership.


This is exactly the mindset that people hate about Rust, although I think it's unfair to associate it with the language because it's perpetuated by people who hold opinions like yours.

Linked lists are often not very performant on modern hardware. There's a lot of usecases where a vector (just as a generic collection or as the backing view) would be a better choice. Rust happens to make creating linked lists non-trivial. These are all facts.

The incorrect way to interpret this is how you've done: that linked lists are just wrong, and Rust happens to not let you use them because it wants you to not write wrong code so it makes this hard. Rust is a good language, but this is just a limitation of Rust, full stop. It's not some secret "the language authors are super smart and want you to stop writing linked lists". It's just that the language doesn't really let you do this, and it so happens that you can often get by with not creating them so Rust remains a productive language most of the time.

Linked lists remain incredibly useful in certain situations, of which an operating system kernel happens to be to feature a lot of. The ability to reliably allocate contiguous memory is not a given, and various thread safety and reference identity requirements make lists the only choice for a lot of parts of the kernel. This would be true regardless of the language you used to write it. Suggestions that linked lists are awful and have no advantages are quite frankly more of a sign of your limited experience with the environments where they would be useful than them being "wrong".

(Do note that I am not saying that the kernel only uses linked lists where appropriate: there's a lot of places where they're just a crutch to get around C being unable to deal with generic types without indirecting them behind a pointer, and Rust or even C++ would definitely be able to remove some of these "unfortunate" uses. But there's still plenty of linked lists that can not and will not be replaced regardless of the language the kernel is written in.)


Actually Rust lets you write linked lists of any kind very easily with reference counting. It's just that reference counting is no fun in C land. People like running around with loaded guns with disabled safety in their pocket.

Linked lists have a lot of pointers, that means a lot of things can go wrong. It's just that the mental strain of lots of things going wrong is too much for most humans so they just straight up ignore the safety aspects.

When you think about it, the ownership hierarchy of a linked list is quite complex. A lot of rust developers simply solve the problem by having a linked list container that owns all nodes. But in C land, it is pretty normal that the first node owns the second, the second owns the third, the third owns the fourth and so on. The ownership hierarchy is much more complicated, for not much benefit. Once you have a doubly linked list, ownership is no longer trivial because it cannot be expressed as a hierarchy.

Most languages have solved this with a garbage collector so it's really Rust that is the first language that is attempting a solution without a GC. In C land, the danger is part of the job, not something that anyone particularly cares about.


If your argument is "linked list memory management is rocket science so nobody should ever do it" then please stop. You're only digging yourself deeper.

No serious linked list implementation in Rust uses Rc<Refcell<_>>. They all use either unsafe or a vector of nodes because linked lists are among the most studied data structures in all of computer science and are extremely well-understood by skilled practitioners.

The people who wrote C and Pascal for decades didn't "straight up ignore the safety aspects." This is pure projection and only makes the Rust community (which I consider myself a part of) look bad.


I beg to disagree.

An old presentation by Simon Peyton-Jones (of GHC fame) has a line like that: "Linked list implementation is first year programming course task; linked list implementation on multiprocessor is a PhD-level task; linked list implementation on multiprocessor using software transactional memory is again first year course task."

SPJ actually teaches students and oversees PhD students so these are not vain words.

STM greatly simplifies everything that is done in parallel, including, but not limited to, double linked lists.

Rust "yells at you doing it wrong" because Rust does not know any better way. Effects would help expressing safe STM computations, and they are just coming into Rust, hesitantly.

Let's see what will happen.


That reminds me of this article[1] by Bryan Cantrill. One takeaway is that Rust's ability to easily import generic data structures makes the average Rust program faster than the average C program.

With a bit of effort, C can be made faster than Rust, but if I'm writing a simple utility in C that needs a linked list, I'm going to write the simplest possible implementation, straight from an algorithms textbook. It's better for it to be slower, but more understandable and maintainable.

With Rust, the implementation I import has far more work put into it. I don't need to worry about how complex it is though, since its part of the standard library and I trust that people smarter than me have checked over what it does.

[1] http://dtrace.org/blogs/bmc/2018/09/28/the-relative-performa...


> 1) Writing a doubly-linked list in safe Rust is famously difficult and cumbersome, yes. But writing a singly-linked list in safe Rust isn't so bad.

Having never even seen a Rust program, let alone written one, if basic data structures are "famously difficult", ....


I totally get what you mean. It is surprising. And because it's surprising, a lot of beginners dive right into it without realizing what they're getting into, and many wind up frustrated. It can feel like Rust is throwing up a bunch of pointless roadblocks at you for no reason. But here's how I'd summarize the reasoning behind it:

In most languages, there are few or no restrictions on which objects are allowed to point to which other objects. In garbage collected languages (Java/Python/JS/Ruby/Go/etc), this is because the language runtime keeps objects alive for you as long as you hold pointers to them. In not-memory-safe native languages (C/C++/Zig/etc), this is because the language holds you responsible for avoiding use-after-free and other issues. These two groups of languages are quite different from each other, but one thing they have in common is that you'll never see an error message that says "Object A isn't allowed to point to object B."

This is one of the things that makes Rust different from both of groups above. Rust is not garbage collected, but it is memory-safe. To make this work, the Rust compiler has strict rules around when you're allowed to retain a pointer to some object and when you're not. These rules have a substantial learning curve, partly because they can be tricky, but also because they're new and unfamiliar to most programmers.

So the reason linked lists are more difficult to write in Rust is that they're pointer-heavy data structures. Implementing a linked list runs you straight into the pointer rules that make Rust different and sometimes tricky. To make everything work, you either need to spend some time studying those rules first, or you need a teacher who can point you in the right direction and answer your questions as you go.

Anyway, if you're curious to learn more, there are several excellent books available, including an official one that we often just call The Book: https://doc.rust-lang.org/book/


I know why lists are hard to do with Rust's guarantees, and can even understand why a given memory safety validator might have trouble with them.

My point is that I expect a language eco-system to make it possible to do "easy" things without much trouble.

As the saying goes, a ship is safe in harbor, but that's not what ships are for.

Safety is a desirable property but it is not why (most of us) write programs. (I'm reminded of early Haskell, where programs could be guaranteed to produce the correct output but no guarantee that they'd produce output.)


> I expect a language eco-system to make it possible to do "easy" things without much trouble.

I think a lot depends on what we mean by "trouble". For example, here's a basic implementation of an unsafe doubly-linked list with a safe API: https://play.rust-lang.org/?version=stable&mode=debug&editio...

There are only a few things in there that jump out to me as inconvenient:

- We have to write `unsafe` blocks in several places.

- When we dereference raw pointers, we have to write things like `(*self.head).next`, with more parentheses and stars than usual.

- We need to add PhantomData<T> members to our iterator types, so that the Rust compiler knows whether their lifetimes are covariant vs invariant.

None of those inconveniences adds very much in terms of lines of code. And of course, some of them are inconvenient by choice. Rust wants to be extra clear when we're doing unsafe things like dereferencing a raw pointer.

However, there are several places in this example where knowing the right thing to do is a serious challenge:

- The PhantomData<T> thing from above counts as one of these too, although a compiler message will point you to it if you forget.

- We need to store raw pointers to nodes rather than shared or mutable references to nodes.

- We allocate memory with Box::leak(Box::new(...)).

- We free memory with Box::from_raw(...) and end-of-scope destruction.

- We need to add an explicit Drop implementation to avoid leaking memory.

All of those are intermediate or advanced Rust concepts, and it doesn't make sense to try to teach them to beginners. It's also easy to get them wrong, much like malloc() and free() and constructors and destructors are easy to get wrong in C and C++. But once you understand these concepts and know how to use them, I wouldn't call them inconvenient per se. It's not clear to me that a helper library from the ecosystem would be able to help much here. Maybe we could find a way to reduce repitition in the implementations of the shared and mutable iterators? But for the most part, the challenge here is correctness, not repetition.

Of course, what the ecosystem can do for us is provide a ready-made implementation of the whole container, so that we don't have to do any of this ourselves: https://doc.rust-lang.org/std/collections/struct.LinkedList....


From what I hear, making "intrusive linked lists which lend out &mut" sound is an unsolved problem: https://gist.github.com/Darksonn/1567538f56af1a8038ecc3c664a...


Oh yeah for sure. Even if you make sure your lists don't have any cycles, it's also possible for a given object to be on multiple different lists. Sometimes that's the whole point of the intrusive approach. But if you're accessible from multiple lists, there's no way a list can meet the uniqueness requirements of &mut. I guess you have two options:

1. Make the whole list API unsafe, so that the only way to expose a safe API on top of it is to make the list an internal implementation detail.

2. Take the Rc<RefCell<T>> or Arc<Mutex<T>> approach, where each element effectively has its own lock. Of course in high-performance code this is usually too expensive, and in any code calling .borrow() or .lock() all over the place is annoying.


Standard library that takes care of all of the hassle can be good enough.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: