> the Rust language, for all of its merits, does not make life easy for anybody ...

kccqzy · on March 11, 2022

Writing unsafe Rust is actually much harder than writing C or C++. C and C++ compilers apply certain optimizations quite conservatively (like anything involving aliasing) because the language doesn't assume a bunch of things like two mutable references never alias. Rust is a lot more aggressive.

If I was asked to rank them in terms of number of things you need to know to write fully correct (no UB) code, I would have: safe Rust < C++ < unsafe Rust

Here's a nice blog post that really resonated with me: https://lucumr.pocoo.org/2022/1/30/unsafe-rust/

oconnor663 · on March 11, 2022

Ok I was going to try to be pedantic and say that if you could only use unsafe pointer operations, the aliasing issues don't apply. But the first example I tried to put together failed Miri, cause I was creating temporary &mut references to cast to *mut. So yeah...egg on face. Edited.

Filligree · on March 11, 2022

Does Rust apply aliasing optimisations now? Last I heard of that, they kept having to turn it back off because of LLVM bugs concerning aliasing.

pcwalton · on March 11, 2022

Rust (and, for that matter, C++ compilers) have always applied optimizations that rely on alias analysis. What I assume this thread is referring to is the noalias LLVM attribute on &mut function parameters. Rust now applies this.

coolreader18 · on March 11, 2022

In any case, it's still UB under the language guidelines, it's just not exploited yet.

oconnor663 · on March 11, 2022

I believe since the last time they were turned on they have stayed on, yes.

kzrdude · on March 16, 2022

Rust does not use type-based alias analysis (the "strict aliasing" problem), however.

duped · on March 11, 2022

Writing a doubly linked list is trivial if you use a backing vector and view into it. Which is how you should do it.

Linked lists are awful data structures and just as difficult to implement soundly in a thread safe context everywhere else. Rust just yells at you for doing it wrong.

I reiterate, unless you have a damn good reason, a linked list is an awful data structure. They're slow and cumbersome and have no advantages at small or large scales. The cases where you want them are few and far between and not in any undergraduate DSA course or application.

platinumrad · on March 11, 2022

Yes, linked lists are overused in userspace, especially by beginners, but the "linked list bad" meme has gone too far. In particular, intrusive linked lists are commonly used in game engines and, oh yeah, operating system kernels for reasons ranging from memory efficiency to the ability to cheaply rearrange nodes.

> Linked lists are awful data structures and just as difficult to implement soundly in a thread safe context everywhere else. Rust just yells at you for doing it wrong.

This is a non-sequitur. Vectors and vector-based data structures are just as thread unsafe as basic linked lists. Furthermore, Rust isn't yelling at you for "doing it wrong". Safe Rust is yelling at you because the type system (by design) does not like cyclic ownership.

saagarjha · on March 11, 2022

This is exactly the mindset that people hate about Rust, although I think it's unfair to associate it with the language because it's perpetuated by people who hold opinions like yours.

Linked lists are often not very performant on modern hardware. There's a lot of usecases where a vector (just as a generic collection or as the backing view) would be a better choice. Rust happens to make creating linked lists non-trivial. These are all facts.

The incorrect way to interpret this is how you've done: that linked lists are just wrong, and Rust happens to not let you use them because it wants you to not write wrong code so it makes this hard. Rust is a good language, but this is just a limitation of Rust, full stop. It's not some secret "the language authors are super smart and want you to stop writing linked lists". It's just that the language doesn't really let you do this, and it so happens that you can often get by with not creating them so Rust remains a productive language most of the time.

Linked lists remain incredibly useful in certain situations, of which an operating system kernel happens to be to feature a lot of. The ability to reliably allocate contiguous memory is not a given, and various thread safety and reference identity requirements make lists the only choice for a lot of parts of the kernel. This would be true regardless of the language you used to write it. Suggestions that linked lists are awful and have no advantages are quite frankly more of a sign of your limited experience with the environments where they would be useful than them being "wrong".

(Do note that I am not saying that the kernel only uses linked lists where appropriate: there's a lot of places where they're just a crutch to get around C being unable to deal with generic types without indirecting them behind a pointer, and Rust or even C++ would definitely be able to remove some of these "unfortunate" uses. But there's still plenty of linked lists that can not and will not be replaced regardless of the language the kernel is written in.)

imtringued · on March 11, 2022

Actually Rust lets you write linked lists of any kind very easily with reference counting. It's just that reference counting is no fun in C land. People like running around with loaded guns with disabled safety in their pocket.

Linked lists have a lot of pointers, that means a lot of things can go wrong. It's just that the mental strain of lots of things going wrong is too much for most humans so they just straight up ignore the safety aspects.

When you think about it, the ownership hierarchy of a linked list is quite complex. A lot of rust developers simply solve the problem by having a linked list container that owns all nodes. But in C land, it is pretty normal that the first node owns the second, the second owns the third, the third owns the fourth and so on. The ownership hierarchy is much more complicated, for not much benefit. Once you have a doubly linked list, ownership is no longer trivial because it cannot be expressed as a hierarchy.

Most languages have solved this with a garbage collector so it's really Rust that is the first language that is attempting a solution without a GC. In C land, the danger is part of the job, not something that anyone particularly cares about.

platinumrad · on March 11, 2022

If your argument is "linked list memory management is rocket science so nobody should ever do it" then please stop. You're only digging yourself deeper.

No serious linked list implementation in Rust uses Rc<Refcell<_>>. They all use either unsafe or a vector of nodes because linked lists are among the most studied data structures in all of computer science and are extremely well-understood by skilled practitioners.

The people who wrote C and Pascal for decades didn't "straight up ignore the safety aspects." This is pure projection and only makes the Rust community (which I consider myself a part of) look bad.

thesz · on March 11, 2022

I beg to disagree.

An old presentation by Simon Peyton-Jones (of GHC fame) has a line like that: "Linked list implementation is first year programming course task; linked list implementation on multiprocessor is a PhD-level task; linked list implementation on multiprocessor using software transactional memory is again first year course task."

SPJ actually teaches students and oversees PhD students so these are not vain words.

STM greatly simplifies everything that is done in parallel, including, but not limited to, double linked lists.

Rust "yells at you doing it wrong" because Rust does not know any better way. Effects would help expressing safe STM computations, and they are just coming into Rust, hesitantly.

Let's see what will happen.

DenseComet · on March 11, 2022

That reminds me of this article[1] by Bryan Cantrill. One takeaway is that Rust's ability to easily import generic data structures makes the average Rust program faster than the average C program.

With a bit of effort, C can be made faster than Rust, but if I'm writing a simple utility in C that needs a linked list, I'm going to write the simplest possible implementation, straight from an algorithms textbook. It's better for it to be slower, but more understandable and maintainable.

With Rust, the implementation I import has far more work put into it. I don't need to worry about how complex it is though, since its part of the standard library and I trust that people smarter than me have checked over what it does.

[1] http://dtrace.org/blogs/bmc/2018/09/28/the-relative-performa...

anamax · on March 11, 2022

> 1) Writing a doubly-linked list in safe Rust is famously difficult and cumbersome, yes. But writing a singly-linked list in safe Rust isn't so bad.

Having never even seen a Rust program, let alone written one, if basic data structures are "famously difficult", ....

oconnor663 · on March 11, 2022

I totally get what you mean. It is surprising. And because it's surprising, a lot of beginners dive right into it without realizing what they're getting into, and many wind up frustrated. It can feel like Rust is throwing up a bunch of pointless roadblocks at you for no reason. But here's how I'd summarize the reasoning behind it:

In most languages, there are few or no restrictions on which objects are allowed to point to which other objects. In garbage collected languages (Java/Python/JS/Ruby/Go/etc), this is because the language runtime keeps objects alive for you as long as you hold pointers to them. In not-memory-safe native languages (C/C++/Zig/etc), this is because the language holds you responsible for avoiding use-after-free and other issues. These two groups of languages are quite different from each other, but one thing they have in common is that you'll never see an error message that says "Object A isn't allowed to point to object B."

This is one of the things that makes Rust different from both of groups above. Rust is not garbage collected, but it is memory-safe. To make this work, the Rust compiler has strict rules around when you're allowed to retain a pointer to some object and when you're not. These rules have a substantial learning curve, partly because they can be tricky, but also because they're new and unfamiliar to most programmers.

So the reason linked lists are more difficult to write in Rust is that they're pointer-heavy data structures. Implementing a linked list runs you straight into the pointer rules that make Rust different and sometimes tricky. To make everything work, you either need to spend some time studying those rules first, or you need a teacher who can point you in the right direction and answer your questions as you go.

Anyway, if you're curious to learn more, there are several excellent books available, including an official one that we often just call The Book: https://doc.rust-lang.org/book/

anamax · on March 11, 2022

I know why lists are hard to do with Rust's guarantees, and can even understand why a given memory safety validator might have trouble with them.

My point is that I expect a language eco-system to make it possible to do "easy" things without much trouble.

As the saying goes, a ship is safe in harbor, but that's not what ships are for.

Safety is a desirable property but it is not why (most of us) write programs. (I'm reminded of early Haskell, where programs could be guaranteed to produce the correct output but no guarantee that they'd produce output.)

oconnor663 · on March 11, 2022

> I expect a language eco-system to make it possible to do "easy" things without much trouble.

I think a lot depends on what we mean by "trouble". For example, here's a basic implementation of an unsafe doubly-linked list with a safe API: https://play.rust-lang.org/?version=stable&mode=debug&editio...

There are only a few things in there that jump out to me as inconvenient:

- We have to write `unsafe` blocks in several places.

- When we dereference raw pointers, we have to write things like `(*self.head).next`, with more parentheses and stars than usual.

- We need to add PhantomData<T> members to our iterator types, so that the Rust compiler knows whether their lifetimes are covariant vs invariant.

None of those inconveniences adds very much in terms of lines of code. And of course, some of them are inconvenient by choice. Rust wants to be extra clear when we're doing unsafe things like dereferencing a raw pointer.

However, there are several places in this example where knowing the right thing to do is a serious challenge:

- The PhantomData<T> thing from above counts as one of these too, although a compiler message will point you to it if you forget.

- We need to store raw pointers to nodes rather than shared or mutable references to nodes.

- We allocate memory with Box::leak(Box::new(...)).

- We free memory with Box::from_raw(...) and end-of-scope destruction.

- We need to add an explicit Drop implementation to avoid leaking memory.

All of those are intermediate or advanced Rust concepts, and it doesn't make sense to try to teach them to beginners. It's also easy to get them wrong, much like malloc() and free() and constructors and destructors are easy to get wrong in C and C++. But once you understand these concepts and know how to use them, I wouldn't call them inconvenient per se. It's not clear to me that a helper library from the ecosystem would be able to help much here. Maybe we could find a way to reduce repitition in the implementations of the shared and mutable iterators? But for the most part, the challenge here is correctness, not repetition.

Of course, what the ecosystem can do for us is provide a ready-made implementation of the whole container, so that we don't have to do any of this ourselves: https://doc.rust-lang.org/std/collections/struct.LinkedList....

nyanpasu64 · on March 12, 2022

From what I hear, making "intrusive linked lists which lend out &mut" sound is an unsolved problem: https://gist.github.com/Darksonn/1567538f56af1a8038ecc3c664a...

oconnor663 · on March 12, 2022

Oh yeah for sure. Even if you make sure your lists don't have any cycles, it's also possible for a given object to be on multiple different lists. Sometimes that's the whole point of the intrusive approach. But if you're accessible from multiple lists, there's no way a list can meet the uniqueness requirements of &mut. I guess you have two options:

1. Make the whole list API unsafe, so that the only way to expose a safe API on top of it is to make the list an internal implementation detail.

2. Take the Rc<RefCell<T>> or Arc<Mutex<T>> approach, where each element effectively has its own lock. Of course in high-performance code this is usually too expensive, and in any code calling .borrow() or .lock() all over the place is annoying.

anamax · on March 12, 2022

Standard library that takes care of all of the hassle can be good enough.