Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Ordsets have O(log n) time insert, remove, and lookup operations. They are definitely not lists, they must be trees.


> Ordsets have O(log n) time insert, remove, and lookup operations. They are definitely not lists, they must be trees.

To be specific, we were talking about Erlang's ordsets module. You can see it constructed as a list: https://github.com/erlang/otp/blob/d6285b0a347b9489ce939511e...

    new() -> [].
Then notice the function to add an element https://github.com/erlang/otp/blob/d6285b0a347b9489ce939511e.... It's a recursive traversal of the list to find the right place to add it.


I am surprised that this works with a list. An array list would require moving data to insert at the right place, and a linked list would make it uneasy to get to the nth element in O(1).


Erlang's list are immutable cons cells (same idea as in LISP). This cons cell is just the element plus the "tail" pointer to the rest of the list. It is written as [H | T]. The last cons cell's tail is an empty list. So a list L = [1,2] is also the same as [1 | [2]] or [1 | [2 | []].

Underneath it is implemented like a linked list with nodes pointing to an element and then a pointer to the next. But explicit pointer manipulation, breaking and stitching lists together like one would do in C is not allowed. So it is all about traversing the list then inserting at the right place.

The reason ordsets are implemented using just lists is for simplicity and because they are designed for a small collections. Say maybe keeping some preferences around, or a list of options. Not storing 1M entries.

But there are other data structures and facilities to handle larger collections like gb_trees which are implemented as AVL trees. Another is an in-memory table called ETS. This one can be shared between processes (though terms are actually copied in most cases when reading and writing to/from it).


The thing is that linked lists can only insert sorted in O(n).


They are most definitely sorted lists. This is mentioned very clearly in the documentation: http://erlang.org/doc/man/ordsets.html

>An ordset is a representation of a set, where an ordered list is used to store the elements of the set.


That does not prevent a tree-based implementation, though.


It kind of does. And there is also no need to argue about this, the code is available and very straight forward: https://github.com/erlang/otp/blob/fe2b1323a3866ed0a9712e9d1...


They don't have O(log n) insert as implemented in Erlang. It's a linear scan to find the place to insert.


It appears you don't understand sorted lists if you think a tree structure is required for those asymptotics.


Erlang's ordsets() module has O(N) insertion. Since there's theoretical disagreement, here's experimental verification:

  11> f(), Ord_bench = fun(N) -> Random = [random:uniform(N) || _ <- lists:seq(1,N)], F = fun(Item, Set) -> ordsets:add_element(Item, Set) end, lists:foldl(F, ordsets:new(), Random) end.
#Fun<erl_eval.6.99386804> 12> timer:tc(fun() -> Ord_bench(10000) end). {337691, [1,2,3,4,5,7,8,10,12,13,14,15,16,17,18,19,21,23,24,26,27,28, 29,36,39,40,42|...]} 13> timer:tc(fun() -> Ord_bench(100000) end). {37167977, [1,2,3,5,7,9,10,11,12,15,16,17,18,19,20,21,23,24,25,26,27, 29,30,33,34,35,38|...]}

So, 337 ms turns into 37 seconds, roughly 100 times slower, when I increase the data size tenfold, which is what you'd expect from O(N) insertion.

Repeating for gb_sets, the results are 49 ms versus 484 ms.

Could the misunderstanding be that you don't realise that the ordsets module specifically guarantees that the representation is an ordinary sorted list?

(I am aware of e.g. skip lists. But that is not the same Erlang data structure as a sorted list.)


No, the misunderstanding is that I was thinking of Ordsets in Rust [1], which are implemented as trees.

[1] https://docs.rs/im/10.0.0/im/#sets


I would probably use the word array if you can access any item without traversal. Just a thought.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: