
In a practical working environment it creates more problems than it solves.

It's an optimization that effectively turns non-working code into working code, in ways that aren't always obvious or predictable and possibly in different ways across different compilers/interpreters.

I'd rather the code blow up in my face immediately than spend time wondering why it gets optimized on engine X and not on engine Y, or why seemingly trivial code changes cause the function to have drastically different runtime behavior.

Clojure did this right. No TCO, loop/recur construct instead.

As a side note, and at the risk of being snarky, it's also probably a good thing that developers who get overexcited about recursion are being encouraged to use proper functional idioms instead of abusing it.



I think I mostly agree. When TCE isn't guaranteed, it seems pretty bad to have code that runs fine in one engine but blows up the stack in another. When it is guaranteed, it's still hard: you have to know which constructs break TCE (e.g. wrapping a call in a try block takes it out of tail position), and then audit every change to code that relies on TCE to make sure it still holds. Also I think JS might have extra problems due to e.g. arguments.caller.

In languages which do have guaranteed TCE, like Scheme or the ML variants, I think I would rather see some way to explicitly say "this is a tail call which should be eliminated" or "all recursive calls to this function should be tail calls", and then have the compiler complain if a call that should be eliminated isn't in tail position. I think the reason Clojure doesn't do TCE is that, in general[1], it is a global optimisation (not achievable by a simple syntax transformation) that the JVM doesn't perform. So Clojure didn't really have a choice in whether or not to have TCE.

[1] It is possible to transform recursive or mutually recursive code into iterative code, but that becomes hard in a dynamic language where functions can be redefined, and impossible to do locally for all tail calls.


I feel like tail calls are pretty obvious; the C++ grammar even explicitly calls them out.


Eh, kinda. They are obvious modulo a lot of things, such as what definition of tail call your compiler is using and what optimizations it performs.

Your comment piqued my curiosity and I made a little experiment :)

The following function:

   int rec(int n) {
       if (n < 1) return 0;
       return 1 + rec(n - 1);
   }
will get TCO'd by gcc but not by clang (!!)

I don't know, probably I'm not smart enough for TCO, but I just don't want to work like that. It's a minefield.


However, for -O3 on ARMv8, LLVM generates pretty good if slightly inscrutable code [1]:

  rec:                                         // @rec
    bic     w0, w0, w0, asr #31
    ret
That's taking the sign bit, replicating it across all 32 bits, and then using its complement as a mask:

  negative numbers:
     w0, asr #31     => -1
     bic w0, w0, -1  => and w0, w0, 0          => ret 0

  positive numbers:
     w0, asr #31     => 0
     bic w0, w0, 0   => and w0, w0, 0xffffffff => ret w0
gcc is also pretty good [2]:

  rec:
        cmp     w0, 0
        csel    w0, w0, wzr, gt
        ret
[1] https://godbolt.org/z/q9hM7z

[2] https://godbolt.org/z/6rbE5e


It doesn't get TCO'd; the compiler simply recognizes what the function computes and optimizes it. It generates literally the same code for the loop below.

  int count(int n) {
      int res = 0;
      for (int i=0; i < n; i++) res++;
      return res;
  }


Would you mind sharing the compiler versions and flags used? I tried to look at it in godbolt, but gcc 10.2 -O and clang 11.0 -O look pretty much the same. On the other hand, I know pretty little about C and its compilers ;)


clang 11.0, gcc 7.4, -O2. I did that with whatever was already installed on my machines, YMMV.


That's not a tail recursive call.


That's precisely the point.


I'm confused, how are you testing TCO by giving the compiler a non-tail call?


This is a magic vs. no-magic thing. TCO is a magical thing that implicitly changes the behavior of code with the same syntax. The code says the stack should blow up, but the compiler saves you from it.

I'll side with the Python teaching that explicit is better than implicit. If the code says the stack will blow, then the compiler/runtime should not try to avoid that.


There's nothing magical about a language that says that a tail-call will reuse the same frame/record/instance. It's only magical and unreliable when it's not in the language spec.


And a well-defined TCO guarantee would probably not rank very high on the already long list of things that make it ... a bit tricky to discern the performance of a function from looking at the text of the function body.


It's not any more "magic" than garbage collection. The runtime figures out when memory is no longer required and frees it, through perfectly well-defined and easy-to-model rules. The code doesn't "say" the stack will blow because a finite stack isn't a language guarantee - it's simply an inconvenient fact of life.


Clojure does not optimize tail calls because the JVM does not support it.


Kind of. This comment summed it up well: https://news.ycombinator.com/item?id=26298891


I thought there's an extension enabling tail-call optimization on the JVM?



