(Or why goto is worth keeping around in modern languages.)
A cool thing happened during a lesson today, and I wanted to try to capture the magic before it slipped through my fingers. It happened while I was trying to teach recursion (without ever using that word) using my side project, Mu. The experience got me thinking about the quote above, and wondering if there was a way to bridge the summits of C and Lisp without having to go through the “swampy lowlands” between them.
But I'm getting ahead of myself. I've written about Mu before, but you don't need to remember any of that for this post. All you need to know is that it's basically a cleaned up reimplementation of C's memory model, with the ability to manipulate addresses in memory.
If you learned programming using C you probably learned to create a generic linked list something like this (using a sort of timeless mish-mash of C and C++ that is guaranteed to not compile):
struct list<T> { T value; list<T>* rest; }
In Mu you'd write this as:
container list:_T [ value:_T rest:address:list:_T # "an address to a list of _T" ]
To gradually introduce recursion I first had my student write a function to calculate the length of a list. In C you'd tend to do this using a loop:
int length(list<T>* x) { int result = 0; while (x) { ++result; x = x->rest; } return result; }
This was my student's initial approach as well in Mu:
def length x:address:list:_T → result:number [ result ← copy 0 { break-unless x # break if x is empty result ← add result 1 x ← rest x loop } return result ]
That translates pretty exactly from the idiomatic C. The only difference is syntax: Mu instructions look like Basic or Assembly: just sequences of instructions, no nested expressions or blocks. But they're doing the same (hopefully obvious) thing.
Now for a recursive version. C tends to teach recursion later and use it less frequently, so for a point of comparison we'll switch gears and imagine a counterfactual universe where we're learning programming using Lisp1 (again with some liberties taken to make it easier to read that guarantee it won't run anywhere in this universe):
; the length of a list is 1 more ; than the length of the rest of the list def length (x) if (empty? x) 0 (+ 1 (length rest.x))
My student's Mu solution (after some struggles):
# the length of a list is 1 more # than the length of the rest of the list def length x:address:list:_T → result:number [ return-unless x, 0 x ← rest x result ← length x result ← add 1, result ]
It's a little harder to see because Lisp exclusively uses expressions while Mu exclusively uses statements, but the two versions translate pretty exactly between one another. The only difference is that you need to explicitly provide types in Mu, just like in C.
Things got more interesting when we tried to translate another classic approach to recursion that every programmer should possess in their toolbox: using an accumulator. The above recursive version calculates the length "on the way out":
length [1, 2, 3] ⇒length [2, 3] ⇒length [3] ⇒length [] 0 ⇐ 1 ⇐ 2 ⇐ 3 ⇐
To calculate it "on the way in", we'll use an extra argument to accumulate the result:
length [1, 2, 3], 0 ⇒length [2, 3], 1 ⇒length [3], 2 ⇒length [], 3 3 ⇐ 3 ⇐ 3 ⇐ 3 ⇐
You'd tend to write it like this in Lisp:
def length (x total) if (empty? x) total (length rest.x (+ total 1))
My student wrote it like this in Mu:
def length x:address:list:_T, total:number → total:number [ return-unless x, total x ← rest x total ← add total, 1 total ← length x, total ]
Again, a pretty exact translation. But I prefer the Mu version because it's obvious that it's tail-recursive. With Lisp it's always been non-trivial to know when a function is tail-recursive. You can't just blindly count parens, you have to try to "run" the function in your mind. In Mu you can just check the final instruction.
This benefit becomes even more apparent when you try to perform tail-recursion elimination. All you have to do is replace the recursive call with a jump (goto) to the start of the function:
def length x:address:list:_T, total:number → total:number [ +start return-unless x, total x ← rest x total ← add total, 1 jump +start total ← length x, total ]
That was a pretty neat insight for my student, and it happened purely by accident. It was startling how easy the translation from tail-recursive to iterative versions was, and how easy it suddenly became to explain recursion and recursion-elimination to a 12-year old in a single lesson. Strip C of its historical accidents and you end up with something just as useful as Lisp for teaching recursion.
The episode also helps highlight how the use of a single strategic goto can help clarify a subject. A corollary of homoiconicity is that compiler optimizations too can benefit from being data; it makes them easier to understand. Even as high-level languages provide increasingly abstract representations, it's worth keeping low-level features like goto2 available in the source representation. They let you prototype compiler optimizations without leaving the source language. They help you realize that optimizations are just a more powerful kind of Lisp macro.
1. Or Haskell or one of the functional languages. I just happen to be more familiar with Lisp.
2. Another example: Mu's addresses aren't really like C's pointers. They're as safe as a high-level language. But they let you simulate what happens below Mu without actually leaving Mu.
comments
Comments gratefully appreciated. Please send them to me by any method of your choice and I'll include them here.
This is most definitely NOT how to teach recursion.
One of the biggest problems with teaching recursion is that student confuse it with goto. So they assume a recursive call is equivalent to a goto that goes back to the beginning of the method code.
No, No, NO!!! I have to tell my students, that is NOT what it does. It creates a NEW method call with its OWN environment. So, the variables used in the recursive call will have the same names as in the original method call, but they are NOT the same variables.
Many students get hopelessly confused because they just don't get that. They commonly write recursive code where they suppose if a variable is declared inside the recursive method, there will be one variable of that name shared by all the recursive calls. Explaining recursion in terns of goto will just encourage this fundamental misunderstanding.
Sure, it works for tail recursion. But that's because tail recursion doesn't have the issue of the recursive call terminating and code execution returning to the previous environment. So with tail recursion, the key fact that recursion doesn't have shared variables doesn't make a difference.