A Recipe For Closure

People call all kinds of things “closures” that most definitely are not. Most recently I was reading the official Rust book and I thought I had woken up in an alternate dimension. Surely these very serious developers with one of the most popular new languages in the world wouldn’t confuse closures with lambda functions, would they? I went to Wikipedia and breathed a sigh of relief because that definition still agreed with mine.

I’ve been using questions about closures as a good way to judge how advanced a candidate is in JavaScript, but everyone gets confused about what I’m asking. Maybe I’m asking it wrong. So, for my own benefit, I want to explain what closures are through examples, and talk about the two language features that make them possible.

Function Variables (Lambdas)

More and more languages support functional programming features these days. Rust certainly does, albeit in its own strange way. Java added a bunch of useful functional programming stuff in version 8. Probably the most basic of these features is the ability to declare functions within other functions, assign them to variables, and pass them around to other functions.

In OOP, everything needs to be attached to an object. Crazy idea here, but objects are useless without functions. So the basic building blocks of OO are the functions and not the objects? And maybe some functions are so strong and independent that they don’t need to belong to objects. OO languages overcome this by making functions themselves objects, but only because they know no other way.

Here’s Java declaring a function variable:

import java.util.function.Function;

class WhoCares {

  // Not a typo, Java cares strongly about types, so this function only works with Integers
  public static int iterate(Function<Integer, Integer> loop, int n, int initialValue) {
    int value = initialValue;

    for (int i = 0; i < n; i++) {
      value = loop.apply(value);
    }

    return value;
  }

  public static void main(String[] args) {
    iterate(i -> i + 1, 5, 0); // Returns 5
    iterate(i -> i == 1 ? i - 1 : i + 1, 2, 0); // Returns 0
  }

}

Here’s a function that executes a function n times. Of course, it needs some starting value to work with as well. Lambdas in Java are declared like (arg1, arg2, ...argN) -> {expression that maybe returns a value}. The compiler figures out how to turn these into objects and it’s all a big mess I don’t want to get into. JavaScript is much easier since functions are first-class citizens and have been since the ‘90s.

// JavaScript is very loose with its typing, which is powerful and dangerous and confusing
function iterate(loop, n, initialValue) {
  var value = initialValue;

  for (var i = 0; i < n; i++) {
    value = loop(value);
  }

  return value;
}

// remember the '90s, returns 5
iterate(function addOne(i) { return i + 1; }, 5, 0);

// modern, screaming
iterate(value => value + value, 5, "a");

Simply declaring a function does not execute the function. Nothing happens until someone invokes it, and it is only then that the parameters are known. In the above example, the first parameter loop is invoked within a for-loop, with the output of its last run fed back in. When you’re reading this code, you have to mentally insert the body of the lambda like i -> i + 1 into the place where it’s eventually called to understand what’s going on. Let’s unwrap iterate(value => value + value) real quick:

// This example bakes in the function parameter { value => value + value }
function iterate(n, initialValue) {
  var value = initialValue;

  for (var i = 0; i < n; i++) {
    value = value + value;
  }

  return value;
}

If you work with these concepts enough, you’ll build intuition about them pretty quick.
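One way to check an unwrapping like this is to run both versions and compare. Here’s a minimal sketch, assuming the made-up names iterateWith (the generic version) and iterateDoubling (the version with the lambda baked in) so the two can live side by side:

```javascript
// Generic version: applies `loop` to the running value n times.
// (iterateWith is a hypothetical name chosen for this sketch.)
function iterateWith(loop, n, initialValue) {
  var value = initialValue;

  for (var i = 0; i < n; i++) {
    value = loop(value);
  }

  return value;
}

// Baked-in version: the body of value => value + value is inlined.
// (iterateDoubling is also a hypothetical name.)
function iterateDoubling(n, initialValue) {
  var value = initialValue;

  for (var i = 0; i < n; i++) {
    value = value + value;
  }

  return value;
}

var generic = iterateWith(function (value) { return value + value; }, 5, 1);
var bakedIn = iterateDoubling(5, 1);

console.log(generic, bakedIn); // 32 32
```

Starting from 1 and doubling five times gives 32 either way, which is all the “mental insertion” amounts to.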

That’s about all I want to say about these function variables. They’re powerful tools that open up new possibilities for asynchronous programming, improve testability, and let you build monads. The common “arrow” syntax for declaring them lets you build very expressive code. They’re also essential for implementing closures, but they’re just the first building block. To close the loop, we need the tail for the snake to bite.

(As an aside, function variables are what Rust refers to as closures. I’m still mystified as to why.)

Object-Oriented Scoping vs. Lexical Scoping

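Java and other OO languages traditionally restrict scope to one of a few categories: variables local to a method (including its parameters), instance members of the enclosing object, and static members of the enclosing class.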

As a Java or C# programmer, you probably have some intuition already when you see a variable as to whether or not it’s in scope when you refer to it.

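class JavaAgain {
  private int member;

  public int getMember() { // Legal: an instance method can see instance members
    return member;
  }

  public static int getMemberStatic() { // Illegal: there is no instance in a static method
    return member;
  }

  public static int getFive() { // C'mon: this member is just a local variable
    int member = 5;
    return member;
  }
}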

Lexical scoping allows for more than this. It harkens back to a simpler time when programs changed when you changed the order in which functional components arrived in a file, such as in C.

int addSevenMore(int n) {
  return addFive(n) + 7;
}

int addFive(int n) {
  return n + 5;
}

This code doesn’t work with a modern C compiler because addFive is declared after addSevenMore tries to reference it. To solve this, C has header files where you put function prototypes like int addFive(int n);. Then you put the bodies in a .c file so you can write code in any order.

Here’s the definition of “lexical” from a Google search:

adjective

  1. relating to the words or vocabulary of a language. (“lexical analysis”)

  2. relating to or of the nature of a lexicon or dictionary. (“a lexical entry”)

I guess if you squint real hard and substitute “symbols” or “variables” in for “words”, and “source code” in for “lexicon”, it starts to make some sense. This type of scoping relies on the layout of variables within the source code. It might seem a bit archaic to look at it like that in comparison to the beautiful abstraction imposed by OO languages, but trust me when I say it has some powerful implications.

And besides, OO languages started doing it too, albeit with some restrictions.

Lexical Scope and Nested Functions

JavaScript’s lexical scoping is the flavor you’re most likely to actually use, so I’ll focus on it for this section.

Consider the following code:

function outer() {
  var x = 0;

  function middle() {
    var y = 1;

    function inner() {
      var z = 2;

      return x + y + z;
    }

    return inner();
  }

  return middle();
}

console.log(outer());

The function inner takes no parameters at all, yet it refers to variables x and y, which live in the function scopes of outer and middle rather than in inner. Lexical scoping is at work here. Because the function inner is declared inside function middle, it has access to variables in middle, and because middle is declared inside outer, inner can see the variables in outer too. JavaScript does a read-through of code before executing it, so it knows all the variable names that will exist, so you can refer to names that are declared further down.
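That read-through is usually called hoisting. Here’s a small sketch of what it does and doesn’t give you, using made-up names (hoistingDemo, later, count): a function declaration can be called above the line where it appears, while a var declared further down already exists as a name but has no value until its assignment runs.

```javascript
// hoistingDemo, later and count are hypothetical names for this sketch.
function hoistingDemo() {
  // `later` is a function declaration, so it is usable before its definition.
  var early = later();

  // `count` is declared with var further down. The name already exists here,
  // but the assignment has not run yet, so its value is still undefined.
  var seenBeforeAssignment = typeof count;

  var count = 3;

  function later() {
    return "called before defined";
  }

  return [early, seenBeforeAssignment, count];
}

console.log(hoistingDemo());
```

The returned array holds "called before defined", then "undefined", then 3.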

So how does JavaScript do this? Each function remembers the scope it was declared in, and those scopes chain together from the innermost function out to the global scope. If you ask for a variable, it walks all the way up that scope chain till it reaches the top (which is window in your browser or global in node). If it finds something, that’s what you get.

When a function finishes executing, that function’s scope is done and the variables inside it become eligible for garbage collection. JavaScript only walks up the scope chain to find your variables. It doesn’t walk down.
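The one-way nature of the lookup is easy to see in a short sketch. The names here (outerScope, innerScope, outerValue, innerValue) are made up for illustration:

```javascript
// All names in this sketch are hypothetical.
function outerScope() {
  var outerValue = "outer";

  function innerScope() {
    var innerValue = "inner";

    // Looking up works: outerValue lives in an enclosing scope.
    return outerValue + " seen from " + innerValue;
  }

  var lookedUp = innerScope();

  // Looking down does not: innerValue is not visible from here.
  // typeof is used so the failed lookup reports "undefined" instead of throwing.
  var lookedDown = typeof innerValue;

  return [lookedUp, lookedDown];
}

console.log(outerScope());
```

The first entry is "outer seen from inner" and the second is "undefined", because the enclosing function never gets to peek into the nested one.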

Here’s something which does not work, to contrast and illustrate the lexical-ness of my previous example:

function outer() {
  var x = 0;

  return middle();
}

function middle() {
  var y = 1;

  return inner();
}

function inner() {
  var z = 2;
  
  return x + y + z;
}

console.log(outer());

Everything works until return x + y + z, which throws a ReferenceError. For instance, when middle refers to inner, JS doesn’t find it in local scope, so it walks up the scope chain to the outer scope where all three of my functions are declared, finds inner, and done. However, there is no x or y declared in that outer scope. Even though outer and middle are still on the call stack when inner runs, inner can’t see their variables because it wasn’t declared inside them: they’re not in the same lexical scope.

If some other code on my page declared variables x or y in the global scope, and my three functions were also in the global scope, inner would get those versions of the variables, instead of the ones I intended. That’s one of the dangerous implications of this type of scoping, but there are ways to completely solve that problem.
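Here’s a sketch of both the problem and one common way around it, which is to wrap your code in a function so your names stay local. The names readX and isolated are made up, and globalThis stands in for “some other code on my page” so the sketch behaves the same in a browser and in node:

```javascript
// Pretend an unrelated script declared a global with a common name.
globalThis.x = 100;

// readX is a hypothetical function with no local x of its own.
function readX() {
  // The lookup finds nothing locally, keeps walking, and ends at the global.
  return x;
}

// isolated is a hypothetical wrapper that keeps its own x private.
function isolated() {
  // This local x shadows the global, so nested code never sees the 100.
  var x = 0;

  function readLocalX() {
    return x;
  }

  return readLocalX();
}

console.log(readX());    // 100
console.log(isolated()); // 0
```

Modules give you the same isolation without the wrapper function, but the idea is identical: the closer declaration wins.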

Anyway, maybe you’ve already noticed that there’s an interesting intersection of functionality between function variables and lexical scoping. That’s where closures come from.

Finding Closures

Consider this example:

function constructor(initialValue) {
  var heldValue = initialValue;

  function incrementHeld() {
    heldValue += 1;
    return heldValue;
  }

  return incrementHeld;
}

var incrementHeld = constructor(1);

for (var i = 0; i < 10; i++) {
  console.log(incrementHeld());
}

When you call constructor, it saves the value you pass in. Then, it gives you back a function that you can call to increment and return that value. My for-loop cannot access heldValue directly. The function constructor has finished executing by the time the for-loop executes. By the rules laid out for lexical scoping, once a function’s execution is over, JS clears out the variables it was referring to… or does it?

Let’s say we’re designing a language and writing requirements for it. We write requirements for nested functions and lexical scoping. Now the developer comes back to us during implementation and says “hey, what happens when you maintain a reference to a function that refers to a variable in an outer lexical scope?” Our initial thought might be to just say “well, that reference doesn’t work anymore” because it’s simpler. But our developer says “it’s no problem to keep track of that variable though”. Then we can start thinking about what keeping these variables around longer would mean.

This phenomenon is known as closure. JS does not release memory for variables that are referred to inside of a scope that is still alive. These variables are “closed over”. In my example, because constructor returns a reference to incrementHeld, its scope is considered to be alive. Even if JS doesn’t execute incrementHeld inside constructor, it still parses and understands it. It knows that incrementHeld refers to the variable heldValue, so it keeps that variable around for as long as incrementHeld is reachable. The precise details are beyond the scope of this article, which is meant to be more general than just JS.
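One consequence worth seeing in code: every call to the outer function creates a fresh scope, so each returned function closes over its own copy of the variable. A minimal sketch, where makeCounter is a made-up name for the same pattern as constructor above:

```javascript
// makeCounter is a hypothetical name; the pattern matches constructor above.
function makeCounter(initialValue) {
  var heldValue = initialValue;

  return function incrementHeld() {
    heldValue += 1;
    return heldValue;
  };
}

var countFromOne = makeCounter(1);
var countFromTen = makeCounter(10);

console.log(countFromOne()); // 2
console.log(countFromOne()); // 3
console.log(countFromTen()); // 11
console.log(countFromOne()); // 4
```

Calling countFromTen does nothing to the value held by countFromOne. They share code, not state.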

Improving Java

I don’t want to spend too long here, but closures vastly improve some parts of Java that are traditionally awful. Like threads.

ExecutorService executor = new ExecutorServiceImplementation();

int someState = 1;
ServiceClient client;

executor.submit(() -> client.post(someState));

This example glosses over many things, like handling the response from client.post. In a nutshell, it’s making an async POST request with my theoretical service client, and passing in a variable. The single argument to ExecutorService.submit is a function, and because of closures, it can refer to variables declared in a lexical scope. In this case, both client and someState are closed over. Java has the restriction that variables in closure must be final, or effectively final, meaning you cannot reassign them within the same lexical scope. It’s a fairly easy rule to follow, considering this is Java we’re talking about.

Here’s how you’d have to do it without closures:

class BusinessLogic {

  void logic() {
    ExecutorService executor = new ExecutorServiceImplementation();

    int someState = 1;
    ServiceClient client;

    executor.submit(new ServicePoster(client, someState));
  }

  private static class ServicePoster implements Runnable {
    private final ServiceClient client;
    private final int postValue;

    public ServicePoster(ServiceClient client, int postValue) {
      this.client = client;
      this.postValue = postValue;
    }

    @Override
    public void run() {
      client.post(postValue);
    }
  }
}

You’d need a new Runnable object for each method on your service client you wanted to execute in this manner. Maybe you’d make a generic one with a factory to produce Runnables, but regardless, it’s a mess compared to closures. The closure example also demonstrates how expressive these things can be.

A Closure Interview Question

You don’t have to know all the details about closures to know that they work, and to know it when you see it. Of course I encourage people to go deeper on these topics, which is why I’m writing this article. The superficial knowledge is useful and, I’d argue, required for any JavaScript developer. With that in mind, here’s my very simple interview question about closures in JS:

function toy() {
  var x = 0;

  setTimeout(function () { console.log(x); }, 100);
  x += 1;
}

toy();

I ask the following questions:

  1. What gets logged in the console and why?
  2. What happens when I change the second argument to setTimeout to 0?
  3. Point to the closure.

And these are the answers:

  1. It logs “1” because setTimeout executes 100ms (roughly) after x is incremented.
  2. It still logs “1”. The explanation can get pretty deep and involves the event loop if they’re an advanced JS developer.
  3. The closure is the reference to x inside the function parameter to setTimeout.
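The same behavior can be reproduced without a timer, which may make the third answer easier to see. This is only a sketch with a made-up name, makeReader: the function is created before the increment and called after it, just like the callback handed to setTimeout.

```javascript
// makeReader is a hypothetical stand-in for toy, with the timer removed.
function makeReader() {
  var x = 0;

  // The function closes over the variable x itself, not a snapshot of its value.
  var readX = function () { return x; };

  x += 1;

  return readX;
}

var readX = makeReader();

console.log(readX()); // 1
```

If closures copied values at the moment the function was created, this would log 0. It logs 1 because the function and the increment are both talking about the same x.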

Why I Don’t Like This Question

It tests only knowledge, not application. I haven’t formulated a good JS interview question that asks the candidate to solve a problem requiring the use of these functional programming features. Those sorts of interview questions are much more useful, but they also take longer to develop and test against candidates.

Appendix - Rust’s closures

Rust supports closures in a very odd way, in my opinion. I am not a Rust developer, and I don’t know the implications of these restrictions on code people will actually write. So that’s my disclaimer. I’m not criticizing Rust for anything except referring to function variables as closures, which is actually wrong. I’m just pointing out that I’ve never seen this sort of restriction before.

Rust is a systems programming language that is designed to be very “safe”. One of the ways they accomplish this is by clearly defining the ownership of variables. A variable is owned by one and only one scope at a time. Ownership transfer is possible, but restricted. Here’s the example from their documentation:

fn main() {
    let x = vec![1, 2, 3];

    let equal_to_x = move |z| z == x;

    println!("can't use x here: {:?}", x);
}

So they define a vector and assign it to x, then define a function equal_to_x that takes a parameter and compares it to x. The move keyword tells Rust to move x into equal_to_x. The implication of this is that the println! line fails to compile. main doesn’t own x anymore and can’t use it after equal_to_x is defined. That’s very lexical, and very weird. It’s also a closure because you can return a reference to equal_to_x or pass it off to another function. equal_to_x uses a closure, but is not itself a closure. And that’s the end of that.