Post

Building a Register VM Interpreter - Ch. 11: Variables and Scoping

This guide is about building a register-based VM interpreter from scratch. In this article, we are going to teach the compiler how to navigate both of these worlds. We will implement global variable routing, introduce the mechanics of Lexical Scoping to track blocks, and finally unleash the full speed of our register allocator.

Building a Register VM Interpreter - Ch. 11: Variables and Scoping

Previously, we built a highly optimized Robin Hood Hash Table. The most obvious use for a hash table is maitaining an environment. If a user writes var x = 1, we could theoretically just hash the string x and store the value 1 into a universal hash table. However, if we did this for every variable, the VM would hit a massive performance wall.

In a high-level scripting language, variables are accessed relentlessly. Imagine a simple while loop that increments a counter. If the VM is forced to calculate an xxHash, probe a collision array, and resolve a memory pointer on every single iteration just to read and update that counter, the CPU overhead will completely paralyze the execution speed.

To solve this, a modern virtual machine must split variables into two entirely different types: Globals and Locals.

Globals exist at the top level of a script. Because they can be declared anywhere, accessed from anywhere, and even imported from modules, their exact location in memory cannot be safely predicted during a single pass. Therefore, globals are Late-Bound. They live on the heap. When the VM needs to read a global, it must dynamically searching by its string name at runtime. This is exactly what the Hash Table is used for.

Locals are entirely different. They are confined within a block of code, their lifespan and visibility are completely predictable. The compiler knows exactly when a local is born and exactly when it dies. Therefore, the compiler can perform optimization called Name Erasure. When the compiler parses a local, it does not create a string object. It does not touch the hash table. Instead, it looks at the internal allocator and reserve the next empty register slot (for example, Reg 5). From that moment onward, the compiler intercepts every single time the user typed the local string in that block, and actively hardcodes the integer 5 directly into the emitted bytecode. The VM does not known what local is, it only sees raw register index. Fetching a local completely bypasses the hash and is reduced to a direct, $O(1)$ array lookup.

In this article, we are going to teach the compiler how to navigate both of these worlds. We will implement global variable routing, introduce the mechanics of Lexical Scoping to track blocks, and finally unleash the full speed of our register allocator.

Globals

When a variable is declared at the top level of the script, it becomes a global. The parser can easily knows it is at the top level simply by checking if the current scope depth is zero.

Let’s look at how the compiler processes a statement like var x = 1;.

When the compiler meets the var token, it routes execution to varDecl() in compiler/compiler.c.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
static void varDecl(Compiler* compiler){
    int global = parseVar(compiler, "Expect variable name.");

    if(compiler->scopeDepth > 0){
        int reg = compiler->locals[compiler->localCnt - 1].reg;
        if(match(compiler, TOKEN_ASSIGN)){
            ExprDesc initExpr;
            expression(compiler, &initExpr);
            expr2Reg(compiler, &initExpr, reg);
        }else{
            emitABC(compiler, OP_LOADNULL, reg, 0, 0);
        }
        consume(compiler, TOKEN_SEMICOLON, "Expect ';' after variable declaration.");
        defineVar(compiler, global);
    }else{
        if(match(compiler, TOKEN_ASSIGN)){
            ExprDesc initExpr;
            expression(compiler, &initExpr);
            storeVar(
                compiler,
                &(ExprDesc){
                    .type = EXPR_GLOBAL,
                    .data.loc.index = global
                },
                &initExpr
            );
        }else{
            int tmp = getFreeReg(compiler);
            emitABC(compiler, OP_LOADNULL, tmp, 0, 0);
            emitABx(compiler, OP_SET_GLOBAL, tmp, global);
        }
        consume(compiler, TOKEN_SEMICOLON, "Expect ';' after variable declaration.");
        defineVar(compiler, global);
    }
}

The very first thing is figuring out the name of the variable by parserVar().

1
2
3
4
5
6
7
8
static int parseVar(Compiler* compiler, const char* err){
    consume(compiler, TOKEN_IDENTIFIER, err);
    if(compiler->scopeDepth > 0){
        declLocal(compiler);
        return 0;
    }
    return identifierConst(compiler);
}

If you trace parseVar down the identifierConst, you will see exactly how this happens:

1
2
3
4
5
static int identifierConst(Compiler* compiler){
    Token* name = &compiler->parser.pre;
    Value strVal = OBJECT_VAL(copyString(compiler->vm, name->head, name->len));
    return makeConstant(compiler, strVal);
}

The compiler takes the raw text of the token (name->head), allocates a dynamic ObjectString on the heap, and inserts the string into the constant array.

To the VM, the name of a global is treated exactly same as a string literal.

Emitting Bytecode

Once the compiler has stored the name of the global, it compiles the initializer expression (=1;) into a temporary register. Finally, it delegates to storeVar to emit the instruction:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
static void storeVar(Compiler* compiler, ExprDesc* var, ExprDesc* val){
    switch(var->type){
        // ...
        case EXPR_GLOBAL:{
            expr2NextReg(compiler, val);
            emitABx(
                compiler, 
                OP_SET_GLOBAL, 
                val->data.loc.index, 
                var->data.loc.index
            );
            break;
        }
        // ...
        default:
            errorAt(compiler, &compiler->parser.pre, "Invalid assignment target.");
            break;
    }

    freeExpr(compiler, val);
}

This emits an OP_SET_GLOBAL instruction. It takes two arguments:

  • A: The register holding the evaluated value (e.g., 1).

  • Bx: The constant index holding the string name ("x").

At runtime, the VM executes OP_SET_GLOBAL by reading the string from the constant pool, grabbing the value from the Register A, and invoking the hash table th insert the key-value pair into vm->globals:

1
2
3
4
5
DO_OP_SET_GLOBAL:
{
    ObjectString* name = AS_STRING(K(GET_ARG_Bx(instruction)));
    tableSet(vm, vm->curGlobal, OBJECT_VAL(name), R(GET_ARG_A(instruction)));
} DISPATCH();

Reading a global variable follows the exact same reverse path. When the compiler parses a raw identifier like x in an expression, and determines it is not a local variable, it emits an OP_GET_GLOBAL. The VM takes the string from the constant table, hashes it, probes vm->globals, and moves the found value into a destination register.

1
2
3
4
5
6
7
8
9
10
11
12
DO_OP_GET_GLOBAL:
{
    ObjectString* name = AS_STRING(K(GET_ARG_Bx(instruction)));
    Value value;
    if(!tableGet(vm, vm->curGlobal, OBJECT_VAL(name), &value)){
        if(!tableGet(vm, &vm->modules, OBJECT_VAL(name), &value)){
            runtimeError(vm, "Undefined global variable '%s'.", name->chars);
            return VM_RUNTIME_ERROR;
        }
    }
    R(GET_ARG_A(instruction)) = value;
} DISPATCH();

The Cost

Every time the VM executes OP_GET_GLOBAL, it should:

  • Fetch the instruction

  • Decode the Bx

  • Look up the ObjectString pointer

  • Read the vlaue

  • Write register

If a developer writes a loop that incremetns a global counter thousands of times, the VM is forces to execute the massive lookup process for thousands of times just to maintain a counter.

This architectural bottleneck is exactly why modern languages minimize global scope. To achieve high performance, we need a way to completely bypass the hash table for temporary, block-level data. We need to inplement the Early-Bound Lexical Scoping and Local.

Lexical Scoping

A crucial concept to understand in VM is that the Virtual Machine knows absolutely nothing about scopes. Scopes, blocks, and curly braces do not exist at runtime. Lexical scoping is a shield constructed entirely by the compiler to manage the memory.

To track this, Compiler struct maintains an integer int scopeDepth. Whenever the parser encounters a {, it calls beginScope():

1
2
3
static void beginScope(Compiler* compiler){
    compiler->scopeDepth++;
}

When the block meets a }, it calls the endScope() to decrements the scopeDepth:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
static void endScope(Compiler* compiler){
    compiler->scopeDepth--;
    /*
    while(compiler->localCnt > 0 && 
        compiler->locals[compiler->localCnt-1].depth > compiler->scopeDepth){
            int closedReg = compiler->locals[compiler->localCnt - 1].reg;
            emitABC(compiler, OP_CLOSE_UPVAL, closedReg, 0, 0);
            compiler->localCnt--;
    }
    */

    if(compiler->localCnt > 0){
        compiler->freeReg = compiler->locals[compiler->localCnt - 1].reg + 1;
    }else{
        compiler->freeReg = 1;
    }
}

Any variable declared while scopeDepth > 0 is a local variable.

Local Array

Because local variables aren’t written to the global hash table, the compiler needs a way to remember them while it is compiling the block. It does this using a simple, flat array of Local structs in compiler/compiler.h:

1
2
3
4
5
typedef struct{
    Token name;
    int depth;
    int reg;
}Local;

As the compiler marches through the source code, it pushes new variables onto the compiler->locals array. It records the exact text of the token (the variable’s name), the current scopeDepth at the time it was declared, and most importantly, the physical register assigned to it.

Locals

Let’s see what happens when the compiler parses var y = 0; inside a block. Because the scopeDepth > 0, the varDecl() routes executuion to declLocal() instead of creating a string constant:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
static void varDecl(Compiler* compiler){
    int global = parseVar(compiler, "Expect variable name.");

    if(compiler->scopeDepth > 0){
        int reg = compiler->locals[compiler->localCnt - 1].reg;
        if(match(compiler, TOKEN_ASSIGN)){
            ExprDesc initExpr;
            expression(compiler, &initExpr);
            expr2Reg(compiler, &initExpr, reg);
        }else{
            emitABC(compiler, OP_LOADNULL, reg, 0, 0);
        }
        consume(compiler, TOKEN_SEMICOLON, "Expect ';' after variable declaration.");
        defineVar(compiler, global);
    }else{

    // ...

    }

// ...

}

The compiler maintains a running tally of available registers. When a local is born, the compiler simply hands it the next available slot. If freeReg was 5, the y is assigned to register 5. The compiler then calls reserveReg(compiler, 1), incrementing freeReg to 6 so the next operation doesn’t overwrite it.

Name Erasure

The name y is only saved in the Local struct so that the compiler can recognize it later in the same block. It is never emitted into the bytecode.

When the user writes a i = i + 1;, the compiler searches its locals array, finds that the i is tied to Register 5 (for example), and blindly emits bytecode that says: OP_ADD 5 5 1 (which means add the value in register 1 to register 5, and store it back in register 5).

To the executing VM, there are no variables. There is only a flat array of registers being manipulated. This completely eliminates the hash table lookups, making locals access fast.

Look closely at how handleVar() delegates the lookup:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
static void handleVar(Compiler* compiler, ExprDesc* expr, bool canAssign){
    Token* name = &compiler->parser.pre;

    int index = resolveLocal(compiler, name);
    if(index != -1){
        initExpr(expr, EXPR_LOCAL, index);
    }
    /*
    else if((index = resolveUpvalue(compiler, name)) != -1){
        initExpr(expr, EXPR_UPVAL, index);
    }else{
        index = identifierConst(compiler);
        initExpr(expr, EXPR_GLOBAL, index);
    }
    */
}

Instead of hashing the name y, the compiler passes the raw token to resolveLocal(). This function loops backward through the compiler->locals array to find a matching string:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
static int resolveLocal(Compiler* compiler, Token* name){
    // search matched local variable
    for(int i = compiler->localCnt-1; i >= 0; i--){
        Local* local = &compiler->locals[i];
        if(name->len == local->name.len && 
            memcmp(name->head, local->name.head, name->len) == 0){
                if(local->depth == -1){
                    errorAt(compiler, name, "Cannot read local variable");
                }
                return local->reg;
        }
    }
    return -1;
}

Notice what resolveLocal() actually returns: return local->reg;. It does not return the string, and it does not return the Local struct itself. It solely returns the integer representing the physical register slot (e.g., 5).

Back in handleVar(), the compiler packages that integer into an ExprDesc state: initExpr(expr, EXPR_LOCAL, index);. This is the exact moment the name is erased.

From this point forward in the compilation pipeline, the string y is completely forgotten. When the compiler finished parsing the +1 and is ready to emit the addition instruction, it just looks at the ExprDesc, sees EXPR_LOCAL with a register index 5, and blindly writes an OP_ADD instruction to the Chunk using 5 as the operand.

The name is never allocated on the heap, never written to the constant table, and never seen by the VM at runtime.

Variable Resolution & Shadowing

By completely erasing the name of a local variable and replacing it with a register index, the compiler guarantees absolute speed during execution. But this name erasure introduces a logistical question for the compiler: How does it know which register to use if the user reuses a variable name?

In modern programming languages, you can declare a variable with the same name in an inner block, temporarily shadowing the outer one.

For example:

1
2
3
4
5
6
7
8
9
var x = "global";
{
    var x = "outer";
    {
        var x = "inner";
        print x;         # Prints "inner"
    }
    print x;             # Prints "outer"
}

To see the compiler handles this, we have to look at the Scope Hierarchy implemented in the handleVar():

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
static void handleVar(Compiler* compiler, ExprDesc* expr, bool canAssign){
    Token* name = &compiler->parser.pre;

    int index = resolveLocal(compiler, name);
    if(index != -1){
        initExpr(expr, EXPR_LOCAL, index);
    }
    // else if((index = resolveUpvalue(compiler, name)) != -1){
    //     initExpr(expr, EXPR_UPVAL, index);
    // }
    else{
        index = identifierConst(compiler);
        initExpr(expr, EXPR_GLOBAL, index);
    }
}

This if/else chain is the gatekeeper of variable resolution. When the parser encounters an identifier like y, it strictly checks the innermost world first.

But what about local variables shadowing other local variables?

Notice the loop initialization in resolveLocal:

1
for(int i = compiler->localCnt-1; i >= 0; i--)

The compiler->locals array acts as a contiguous stack. As the compiler dives deeper into nested blocks and encounters new variables, it pushes them to the end of the array (higher indices).

When it is time to resolve a name, the compiler does not search from the beginning. It searches backward from localCnt - 1 down to 0. By walking backward through history, the compiler is mathematically guaranteed to bump into the most recently declared variable matching that name first.

As soon as memcmp finds a match, the function instantly returns that variable’s local->reg and aborts the loop. The older, shadowed variables sitting at lower indices in the array are completely ignored. You don’t need a complex tree of scope nodes or nested hash maps to handle lexical scope. A single, flat C-array and a reverse for loop inherently solve the entire hierarchy problem with zero overhead.

Exiting the Scope

Variables are born when a scope opens, and they must die when a scope closes. When the parser hits a closing brace, it signals the end of the current block by calling endScope().

This function performs two critical cleanup operations: it wipes the variables from the compiler’s memory, and it physically reclaims their assigned register slots.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
static void endScope(Compiler* compiler){
    compiler->scopeDepth--;
    while(compiler->localCnt > 0 && 
        compiler->locals[compiler->localCnt-1].depth > compiler->scopeDepth){
            int closedReg = compiler->locals[compiler->localCnt - 1].reg;
            emitABC(compiler, OP_CLOSE_UPVAL, closedReg, 0, 0);
            compiler->localCnt--;
    }

    if(compiler->localCnt > 0){
        compiler->freeReg = compiler->locals[compiler->localCnt - 1].reg + 1;
    }else{
        compiler->freeReg = 1;
    }
}

Because we add variables sequentially, all variables declared in the dying scope will be grouped together at the very top of the array. The compiler loops backwards, checking if the variable’s recorded depth is strictly greater than the new scopeDepth. If it is, that variable is out of bounds. The compiler literally just substracts, popping it off the stack. You might notice the emitABC(compiler, OP_CLOSE_UPVAL...) instruction here. While the compiler forgets the variable name, the VM might need to keep the actual data alive if an inner closure captured it. We emit this instruction to warn the VM to move the data from the register to the heap before it gets overwritten. We will dive deep into this mechanism in the later Closures chapter.

Register Reuse

Because every local variable is deterministically tied to a physical register index, when a block ends, all registers utilized by that block are suddenly freed. The compiler doesn’t need to track individual registers. It simply looks at the last surviving variable in the outer scope (compiler->locals[compiler->localCnt - 1]), checks which register it occupies, and resets compiler->freeReg to the very next slot (.reg + 1).

Imagine an outer block uses register 1 and 2. An inner block then opens and uses register 3 and 4. Exiting the inner block instantly rollback the freeReg back to 3. Then next inner block of code will simply overwrite register 3 and 4 with entirely new data.

Wrapping UP

Now that we have the infrastructure to safely store and retrieve data from isolated, reusable registers, we are ready to take these registers and package them into sliding “windows” of execution.

This post is licensed under CC BY 4.0 by the author.