Post

Building a Register VM Interpreter - Part 9: The Control Flow

This guide is about building a register-based VM interpreter from scratch. We will upgrade the VM to understand Jump instructions, and more importantly, we will teach the single-pass compiler how to calculate exactly where those jumps should land, even when it hasn't finished compiling the code yet.

Building a Register VM Interpreter - Part 9: The Control Flow

Right now this is not yet a programming language, it is just an over-engineered calculator. This interpreter can currently take a linear sequence of mathematical instructions, push data into the VM, crunch the numbers, and print the result to the screen.

A true programming language must have the ability to handle the uncertainty. It must be able to make dicisions based on conditions (if/else), and it must be able to repeat tasks until a specific condition is met (while/for). In computer science, a system that can simulate any arbitrary logic or algorithm, provided it has enough time and memory, is considered Turing Complete.

But here is where the illusion of high-level programming breaks down. When we write code in C, Java, or this language, we use visual structures to represent control flow. We use curly braces {}, indentation, and keywords to create blocks of code. We think of the if statement as a “box”” that the program either enters or not.

To a VM, much like a physical CPU, knows nothing about curly braces, blocks, or scopes. A program is just a flat, contiguous array of 32-bit integers, the Chunk of bytecode. The only thing driving the executation is the Instruction Pointer (ip), a tiny C pointer. Therefor, at the lowest architectural level, control flow is not about entering or exiting a “box”. It is about Teleportation. To inplement an if statement, we don not wrap the code in a special container, we simply instruct the ip to skip over a part of memory. To inplement a while loop, we do not create a circular structure, we just instruct the ip to jump backward to an address and run again.

In this article, we are going to bridge this massive gap between high-level syntax and low-level memory jumping. We will upgrade the VM to understand Jump instructions, and more importantly, we will teach the single-pass compiler how to calculate exactly where those jumps should land, even when it hasn’t finished compiling the code yet.

The Jump Instructions

Before teaching the compiler how to route control flow, we must firste equip the VM with the physical capability to perform jumps.

As we established, the execution of the program is entirely driven bu the ip. In the run() dispatch loop, the DISPATCH() macro automatically fetches the current instruction and increments the ip pointer to the next one in the array. If we want to skip a block of code, like jumping over an else block, or repeat a block of code, all we have to do is manually move the ip pointer by adding or subtracting an offset.

To achieve this, we will have to introduce two kinds of jump operations to the VM’s dispatch table:

  • Uncoditional Jump: OP_JMP, it always move the pointer.

  • Conditional Jump: OP_JMP_IF_FALSE and OP_JMP_IF_TRUE, they moves the pointer only if a specified register is holding a falsy/truthy value.

The Uncoditional Jump

Let’s start with the simplest of these jumps: OP_JMP.

Because a jump needs to tell the ip exactly how far to move, it requires an operand that represents an offset. Furthermore, we need to jump both forward and backward, this offset must be a signed integer.

In this 32-bit instruction format, we use the sBx argument format. This utilizes the 18 bits normally shared by the B and C operands to store a single, large, and singed integer, giving us a quite enough maximum jump range of roughtly $\pm$131071 instructions.

Here is the implementation inside the run() fuction invm/vm.c:

1
2
3
4
5
DO_OP_JMP:
{
    int sBx = GET_ARG_sBx(instruction);
    frame->ip += sBx;
} DISPATCH();

It is beautifully simple. We extract the signed offset sBx using our decoding macros, and we simply add it to the current frame’s Instruction Pointer frame->ip. If sBx is 5, the VM skips the next 5 instructions. If sBx is -10, it rewinds execution by 10 instructions.

Conditional Jumps

Unconditional jumps are great for loops and skipping else blocks, but the core of decision-making lies in checking a condition first.

For OP_JMP_IF_FALSE and OP_JMP_IF_TRUE, the instruction needs two pieces of information:

  • Which register is holding the condition we want to evaluate

  • How far should we jump

We use the 8-bit A argument to store the target register index, and we use the 18-bit sBx argument to store the offset:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
DO_OP_JMP_IF_FALSE:
{
    int reg = GET_ARG_A(instruction);
    int sBx = GET_ARG_sBx(instruction);
    if(!isTruthy(R(reg))){
        frame->ip += sBx;
    }
} DISPATCH();

DO_OP_JMP_IF_TRUE:
{
    int reg = GET_ARG_A(instruction);
    int sBx = GET_ARG_sBx(instruction);
    if(isTruthy(R(reg))){
        frame->ip += sBx;
    }
} DISPATCH();

We fetch the condition dynamically at runtime using the R() macro, which accesses the current call frame’s register window. If the jump condition is met, we add sBx to the ip.

The isTruthy is a helper function in vm/vm.c:

1
2
3
4
5
6
static bool isTruthy(Value value){
    return !(IS_NULL(value) ||
            (IS_BOOL(value) && !AS_BOOL(value)) ||
            (IS_NUM(value) && AS_NUM(value) == 0));
    // NULL is considered falsy
}

In strongly typed languages like C, conditions often must evaluate strictly to integers or booleans. But in dynamically typed languages, users often write code like if (user_name) or if (0), expecting the language to implicitly convert strings, objects, or numbers into boolean logic. According to this function, a null value is falsy, a boolean false is falsy (naturally), the number value 0 is also falsy. Everything else is truthy. Strings, lists, objects, and non-zero numbers will all successfully trigger an if block.

With the engine now fully capable of routing execution based on dynamic conditions, the heavy lifting shifts entirely to the compiler. We now have to figure out how to calculate those sBx offsets on the fly.

Backpatching

With the VM now equipped to jump the memory, the responsibility shifts entirely to the compiler. The compiler’s job is to read the source code and emit proper jump instructions with calculated sBx offsets.

If we were building a compiler with a multi-pass mechanism that constructed an AST, this would be trivial. The compiler could simply look at the IfNode, traverse the entire body branch to count exactly how many instructions it contains, and then emit the jump with clear foresight.

But we are building a single-pass compiler, a Pratt parser. We do not have an AST tree, and we have absolutely no foresight. We parse a token and we emit a bytecode immediately. This achitecture choise brings a massive temporal paradox when compiling control flow.

Let’s take a look at this example:

1
2
3
if(user.age < 18){
    print "Access Denied";
}

The compiler reads if, parses the condition user.age < 18, and evaluates it into a temporary register. Because we execute instructions linearly, the conpiler must immediately emit an OP_JMP_IF_FALSE here, so the VM can know to evaluate the jump before hitting the branch body. One thing we should know is how far we should jump. The compiler is forced to emit a jump instruction without knowing the destination.

EXPR_TBD

We have actually encountered a variation of this delayed execution problem before. In the previous part, when compiling an expression like x + 1 before knowing where the result was supposed to go, we introduced EXPR_TBD state.

Instead if guessing a destination register (actually should not), the compiler emitted an incomplete OP_ADD with 0 as the target, saved the exact index of that instruction in the Chunk array, and reached back into the array to patch the correct register later.

Now we are going to use the same technique for control flow, a classic compiler trick called Backpatching.

Lifecycle

We just simply emit a dummy instruction and leave a placeholder.

When the compiler hits the point where a jump is required, it emits the OP_JMP_IF_FALSE instruction, but it passed a 0 for the sBx offset. The compiler now records the current index of the Chunk’s instruction array where this dummy jump was just written. Then, the compiler merrily continues parsing the body block ({ ... }). As it compiles the statements inside the block, it naturally appends more and more instructions to the Chunk, advancing the internal counter.

Once the compiler hits the closing }, the body is over. It now knows exactly how many instructions were generated. It calculates the distance, and finally reaches back in time, directly access the Chunk, safely overwrites the dummy 0 with the actual calculated distance.

To make this happen, we define two helper functions in the compiler backend:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
static int emitJmp(Compiler* compiler){
    int instructionIndex = emitAsBx(compiler, OP_JMP, 0, 0);
    return instructionIndex;
}


static void patchJump(Compiler* compiler, int jmpIndex){
    int jmpTarget = compiler->func->chunk.count;
    int offset = jmpTarget - jmpIndex - 1;
    if(offset > MAX_BX >> 1){
        errorAt(compiler, &compiler->parser.pre, "Too much code to jump over.");
    }
    Instruction* instruction = compiler->func->chunk.code;
    Instruction prevInstruction = instruction[jmpIndex];
    instruction[jmpIndex] = CREATE_AsBx(GET_OPCODE(prevInstruction), GET_ARG_A(prevInstruction), offset);
}

By separating the emitJump and patchJump operations, the compiler can confidently route control flow around code that hasn’t even been parsed yet.

Forward Jumps: if/else

With the emitJmp and patchJmp functions in the compiler’s toolbox, we are finally ready to parse the first control flow structure. Let’s start with a simple, standalone if statement without an else branch.

When the scanner consumes the if token, it hands the control over to the compiler’s ifStmt parsing function. The compiler’s task is to evalute the condition. It parse the expression inside the parenthese, generating the logic, and leaves an ExprDesc holding the result.

Peephole Optimization

In the previous part, we noted that comparison operations like OP_LT act as Test and Skip. To generate a boolean value for an expression like a < b, the compiler actually emits three instructions: OP_LT 1 A B, OP_LOADBOOL reg 1 1, and OP_LOADBOOL reg 0 0.

If the user writes if(a < b), forcing the VM to execute these three instructions just to push a boolean into the register, only to read the register with an OP_JMP_IF_FALSE, is highly inefficient.

To solve this, the compiler inspect the last three instructions written to the Chunk. If the compiler detects that the condition was a direct comparison, it reaches back into the bytecode array and actively rewrites history. It physically deletes the two OP_LOADBOOL instructions by decrementing the chunk count (compiler->func->chunk.count -= 2) and frees the temporary register. We don’t need a boolean register at all. Then, it overwrites the OP_LT/LE/EQ instruction in the array, changing its A operand to 0. The instruction now expects the comparison to be false. Instead of OP_JMP_IF_FALSE, it emits a standard OP_JMP and saves the index to elseJmp.

If the condition is not a comparison (e.g., if(isValid)), the compiler falls back to the standard elseJmp = emitJmpIfFalse(compiler, condReg);.

Here’s how the compiler routes it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
static void ifStmt(Compiler* compiler){
    consume(compiler, TOKEN_LEFT_PAREN, "Expect '(' after 'if'.");
    ExprDesc condition;
    expression(compiler, &condition);
    consume(compiler, TOKEN_RIGHT_PAREN, "Expect ')' after condition.");

    expr2NextReg(compiler, &condition);
    int condReg = condition.data.loc.index;
    int elseJmp = -1;

    int codeCnt = compiler->func->chunk.count;
    if(codeCnt >= 3){
        Instruction instFalse = compiler->func->chunk.code[codeCnt - 1];   // loadbool false
        Instruction instTrue = compiler->func->chunk.code[codeCnt - 2];    // loadbool true
        Instruction instCmp = compiler->func->chunk.code[codeCnt - 3];     // jump if false

        if(GET_OPCODE(instFalse) == OP_LOADBOOL && GET_OPCODE(instTrue) == OP_LOADBOOL){
            OpCode cmpOp = GET_OPCODE(instCmp);
            if(cmpOp == OP_LT || cmpOp == OP_LE || cmpOp == OP_EQ){
                compiler->func->chunk.count -= 2;
                freeRegs(compiler, 1);  // free targetReg

                // expect False
                int argA = GET_ARG_A(instCmp);
                int argB = GET_ARG_B(instCmp);
                int argC = GET_ARG_C(instCmp);
                compiler->func->chunk.code[codeCnt - 3] = CREATE_ABC(cmpOp, 0, argB, argC);

                elseJmp = emitJmp(compiler);
            }
        }
    }

    if(elseJmp == -1){
        elseJmp = emitJmpIfFalse(compiler, condReg);
    }

    freeExpr(compiler, &condition);
    stmt(compiler);
    if(match(compiler, TOKEN_ELSE)){
        int endJmp = emitJmp(compiler);
        patchJump(compiler, elseJmp);
        stmt(compiler);
        patchJump(compiler, endJmp);
    }else{
        patchJump(compiler, elseJmp);
    }
}

Compiling the Branches

With elseJmp securely holding the index of the dummy jump, the compiler proceeds to compile the then branch.

1
2
freeExpr(compiler, &condition);
stmt(compiler);

It blindly emits the if body instructions to the Chunk. Once stmt() returns, we must handle the potential else branch.

1
2
3
4
5
6
7
8
if(match(compiler, TOKEN_ELSE)){
    int endJmp = emitJmp(compiler);
    patchJump(compiler, elseJmp);
    stmt(compiler);
    patchJump(compiler, endJmp);
}else{
    patchJump(compiler, elseJmp);
}

If an else branch exists, we must inject an escape hatch (int endJmp = emitJmp(compiler);) at the end of the if body so execution doesn’t fall through.

Crucially, we patch elseJmp right before parsing the else body. This ensures that if the initial condition is false, the VM jumps over the if body and the escape hatch, landing perfectly at the first instruction of the else block. After the else block finishes compiling, we reach back to patch the endJmp escape hatch.

If there is no else block, it’s simple: we just patch elseJmp to point to the current end of the bytecode array.

Short-Circuiting and/or

Now we have the standard conditional branching, we need to address logical operators: and and or.

In many programming languages, these operators do not just evaluate to true or false; they return the actual value of one of the operands, and they do so using short-circuit evaluation.

For example, for a and b, if a is falsy, the expresion immediately returns a. It completely ignores b. If a is truthy, it evalutaes and return b. For a or b, if a is truthy, the expression returns a. It ignores b. If a is falsy, it evaluates and return b.

Because they skip over the code, and and or are not actually math or logic operations at the bytecode level, they are control flow.

If you look at other highly-optimized compilers like Lua’s, compiling these operators usually involved building complex linked list of trueJmp and falseJmp.

However, because our VM evaluates truthiness dynamically using registers, our compiler just handles this with an elegant and minimal trick: Register Overwriting.

Compiling and

When the Pratt parser encounters the and token, it calls the infix handleAnd rule. At this point, the left-hand side of the expression has already been parsed, and its ExprDesc state is passed into the function.

Here’s the inplementation in compiler/compiler.c:

1
2
3
4
5
6
7
8
9
10
11
12
13
static void handleAnd(Compiler* compiler, ExprDesc* expr, bool canAssign){
    expr2NextReg(compiler, expr);
    int endJmp = emitJmpIfFalse(compiler, expr->data.loc.index);

    // parse the right-hand side
    ExprDesc right;
    parsePrecedence(compiler, &right, PREC_AND);

    // overwrite the left side register
    expr2Reg(compiler, &right, expr->data.loc.index);
    freeExpr(compiler, &right);
    patchJump(compiler, endJmp);
}

For a and b, if a is true, the jump is ignored, the VM continues execution and evaluate b. The compiler parsed b and forces its result to be stored in register, entirely overwriting the now-useless truthy value of a.

Compiling or

The or operator works using the exact same register-overwriting strategy, just with the inverse jump logic.

1
2
3
4
5
6
7
8
9
10
static void handleOr(Compiler* compiler, ExprDesc* expr, bool canAssign){
    expr2NextReg(compiler, expr);
    // Jump if the left side is true
    int endJmp = emitAsBx(compiler, OP_JMP_IF_TRUE, expr->data.loc.index, 0);
    ExprDesc right;
    parsePrecedence(compiler, &right, PREC_OR);
    expr2Reg(compiler, &right, expr->data.loc.index);
    freeExpr(compiler, &right);
    patchJump(compiler, endJmp);
}

Instead of testing for false, we emit an OP_JMP_IF_TRUE instruction. If the left side is true, the VM short-circuits, jumps to the end, and leaves the truthy left value in the register. If it is false, it falls through, evaluates the right side, overwrites the register, and leaves the right value as the final result.

By integrating this logic directly into our Pratt parsing rules, we can chain massive sequences of logical checks like a and b or c and d without writing a single line of messy AST evaluation code. The expr2Reg function and the jump patcher handle the routing seamlessly on the fly.

Compiling while

A while loop is simply an if statement that possesses the ability to time-travel backward.

Remember the foundational principle: the VM knows nothing about blocks or nesting. It only knows how to move the ip through a flat array of instructions.

To create a while loop, we need to understand two jumps:

  • The Exit Jump: If the condition evaluates to false, we must immediately teleport over the body to exit this loop

  • Then Loop Jump: Once the body finished executing, we must jump backwards to re-evaluate the condition.

Here’s the entire implementation in the compiler/compiler.c:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
static void whileStmt(Compiler* compiler){
    int loopStart = compiler->func->chunk.count;

    if(compiler->loopCnt == LOOP_MAX){
        errorAt(compiler, &compiler->parser.pre, "Too many nested loops.");
        return;
    }

    Loop *loop = &compiler->loops[compiler->loopCnt++];
    loop->start = loopStart;
    loop->scopeDepth = compiler->scopeDepth;
    loop->breakCnt = 0;

    consume(compiler, TOKEN_LEFT_PAREN, "Expect '(' after 'while'.");
    ExprDesc condition;
    expression(compiler, &condition);
    consume(compiler, TOKEN_RIGHT_PAREN, "Expect ')' after condition.");
    expr2NextReg(compiler, &condition);

    int condReg = condition.data.loc.index;

    int exitJmp = -1;

    int codeCnt = compiler->func->chunk.count;
    if(codeCnt >= 3){
        Instruction instFalse = compiler->func->chunk.code[codeCnt - 1];   // loadbool false
        Instruction instTrue = compiler->func->chunk.code[codeCnt - 2];    // loadbool true
        Instruction instCmp = compiler->func->chunk.code[codeCnt - 3];     // jump if false

        if(GET_OPCODE(instFalse) == OP_LOADBOOL && GET_OPCODE(instTrue) == OP_LOADBOOL){
            OpCode cmpOp = GET_OPCODE(instCmp);
            if(cmpOp == OP_LT || cmpOp == OP_LE || cmpOp == OP_EQ){
                compiler->func->chunk.count -= 2;
                freeRegs(compiler, 1);  // free targetReg

                // expect False
                int argB = GET_ARG_B(instCmp);
                int argC = GET_ARG_C(instCmp);
                compiler->func->chunk.code[codeCnt - 3] = CREATE_ABC(cmpOp, 0, argB, argC);

                exitJmp = emitJmp(compiler);
            }
        }
    }

    if(exitJmp == -1){
        exitJmp = emitJmpIfFalse(compiler, condReg);
    }

    freeExpr(compiler, &condition);
    stmt(compiler);

    emitLoop(compiler, loopStart);
    patchJump(compiler, exitJmp);
}

Let’s break down how this is implemented.

Marking the Starting Line

Because of the single-pass design, we need to remember exactly where the loop begins so we can jump back to it later.

1
2
3
4
5
6
7
8
9
10
11
int loopStart = compiler->func->chunk.count;

if(compiler->loopCnt == LOOP_MAX){
    errorAt(compiler, &compiler->parser.pre, "Too many nested loops.");
    return;
}

Loop *loop = &compiler->loops[compiler->loopCnt++];
loop->start = loopStart;
loop->scopeDepth = compiler->scopeDepth;
loop->breakCnt = 0;

Before we look at the condition, we record the current bytecode offset compiler->func->chunk.count; into loopStart.

Notice that we also push a Loop struct onto the compiler->loops array (Loop *loop = &compiler->loops[compiler->loopCnt++];). That is because loops can be nested. When a user writes a break or continue statement deep inside three nested loops, the compiler needs to know exactly which loop it is interacting with. By maintaining a stack of loops, the compiler always knows that the current loop is the one at the top of the stack.

Condition and Exit

Next, we compile the expression inside the parentheses and set up the escape route:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
consume(compiler, TOKEN_LEFT_PAREN, "Expect '(' after 'while'.");
ExprDesc condition;
expression(compiler, &condition);
consume(compiler, TOKEN_RIGHT_PAREN, "Expect ')' after condition.");
expr2NextReg(compiler, &condition);

int condReg = condition.data.loc.index;

int exitJmp = -1;

int codeCnt = compiler->func->chunk.count;
// Peephole optimization
if(codeCnt >= 3){
    Instruction instFalse = compiler->func->chunk.code[codeCnt - 1];   // loadbool false
    Instruction instTrue = compiler->func->chunk.code[codeCnt - 2];    // loadbool true
    Instruction instCmp = compiler->func->chunk.code[codeCnt - 3];     // jump if false

    if(GET_OPCODE(instFalse) == OP_LOADBOOL && GET_OPCODE(instTrue) == OP_LOADBOOL){
        OpCode cmpOp = GET_OPCODE(instCmp);
        if(cmpOp == OP_LT || cmpOp == OP_LE || cmpOp == OP_EQ){
            compiler->func->chunk.count -= 2;
            freeRegs(compiler, 1);  // free targetReg

            // expect False
            int argB = GET_ARG_B(instCmp);
            int argC = GET_ARG_C(instCmp);
            compiler->func->chunk.code[codeCnt - 3] = CREATE_ABC(cmpOp, 0, argB, argC);

            exitJmp = emitJmp(compiler);
        }
    }
}


if(exitJmp == -1){
    exitJmp = emitJmpIfFalse(compiler, condReg);
}

freeExpr(compiler, &condition);

But there is a catch: we don’t know where to jump yet. Because we haven’t compiled the loop body, we have no idea how many instructions it will take up. We cannot provide the destination offset to the jump instruction. We need to use backpatching again.

Once the body is compiled, we emit the unconditional backward jump to force the VM to repeat the process:

1
2
3
4
stmt(compiler);

emitLoop(compiler, loopStart);
patchJump(compiler, exitJmp);

emitLoop emits a standard OP_JMP, but it calculates a negative offset. It substract the loopStart address from the current ip to figure out exactly how far back the VM needs to jump.

1
2
3
4
5
6
7
static void emitLoop(Compiler* compiler, int loopStart){
    int offset = loopStart - compiler->func->chunk.count - 1;
    if(offset < -OFFSET_sBx){
        errorAt(compiler, &compiler->parser.pre, "Loop body too large.");
    }
    emitAsBx(compiler, OP_JMP, 0, offset);
}

Finally, we call patchJump(compiler, exitJmp). Now that the loop body and the backward jump have been written, we finally know the offset. We go back to dummy OP_JMP_IF_FALSE instruction we emitted earlier and patch it with the correct forward offset.

Compiling for

From a high-level perspective, a for loop is just syntactic sugar over a while loop. You could easily rewrite a for loop as a while loop. However, from the perpective of the single-pass compiler, the for loop introduces the mismatch between parsing order and execution order.

When me meet a for statement, we parse the clauses in a left-to-right order:

  • Initializer

  • Condition

  • Increment

  • Body

But during runtime, the VM needs to execute them in this order:

  • Initializer

  • Condition

  • Body

  • Increment

  • Back to Condition

Because we don’t build an AST, we cannot reorder th code in memory. We must emit the instructions exactly as we read it. This means the bytecode for the Increment must sit before the instructions for the Body in the Chunk.

To make the execution jump back and forth across these blocks, we have to weave an intricate web of jumps. Let’s break down how the compiler/compiler.c handles this.

The Scope Shield

Before we even look at the loop mechanics, we must address the initializer.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
static void forStmt(Compiler* compiler){
    bool isForeach = false;
    int iterReg = -1;
    int stateReg = -1;

    beginScope(compiler);
    consume(compiler, TOKEN_LEFT_PAREN, "Expect '(' after 'for'.");

    if(match(compiler, TOKEN_SEMICOLON)){
        // No initializer
    }else if(match(compiler, TOKEN_VAR)){
        // parse variable

    }

    // ...

}

Unlike a while loop, a for loop allows you to declare a variable like var i that belongs only to the loop. By calling beginScope, we create an invisible shield around the entire for statement. When the loop finishes, the endScope is called at the very end of the function. The variable is wiped then.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
static void beginScope(Compiler* compiler){
    compiler->scopeDepth++;
}


static void endScope(Compiler* compiler){
    compiler->scopeDepth--;
    while(compiler->localCnt > 0 && 
        compiler->locals[compiler->localCnt-1].depth > compiler->scopeDepth){
            int closedReg = compiler->locals[compiler->localCnt - 1].reg;
            emitABC(compiler, OP_CLOSE_UPVAL, closedReg, 0, 0);
            compiler->localCnt--;
    }

    if(compiler->localCnt > 0){
        compiler->freeReg = compiler->locals[compiler->localCnt - 1].reg + 1;
    }else{
        compiler->freeReg = 1;
    }
}

Condition and Exit

Next, we record the loopStart index and compile the condition:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
    // ...
    int loopStart = compiler->func->chunk.count;
    if(compiler->loopCnt == LOOP_MAX){
        errorAt(compiler, &compiler->parser.pre, "Too many nested loops.");
        return;
    }

    Loop *loop = &compiler->loops[compiler->loopCnt++];
    loop->start = loopStart;
    loop->scopeDepth = compiler->scopeDepth;
    loop->breakCnt = 0;

    int exitJmp = -1;
    // ...
    if(!match(compiler, TOKEN_SEMICOLON)){
        ExprDesc condition;
        expression(compiler, &condition);
        consume(compiler, TOKEN_SEMICOLON, "Expect ';' after loop condition.");

        expr2NextReg(compiler, &condition);
        exitJmp = emitJmpIfFalse(compiler, condition.data.loc.index);
        freeExpr(compiler, &condition);
    }

Just like the while loop, we emit an OP_JMP_IF_FALSE (stored in exitJmp). If the condition fails, we will teleport completely out of the loop.

Increment and Body Jumps

We must compile the increment clause, but we must not execute it yet:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
int bodyJmp = -1;
    int incStart = -1;

    // if(!isForeach){
        bodyJmp = emitJmp(compiler);
        incStart = compiler->func->chunk.count;
        loop->start = incStart;

        if(!match(compiler, TOKEN_RIGHT_PAREN)){
            ExprDesc incExpr;
            expression(compiler, &incExpr);
            freeExpr(compiler, &incExpr);
            consume(compiler, TOKEN_RIGHT_PAREN, "Expect ')' after loop increment.");
        }

        int loopJmpIndex = emitJmp(compiler);
        int loopOffset = loopStart - loopJmpIndex - 1;
        compiler->func->chunk.code[loopJmpIndex] = CREATE_AsBx(OP_JMP, 0, loopOffset);
        patchJump(compiler, bodyJmp);
    // }

Before compiling the increment, we emit an unconditional OP_JMP. This tells the VM to skip over the increment: bodyJmp = emitJmp(compiler);. We use incStart to record the start of the increment. Notice that we update loop->start = incStart;. This is crucial. If a user writes continue inside the loop body, they don’t want to jump to the Condition, they want to jump to the Increment. We update the loop tracker so it knows where to send continue requests. After the increment executes, the loop needs to check the condition again (int loopJmpIndex = emitJmp(compiler);). Finally, we patch the body jump using patchJump(compiler, bodyJmp);. We tell that first jump we made to land exactly here, right before the actual body begins.

The Body and Loop

Now we just need to compile the body and tie off the ends.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
    stmt(compiler);

    // if(isForeach){
    //     emitLoop(compiler, loopStart);
    // }else{
        int incJmp = emitJmp(compiler);
        int incOffset = incStart - incJmp - 1;
        compiler->func->chunk.code[incJmp] = CREATE_AsBx(OP_JMP, 0, incOffset);
    // }

    
    if(exitJmp != -1){
        patchJump(compiler, exitJmp);
    }

Once the body (stmt(compiler)) finishes, we emit one final backward jump (incJmp) back to the incStart, the beginning of the increment clause. Finally, we patch exitJmp to land at the very end of the loop structure, allowing the condition to securely break the VM out of the loop.

A Quick on foreach

You may notice the commented isForeach checks in the code. If the compiler detects for (var x : list), it skips all this jumping madness. Instead, it relies on a specialized, highly efficient instruction: OP_FOREACH. OP_FOREACH manages the iterator state internally and jumps automatically when the collection is exhausted. We may discuss these inplementations after building the full collection module.

Escaping the Loop

High-level languages provide break (to instantly exit the loop) and continue (to skip the rest of the body and jump to the next iteration). To the virtual machine, these aren’t special keywords, they are just raw OP_JMP instructions. However, compiling them introduces a logistical problem: Context Awareness.

You might recall from the previous sections that whenever we start compiling a loop, we push a Loop struct onto an array called compiler->loops.

This tracks the loop. Whenever the parser encounters a break or continue, it first checks the loop count (if(compiler->loopCnt == 0)). If there are no loops on the loop stack, it throws a syntax error, of cource, you cannot escape a loop you were never in. If it is inside a loop, it grab the top-most loop on the stack (compiler->loops[compiler->loopCnt - 1]) to determine its target.

continue

The continue statement is the simpler of the two. Its job is to abort the current iteration and jump backward to the start of the next one.

1
2
3
4
5
6
7
8
9
10
11
static void continueStmt(Compiler* compiler){
    if(compiler->loopCnt == 0){
        errorAt(compiler, &compiler->parser.pre, "Cannot use 'continue' outside of a loop.");
        return;
    }
    consume(compiler, TOKEN_SEMICOLON, "Expect ';' after 'continue'.");

    Loop *loop = &compiler->loops[compiler->loopCnt - 1];
    closeUpvalues(compiler, loop->scopeDepth);
    emitLoop(compiler, loop->start);
}

In the for loop, we explicitly updated loop->start = incStart; right before compiling the clause. When the compiler processes a continue, it emits a backward jump (emitLoop) pointing directly at loop->start. For a while loop, this jumps to the condition evaluation. For a for loop, it jumps to the increment calculation. The VM doesn’t need to know the difference, it just jumps to the instruction offset we tell it to.

break

While continue jumps backward to a known point, break jumps forward to an unknown location out of the loop.

Because we have not finished compiling the loop body yet, we do not know the offset for the instruction follows the loop. Once again, we need to rely on the backpatching:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
static void breakStmt(Compiler* compiler){
    if(compiler->loopCnt == 0){
        errorAt(compiler, &compiler->parser.pre, "Cannot use 'break' outside of a loop.");
        return;
    }
    consume(compiler, TOKEN_SEMICOLON, "Expect ';' after 'break'.");

    Loop *loop = &compiler->loops[compiler->loopCnt - 1];
    closeUpvalues(compiler, loop->scopeDepth);
    if(loop->breakCnt == LOOP_MAX){
        errorAt(compiler, &compiler->parser.pre, "Too many breaks in one loop.");
        return;
    }
    loop->breakJump[loop->breakCnt++] = emitJmp(compiler);
}

A single loop might have dozens of break statements, so the compiler collects all of them. Once the compiler finally finishes compiling the entire loop body (e.g., at the end of the forStmt function), it iterates through this array and patches every single one of those pending jumps to land exactly at the current offset in forStmt():

1
2
3
4
5
// From forStmt()

for(int i = 0; i < loop->breakCnt; i++){
    patchJump(compiler, loop->breakJump[i]);
}

Leaks via Escape Hatches

There’s one line of code exists both in breakStmt and continueStmt that we have not talked about. We haven’t covered functions and closures yet, but jumping out of a block can cause memory leaks if a local variable in that block was captured by a function.

Imagine the code:

1
2
3
4
5
for (var i = 0; i < 5; i = i + 1) {
    var a = "string";
    var myClosure = func() { print a; };
    if (i == 2) break; 
}

When we define myClosure, it captures the local variable a. In a VM, local variables live on the stack. But if we call break, the execution jumps out of the loop, and the scope ends. The local variable a should be theoretically destroyed and popped off the stack.

If we simply jump out of the loop using emitJmp(), myClosure would be left pointing to a dead spot in memory. If it tries to read a later, the VM crashes. We will explore this extensively in the Closures chapter. For now, just know it’s the VM cleaning up before jumping.

Compiling switch

A switch statement is a central routing switchboard.

At the first glance, a switch statement seem like syntactic sugar for multiple if statements. While this is functionally true, compiling a switch is mostly different which introduce a massive web of jumps.

There’s also something special about how this language implements switch. In older languages like C, switch statements suffer a pitfall of implicit fallthrough (if you forget to break, execution bleeds into the next case). Looking at the compiler/compiler.c, we can use a modern, pattern-matching style syntax (e.g., value => statement) that strictly isolates each case. There is no fallthrough here, once a case finishes, it jumps entirely out of the switch block.

The Anchor

Everything in a switch revolves around a single anchor value:

1
2
3
4
5
6
7
8
9
10
static void switchStmt(Compiler* compiler){
    consume(compiler, TOKEN_LEFT_PAREN, "Expect '(' after 'switch'.");
    ExprDesc condExpr;
    expression(compiler, &condExpr);
    consume(compiler, TOKEN_RIGHT_PAREN, "Expect ')' after switch condition.");
    
    expr2NextReg(compiler, &condExpr);
    int condReg = condExpr.data.loc.index;
    // ...
}

We evaluate the condition inside the switch(...) parentheses and pin it to a specific register (condReg). Every single case will be compared against this register.

The Cases

This language allows you to match multiple values for a single body by seperating them with commas (e.g., 1, 2, 3 => print "low";). This syntax requires setting up multiple jumps that all point to the same destination.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
int bodyJmps[CASE_MAX];
int bodyJmpCnt = 0;
do{
    ExprDesc caseExpr;
    expression(compiler, &caseExpr);

    expr2NextReg(compiler, &caseExpr);
    emitABC(compiler, OP_EQ, 1, condReg, caseExpr.data.loc.index);
    freeExpr(compiler, &caseExpr);

    bodyJmps[bodyJmpCnt++] = emitJmp(compiler);

    if(bodyJmpCnt == CASE_MAX){
        errorAt(compiler, &compiler->parser.pre, "Too many cases in one switch.");
        return;
    }
}while(match(compiler, TOKEN_COMMA));

nextCaseJmp = emitJmp(compiler);

We emit an OP_EQ to compare it to our anchor condReg. Note the argument 1 passed to OP_EQ, which tells the VM “skip the next instruction if the comparison is false”. We emit an OP_JMP to teleport to the case body, saving its index in bodyJmps.

If none of the commas-separated expressions matched, execution falls through to nextCaseJmp (an unconditional jump that skips the body and lands at the start of the next case block).

Body and Exit

Once the => token is consumed, it is time to compile the actual statement that runs if the case matches.

First, we patch all those bodyJmps we collected earlier so they land exactly here.

1
2
3
4
5
6
7
8
consume(compiler, TOKEN_FAT_ARROW, "Expect '=>' after case expressions.");
for(int i = 0; i < bodyJmpCnt; i++){
    patchJump(compiler, bodyJmps[i]);
}

stmt(compiler);

endJmps[endJmpCnt++] = emitJmp(compiler);

After the body is compiled by stmt(compiler), the case is complete. Because we do not want to support implicit fallthrough, we must guarantee that the execution does not accidentally run the next case’s code.

We maintain an endJmps array. This array acts just like the breakJmp array in the previous code, it collects all the escape routes from every case.

When the entire switch statement finishes compiling (after the closing } brace), the compiler loops through the endJmps array and backpatches every single case to land at the very end of the construct:

1
2
3
for(int i = 0; i < endJmpCnt; i++){
    patchJump(compiler, endJmps[i]);
}

Wrapping Up

At its lowest architectural level, the VM is an unthinking engine marching forward through a flat array of 32-bit instructions. Every complex control flow structure we take for granted in high-level programming is ultimately reduced to just a few primitive jumping commands: OP_JMP, OP_JMP_IF_FALSE, and OP_JMP_IF_TRUE.

Now we have successfully given this language the ability to make decisions, repeat tasks, and route execution. It is now Turing Complete.

This post is licensed under CC BY 4.0 by the author.