Post

Building a Register VM Interpreter - Part 7: Expression Descriptor and Register

This guide is about building a register-based VM interpreter from scratch. We need a mechanism to delay emitting bytecode until we know exactly where the data lives. We need a way to describe an expression's state while it is still in transit.

Building a Register VM Interpreter - Part 7: Expression Descriptor and Register

In the previous part, we have successfully build a Vaughan Pratt’s parsing algorithm. We now have a single-pass parser, transforming a linear stream of tokens into structured, context-aware operations, without the use of AST. But where does the data actually go?

If we were building a traditional stack-based virtual machine (like the JVM or Python’s standard CPython interpreter), the answer would be simple. A stack machine operates on implicit locations, specifically, the top of the stack.

To compiler a a + b in a stack-based VM, the compiler simply emits instructions to push a to the stack, push b to the stack, and then call an add operation. The stack abstracts all of that away.

For a register-based VM, however, we do not have a push-pop mechanism for operations. Instead, the instructions expect targets. As defined in the instruction set, a bytecode is usually shaped to require a target, a left operand, and a right operand.

This architectural choise completely changes the responsibility of the compiler. It can no longer emit OpCodes bindly. When the parser needs to add two things together, the compiler must be able to figure out following questions:

  • Which specific register slot currently hold the left operand.

  • Which specific register slot currently hold the right operand.

  • Which available register slot can hold the result safely without breaking existing data.

In a register VM, the compiler acts as a traffic controller. It has to manage a sliding window of virtual registers, which, at runtime, are simply indexed slots in the current function’s memory frame. It must explicitly allocate temporary registers to hold intermediate calculations (like the result of b * c before it can be added to a), keep track of the register currently in use, and free those temporary registers the exact moment their data is consumed.

To achieve this explicit routing ina single-pass compiler, we need a mechanism to delay emitting bytecode until we know exactly where the data lives. We need a way to describe an expression’s state while it is still in transit.

The Expression Descriptor

Because we are building a single-pass compiler, we do not build an AST. We parse the tokens and emit bytecode immediately. But as we established, in a register machine, we cannot emit a bytecode instruction until we know exactly which registers are involved.

We need a proxy that represents an expression in use. It acts as a holding place, carrying the metadata about a parsed expression until the compiler is finally forced to put it into an actual virtual register. We call this proxy the Expression Descriptor (ExprDesc).

Expr State

Before holding onto an expression’s data, we must define what kind of data it is. In compiler/compiler.h, this is defined by the ExprType enumeration:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
typedef enum{
    EXPR_VOID,
    EXPR_NULL,
    EXPR_TRUE,
    EXPR_FALSE,
    EXPR_K,
    EXPR_NUM,
    EXPR_LOCAL,
    EXPR_UPVAL,
    EXPR_GLOBAL,
    EXPR_INDEX,
    EXPR_PROP,
    EXPR_JMP,
    EXPR_TBD,
    EXPR_REG,
    EXPR_CALL,
}ExprType;

When the parser encounters a token, it initializes a descriptor with one of these states:

  • EXPR_NULL, EXPR_TRUE, EXPR_FALSE. These inherent expressions carry no payload. Their type is their value.

  • EXPR_K, EXPR_NUM. These represent static data. EXPR_K means the value lives in the Chunk’s constant pool. EXPR_NUM means it is a raw, numeric literal.

  • EXPR_LOCAL, EXPR_GLOBAL, EXPR_UPVAL. These indicate the value already exists in memory. They are variables.

  • EXPR_INDEX, EXPR_PROP. These represent some complex lookups, which are usually two-step resolutions like array indexing or property access.

  • The ultimate goal EXPR_REG. This means the expression has been fully evaluated and its result is safely sitting in a specific virtual register, ready to go.

For example, if you write a = b, and b is EXPR_LOCAL (already in a register), it just emits a single move instruction to copy the register.

ExprDesc

With the states defined, let’s look at the data structure that actually carries this information through the compiler:

1
2
3
4
5
6
7
8
9
10
11
12
typedef struct{
    ExprType type;
    union{
        struct{
            int index;
            int aux;
        }loc;
        double num;
    }data;
    int tJmp;
    int fJmp;
}ExprDesc;

Keeping this structure small and cache-friendly is crucial. To achieve this, the core payload is wrapped in a C union. Because an expression can only be one thing at a time, it either holds a raw number (double num) or it holds location metadata (struct loc). They safely shares the same memory space.

The location (loc) struct is the core of the descriptor. It contains two integers: index and aux (auxiliary). For a local variable, we only need one piece of information: which register slot it lives in. We store the index and ignore the aux. Consider property access like user.age. To compiler this, the VM needs two things: the register holding the user instance, and the constant pool has the string age indexed. The descriptor elegantly captured this by store the register in index and the property string in aux.

Finally the ExprDesc includes tJmp (true jump) and fJmp (false jump). These are specialized fields reserved for compiling control flow and short-circuiting boolean logic. These act as linked-list heads to track unresolved bytecode offsets. They are essential for compiling short-circuiting boolean logic. For example, in an expression like a and b, if a evaluates to false, the VM must immediately jump over b. These fields store that incomplete jump destination so the compiler can patch it once it figures out where the end of the expression actually is.

Deferred Execution

The underlaying principle of ExprDesc is deferred execution.

When the Pratt parser calls expression(&expr), it passes a pointer to an empty ExprDesc. As the parser consumes tokens, it doesn’t immediately spit out OP_LOADK or OP_ADD. Instead, it mutates the ExprDesc, filling in the type, the index, and the aux. It’s only when the expression encounters an operation that requires a register, such as being the left-hand side of the binary +, or being passed as an argument, that the compiler looks at the ExprDesc and use it.The compiler then calls a function like expr2Reg, which finally descharges the descriptor, emits the bytecodes, and upgrade the type to EXPR_REG.

Registers and Allocation Lifecycle

Unlike a physical CPU where registers are fixed hardware circuits (RAX, RSP, R0 …), the registers here are completely virtual. At runtime, a virtual register is simply an index into an array of Values allocated on the VM’s stack. When a function is called, it gets a window of the stack called a Call Frame. Register 0 is the first slot in this window, Register 1 is the second, and so on. The compiler cannot just assign a new register for every single operation. It must aggressively reuse them. It needs a lifecycle: allocate, use, and immediately free.

Register Window

To manage this lifecycle, our Compiler struct maintains a few critical fields:

1
2
3
4
5
6
7
8
9
typedef struct Compiler{
    // ...
    Local locals[LOCAL_MAX];
    int localCnt;
    // ...
    int freeReg;
    int maxRegSlots;
    // ...
}Compiler;

freeReg is the core allocator. It points to the next available register index in the current function’s stack frame. maxRegSlots tracks the maximum number of the register usage for the function. Temporary registers are constantly allocated and freed, freeReg bounces up and down. maxRegSlots remembers the absolute maximum number of registers needed at any one time, so the VM knows exactly how much stack space to reserve when the function is called.

When the compiler needs a place to put temporary data, it asks for the next available slot and then reserves it.

In compiler/compiler.c

1
2
3
static int getFreeReg(Compiler* compiler){
    return compiler->freeReg;
}

and reserve the register:

1
2
3
4
5
6
7
8
9
10
11
static void reserveReg(Compiler* compiler, int cnt){
    compiler->freeReg += cnt;

    if(compiler->freeReg > compiler->maxRegSlots){
        compiler->maxRegSlots = compiler->freeReg;
    }

    if(compiler->freeReg > REG_MAX){
        errorAt(compiler, &compiler->parser.pre, "Register overflow.");
    }
}

Notice that retrieving the free register and reserving it are decoupled. Often, the compiler needs to know where data is going to go (to patch an instruction or set up an ExprDesc) before it actually commits to allocating the space.

Local Anchor

Temporary registers hold the intermediate math, like the result of an addition. As soon as that result is used, the temporary register is useless. It should be freed so the next operation can reuse the slot.

So we need a freeRegs function:

1
2
3
4
5
6
7
8
9
static void freeRegs(Compiler* compiler, int cnt){
    compiler->freeReg -= cnt;
    int minReg = compiler->localCnt > 0 \
        ? compiler->locals[compiler->localCnt - 1].reg + 1 \
        : 1;
    if(compiler->freeReg < minReg){
        compiler->freeReg = minReg;
    }
}

In this compiler, a local variable are registers that never freed until they go out of the scope. If you declare var x = 5; and var y = 10;, x might live in Register 1 and y in Register 2. If we then compute x + y, the compiler allocates Register 3 for the intermediate result. When the compiler frees Register 3, it simply decrements freeReg. The minReg acts as a boundary. The compiler looks at the currently active local variables (compiler->locals[compiler->localCnt - 1]). It finds the register assigned to the most recently declared variable, and sets a floor. Temporary math can bounce freeReg up and down all day, but it can never drop below the anchor set by the local variables.

Discharge the Descriptor

Now we have the ExprDesc holding, and the allocator managing the physical slots. The moment of truth happens in expr2Reg. This is the function that forces an expression to finally land in a concrete register, emitting the necessary bytecode to make it happen.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
static void expr2Reg(Compiler* compiler, ExprDesc* expr, int reg){
    unplugExpr(compiler, expr);
    switch(expr->type){
        case EXPR_VOID:
            errorAt(compiler, &compiler->parser.pre, "Expression has no value.");
            break;
        case EXPR_NULL:
            emitABC(compiler, OP_LOADNULL, reg, 0, 0);
            break;
        case EXPR_TRUE:
            emitABC(compiler, OP_LOADBOOL, reg, 1, 0);
            break;
        case EXPR_FALSE:
            emitABC(compiler, OP_LOADBOOL, reg, 0, 0);
            break;
        case EXPR_K:
            emitABx(compiler, OP_LOADK, reg, expr->data.loc.index);
            break;
        case EXPR_NUM:
            // optimize later
            emitABx(compiler, OP_LOADK, reg, expr->data.loc.index);
            break;
        case EXPR_TBD:{
            Instruction instruction = compiler->func->chunk.code[expr->data.loc.index];
            instruction = 
                (instruction & ~((Instruction)MASK_A << POS_A)) 
                | ((Instruction)(reg & MASK_A) << POS_A);
            compiler->func->chunk.code[expr->data.loc.index] = instruction;
            break;
        }
        case EXPR_LOCAL:
        case EXPR_REG:{
            if(reg != expr->data.loc.index){
                emitABC(compiler, OP_MOVE, reg, expr->data.loc.index, 0);
            }
            break;
        }
        default:
            // Should not reach here
            break;
    }
    expr->type = EXPR_REG;
    expr->data.loc.index = reg;
}

Suppose the Pratt parser encounters the literal true. It creates an ExprDesc of type EXPR_TRUE and does nothing else.

Later, the compiler realizes it needs this boolean in a register. It allocates a register and calls exprReg(&expr, 4).

The switch statement inspects the descriptor, sees EXPR_TRUE, and emits the OP_LOADBOOL instruction targeting Register 4. The last two lines of code overwrites epxr->type to EXPR_REG and saves reg (4) as the index. If we later call expr2Reg(&expr, 5) on that same descriptor, the switch statement falls into the EXPR_REG case. Since the data is already in Register 4, it just emits an OP_MOVE instruction to copy it to Register 5.

If you look at the very first line of expr2Reg (and expr2NextReg), you will notice a call to unplugExpr(compiler, expr);

In this architecture, “unplugging” is the process of converting a multi-step environment lookup into a concrete, usable value in the execution frame.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
static void unplugExpr(Compiler* compiler, ExprDesc* expr){
    switch(expr->type){
        case EXPR_VOID:
            break;
        /*
        case EXPR_UPVAL:
            expr->data.loc.index = emitABC(
                compiler, 
                OP_GET_UPVAL, 
                0, 
                expr->data.loc.index, 
                0
            );
            expr->type = EXPR_TBD;
            break;
        case EXPR_GLOBAL:
            expr->data.loc.index = emitABx(
                compiler, 
                OP_GET_GLOBAL, 
                0, 
                expr->data.loc.index
            );
            expr->type = EXPR_TBD;
            break;
        case EXPR_INDEX:
        {
            int destReg = getFreeReg(compiler);
            reserveReg(compiler, 1);

            emitABC(
                compiler, 
                OP_GET_INDEX, 
                destReg, 
                expr->data.loc.index, 
                expr->data.loc.aux
            );
            expr->type = EXPR_REG;
            expr->data.loc.index = destReg;
            break;
        }
        case EXPR_PROP:
        {
            int destReg = getFreeReg(compiler);
            reserveReg(compiler, 1);

            emitABC(
                compiler, 
                OP_GET_PROPERTY, 
                destReg, 
                expr->data.loc.index, 
                expr->data.loc.aux
            );
            expr->type = EXPR_REG;
            expr->data.loc.index = destReg;
            break;
        }
        case EXPR_CALL:
            expr->type = EXPR_TBD;
            break;
        default:
            break;
        */
    }
}

unplugExpr acts as the bridge. If you pass an EXPR_GLOBAL descriptor to unplugExpr, it emits the actual OP_GET_GLOBAL instruction to fetch the value from the global hash table, and then downgrades the descriptor’s state from EXPR_GLOBAL to EXPR_TBD (To Be Determined). It “unplugs” the variable from the global environment into the VM’s active execution pipeline.

To streamline the common pattern of evaluating the expression into a brand new temporary register, we can define a helper function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
static void expr2NextReg(Compiler* compiler, ExprDesc* expr){
    unplugExpr(compiler, expr);
    freeExpr(compiler, expr);
    reserveReg(compiler, 1);
    expr2Reg(compiler, expr, compiler->freeReg - 1);
}

static void freeExpr(Compiler* compiler, ExprDesc* expr){
    if(expr->type == EXPR_TBD){
        int minReg = compiler->localCnt > 0 ? compiler->locals[compiler->localCnt - 1].reg + 1 : 1;
        if(expr->data.loc.index >= minReg){
            if(expr->data.loc.index < compiler->freeReg){
                freeRegs(compiler, 1);
            }
        }
    }
}

This helper is the bread and butter of our compiler backend. It takes an expression, grabs the next available register off the top of the stack frame, and discharges the expression directly into it.

unplugExpr is called at the very beginning of expr2NextReg, even though it gets called again inside expr2Reg. For simple values like EXPR_TRUE or EXPR_LOCAL, unplugExpr does nothing. But for complex values like EXPR_PROP (which will be discussed later), unplugExpr must allocate a new temporary register and emit an instruction to fetch the property into it. If an unplug function is not called first in expr2NextReg, it emits wasteful instructions.

Emitting Register

We now have all the individual pieces on the board. We have Vaughan Pratt’s parser to understand the precedence of tokens, the ExprDesc to hold state in flight, and the virtual register allocator to manage memory slots.

Let’s put it together by tracing step-by-step through compiling a simple expression: x + 1.

When the parser encounters x, it resolves it as a variable. Let’s assume x is a local variable currently residing in Register 1. The compiler creates an ExprDesc for the left side of the operation: EXPR_LOCAL with an index of 1.

Next, the parser encounters the + token. According to our Pratt parsing rules, this triggers handleBinary.

The handleBinary will parse the right side of the expression and consumes the 1. It returns a second ExprDesc, this time, an EXPR_K pointing to 1 in the constant pool.

At this exact moment, no bytecode has been emitted for the addition. We just have two descriptors floating in C memory:

  • EXPR_LOCAL (Register 1)

  • EXPR_K (Constant Index for 1)

The handleBinary then calls emitBinaryOp to finally combine them.

The CPU’s Arithmetic Logic Unit (ALU) cannot add a register to a constant pool index directly, it requires both operands to be loaded into the registers. Therefore, emitBInaryOp must first force both the two descriptors to materialize:

1
2
3
4
5
6
7
8
9
10
11
static void emitBinaryOp(Compiler* compiler, OpCode op, ExprDesc* left, ExprDesc* right){
    expr2NextReg(compiler, left);
    expr2NextReg(compiler, right);

    freeExpr(compiler, right);
    freeExpr(compiler, left);

    int instructionIndex = emitABC(compiler, op, 0, left->data.loc.index, right->data.loc.index);
    left->type = EXPR_TBD;
    left->data.loc.index = instructionIndex;
}

Let’s take a look at the first two lines of code in this function:

  • expr2NextReg(&left): The left descriptor is already EXPR_LOCAL (Register 1). Since it’s already in a register, expr2Reg does virtually nothing. It simply updates the descriptor type to EXPR_REG and leaves it in Register 1.

  • expr2NextReg(&right): The right descriptor is EXPR_K. To do math, this constant needs to be loaded. The compiler asks for the next available slot and emits OP_LOADK 2, <index>. The right descriptor is now updated to EXPR_REG living in Register 2.

Before we emit the actual addition code, we must manage the memory (free):

1
2
freeExpr(compiler, right);
freeExpr(compiler, left);

The compiler looks at the right expression (Register 2). Because Register 2 is a temporary slot, freeExpr immediately releases it. It then looks at the left expression (Register 1). Because Register 1 belongs to the local variable x, the minReg safety boundary prevents it from being freed.

By freeing the temporary right register before emitting the final result, the compiler guarantees that the destination of x + 1 can perfectly overwrite and reuse Register 2.

Delayed Targeting

1
2
3
4
5
6
7
8
9
int instructionIndex = emitABC(
                        compiler, 
                        op, 
                        0, 
                        left->data.loc.index, 
                        right->data.loc.index
                    );
left->type = EXPR_TBD;
left->data.loc.index = instructionIndex;

Look closely at the emitABC call. The format of this instruction is OP_ADD dest, left, right. But the compiler passes 0 as the destination register. It emits OP_ADD 0, 1, 2.

Because, at this moment, the compiler does not know where the result of x + 1 is supposed to go. If the compiler arbitrarily chose a temporary register right now, and the user wrote y = x + 1, it would have to emit an additional, wasteful OP_MOVE y_reg, tmp_reg instruction later.

Instead, the compiler has a EXPR_TBD state. It writes the incomplete instruction to the chunk, saves the exact index of the instruction.

Later, when the outer parser finally resolves the assignment (e.g., y = x + 1) and calls expr2Reg(&expr, y_reg), the switch statement in expr2Reg intercepts the EXPR_TBD state:

1
2
3
4
5
6
7
8
9
10
    // ...
case EXPR_TBD:{
    Instruction instruction = compiler->func->chunk.code[expr->data.loc.index];
    instruction = 
        (instruction & ~((Instruction)MASK_A << POS_A)) 
        | ((Instruction)(reg & MASK_A) << POS_A);
    compiler->func->chunk.code[expr->data.loc.index] = instruction;
    break;
}
    // ...

It reaches back into the compiled bytecode array, get the incomplete instruction, and then patches the destination operand with y_reg, and saves it back finally.

Wrapping Up

Now, the compiler is a fully functioning, meticulous traffic controller.

By designing the Expression Descriptor (ExprDesc), we solve the fundamental problem of a single-pass register compiler: the need of delayed execution. Instead of immediately emitting naive push/pop instructions like a stack machine, or building a massive AST in memory, we capture the promise of a value in a tiny struct. We juggle these descriptors in the call stack, carrying them through the parsing functions until the exact moment an operation demands a physical register.

By designing a Virtual Register Allocator, we use a sliding window of memory for the functions.

By using tricks like EXPR_TBD state, we allow the compiler to emit incomplete instructions and patch them later. We don’t waste time moving data around pointlessly; we calculate the destination, reach back into the bytecode array, and wire the math directly to where it needs to go.

This post is licensed under CC BY 4.0 by the author.