Building a Register VM Interpreter - Part 8: The VM Core
This guide is about building a register-based VM interpreter from scratch. We will design the memory structures required to run a program, build the dispatch loop to fetch instructions, and write the logic to finally execute the math and logic operations we worked so hard to compile.
Over the past few articles, we built the front-end and middle-end of the language: a scanner to tokenize source code, a Pratt parser to handle precedence without an AST, and a meticulous register allocator to route data.
At the end of this pipeline, the compiler’s job is complete. Its output is simple: a Chunk. But a Chunk is just static data in memory. It cannot do anything on its own. To bring this data to life, we need a runtime environment. We need a Virtual Machine (VM). In the architecture of our language, the VM acts as a software-simulated CPU. Just like a physical processor (like an x86 or ARM chip) fetches machine code from RAM, decodes the bits, and executes the operations on its hardware registers, our Virtual Machine will fetch our 32-bit instructions from the Chunk, decode the operands, and execute them on our virtual registers.
The VM State
In a compiled language like C, the CPU maintains its execution state using physical hardware registers and an implicit call stack managed by the operating system. Because we are building a VM, we have no physical hardware to lean on; we must simulate this entire environment in memory.
First, we need a central data structure to represent the entire running state of the VM.
We define the VM struct in vm/vm.h:
#define FRAMES_MAX 64
#define STACK_MAX (FRAMES_MAX * 256)
typedef struct VM{
Value stack[STACK_MAX]; // flat array of value slots ("registers")
Value* stackTop;        // next free slot; base for newly pushed CallFrames
/*
HashTable strings;
HashTable globals;
HashTable* curGlobal;
HashTable* globalStack[GLOBAL_STATCK_MAX];
int globalCnt;
HashTable modules;
Object* objects;
ObjectString* initString;
ObjectUpvalue* openUpvalues; // descending locations
*/
CallFrame frames[FRAMES_MAX];
int frameCount;
/*
size_t bytesAllocated;
size_t nextGC;
Compiler* compiler;
uint64_t hash_seed;
*/
}VM;
We pack everything into a struct VM instead of declaring a bunch of global variables. This buys us encapsulation and re-entrancy: by passing the VM* vm pointer through the runtime functions, each interpreter instance stays isolated.
The Register-Based Stack
We built a register-based architecture to avoid constant stack pushing and popping. Ironically, register-based VMs still use a stack.
The difference lies in how the stack is accessed. In a stack VM, instructions operate implicitly on the top of the stack. In a register machine, the stack is treated as one massive, flat array of memory, in our case, Value stack[STACK_MAX].
The “virtual registers” we meticulously tracked and allocated in the compiler are not isolated variables. They are integer indices into this stack array.
To manage this contiguous block of memory, we track it using the stackTop pointer.
Let’s look at how the VM initializes this memory in vm/vm.c:
void resetStack(VM* vm){
vm->stackTop = vm->stack;
}
void initVM(VM* vm, int argc, const char* argv[]){
resetStack(vm);
// ...
}
When the VM spins up, resetStack simply points stackTop at the very first element of the stack array (vm->stack[0]). It is worth noting that stackTop always points to the next available empty slot, not to the topmost occupied value.
Push, Pop and Peek
Even though our core math and logic operations (like OP_ADD) will execute by directly indexing into the registers, traditional stack operations are still vital for the VM’s infrastructure.
We define them in vm/vm.c:
void push(VM* vm, Value value){
if(vm->stackTop - vm->stack >= STACK_MAX){
runtimeError(vm, "Stack overflow");
exit(EXIT_FAILURE);
}
*vm->stackTop++ = value;
}
Value pop(VM* vm){
vm->stackTop--;
return *vm->stackTop;
}
Value peek(VM* vm, int distance){
if(vm->stackTop - vm->stack - 1 - distance < 0){
runtimeError(vm, "Stack underflow");
exit(EXIT_FAILURE);
}
return vm->stackTop[-1 - distance];
}
- push: We first do a boundary check. If the stack would exceed STACK_MAX, the VM fatally crashes with a Stack Overflow. Otherwise, we write the value to the slot stackTop is pointing to, then increment the pointer (++) so it points to the next empty slot.
- pop: Because stackTop points to an empty slot, we must decrement it first, then return the value at the newly decremented address.
- peek: This function allows the VM to look down the stack without consuming a value, which is often useful. A distance of 0 peeks at the top value, 1 looks at the value below that, and so on.
While variables and math operations live in indexed registers, the VM relies heavily on the stackTop pointer for orchestration. For example:
- When evaluating arguments for a function call, the caller temporarily pushes them onto the top of the stack to prepare the new execution environment.
- When the Garbage Collector (GC) runs, it scans everything from vm->stack up to vm->stackTop to know which values are currently "alive" and must not be freed.
By keeping stackTop accurate via push and pop, the VM always knows exactly where the boundary between “active memory” and “garbage memory” lies.
Call Frames
If a script was just one long sequence of operations running in a single global scope, a single stack array would be enough. But real programming languages have functions. Functions call other functions, and sometimes they even call themselves (recursion).
To solve this, the virtual machine creates a sliding window over the global stack every time a function is called. In compiler terminology, this window is known as a Call Frame.
Let’s look at vm/vm.h:
#define GLOBAL_STATCK_MAX 64
#define MAX_DEFERS 255
typedef struct CallFrame{
ObjectClosure* closure;
Instruction* ip;
Value* base;
/*
ObjectClosure* defers[MAX_DEFERS];
int deferCnt;
*/
}CallFrame;
This CallFrame is the main context wrapper in the VM. Whenever the VM executes bytecode, it is always looking at the current CallFrame. Let's break down the three most critical fields and why they exist.
- closure: A pointer to the function object (specifically a closure, which we will implement fully in the OOP/Functions article) currently being executed. The VM needs this because the closure holds the ObjectFunc, which in turn holds the Chunk containing the bytecode and the constant pool K.
- ip (Instruction Pointer): A direct C pointer to the bytecode instruction currently being executed inside the function's chunk.
- base: The magic that makes the register allocator actually work. It is a pointer to a specific starting slot in the global stack array.
Writing Registers to Memory
In the previous part, we taught the compiler to emit instructions like OP_ADD 0 1 2. At runtime, the VM doesn't resolve those register numbers against the absolute bottom of the global stack. It resolves them against the base pointer of the current CallFrame.
In vm/vm.c, we define a macro that helps us access the position:
#define R(n) (frame->base[(n)])
If main() has its base pointing to stack[0], then main()’s Register 1 is stack[1].
If main calls another function cal(), the VM pushes a new CallFrame and sets the new frame's base to an offset further along the array, past the registers main is using. From cal's point of view, its Register 0 is whatever slot R(0) resolves to relative to its own base.
The two functions are safely isolated, yet they are both operating on the exact same underlying C array.
The Frame Array
To manage the overlapping windows, we define a frame array in the VM state:
typedef struct VM{
// ...
CallFrame frames[FRAMES_MAX];
int frameCount;
// ...
}VM;
The frames array acts as the call stack, and frameCount tracks how deep we currently are.
When a script begins, the interpreter wraps the top-level code in an implicit <script> function, pushes the very first CallFrame into frames[0] (where base points to stack[0]), increments frameCount to 1, and begins reading instructions. If frameCount ever exceeds FRAMES_MAX (64 in this case), it means the script has recursed too deeply, and the VM will safely halt with a Stack Overflow error.
The Dispatch Loop
With the memory structured and the CallFrame actively framing the stack, we are ready to build the working engine inside the VM. This happens in the run() function.
run() has four jobs: fetch the next instruction, decode it, execute the logic, and repeat. The loop continues until it hits an OP_RETURN that empties the call stack, or until a fatal runtime error occurs.
If you have ever read a tutorial on writing a simple emulator or VM, you may see the dispatch loop implemented as a massive while(true) loop containing a giant switch statement:
while(true){
Instruction instruction = *frame->ip++;
switch(GET_OPCODE(instruction)){
case OP_ADD: /* do add */ break;
case OP_SUB: /* do sub */ break;
// ... more cases ...
}
}
This works perfectly, but it has a performance flaw. For a massive switch statement, the C compiler usually generates a jump table. Every time the loop restarts, the CPU has to bounds-check the switch value, look up the jump address in the table, and then jump.
Worse, modern CPUs rely heavily on branch prediction to stay fast. They use pipelines, much like an assembly line: multiple instructions are fetched, decoded, and executed in different stages at the same time. When the CPU hits a branch, it doesn't know which instruction to fetch next until the current one finishes executing. Instead of waiting (which causes a pipeline stall), the CPU guesses which way the code will go and speculatively executes down that path. If the guess is wrong (a misprediction), the CPU must flush the pipeline, discarding all the speculative work. On modern chips, a misprediction can waste 10-20 clock cycles.
In our dispatch loop, every single instruction funnels back through the exact same switch header. That gives the CPU's branch prediction unit (BPU) no clear pattern to learn: it cannot accurately guess which instruction is coming next, leading to frequent, expensive pipeline flushes.
There are many ways to attack this. An if statement can sometimes be replaced with bitwise operations; a switch statement can be implemented as a look-up table, so that instead of branching to a case, the CPU simply computes an address and jumps directly to it, which is easier for modern indirect branch predictors to handle. In this project, we use a GCC-specific compiler extension known as Labels as Values to implement a technique called Computed Gotos.
Computed Gotos
Instead of a switch statement, we define an array of pointers. These are not ordinary data pointers; they are addresses of C goto labels.
Let’s look at the setup at the top of run() in vm/vm.c:
static InterpreterStatus run(VM* vm){
CallFrame* frame = &vm->frames[vm->frameCount - 1];
Instruction instruction;
static void* dispatchTable[] = {
[OP_MOVE] = &&DO_OP_MOVE,
[OP_LOADK] = &&DO_OP_LOADK,
[OP_LOADNULL] = &&DO_OP_LOADNULL,
[OP_LOADBOOL] = &&DO_OP_LOADBOOL,
[OP_GET_GLOBAL] = &&DO_OP_GET_GLOBAL,
[OP_SET_GLOBAL] = &&DO_OP_SET_GLOBAL,
[OP_GET_UPVAL] = &&DO_OP_GET_UPVAL,
[OP_SET_UPVAL] = &&DO_OP_SET_UPVAL,
[OP_GET_INDEX] = &&DO_OP_GET_INDEX,
[OP_SET_INDEX] = &&DO_OP_SET_INDEX,
[OP_GET_PROPERTY] = &&DO_OP_GET_PROPERTY,
[OP_SET_PROPERTY] = &&DO_OP_SET_PROPERTY,
[OP_ADD] = &&DO_OP_ADD,
[OP_SUB] = &&DO_OP_SUB,
[OP_MUL] = &&DO_OP_MUL,
[OP_DIV] = &&DO_OP_DIV,
[OP_MOD] = &&DO_OP_MOD,
[OP_NEG] = &&DO_OP_NEG,
[OP_NOT] = &&DO_OP_NOT,
[OP_EQ] = &&DO_OP_EQ,
[OP_LT] = &&DO_OP_LT,
[OP_LE] = &&DO_OP_LE,
[OP_JMP] = &&DO_OP_JMP,
[OP_JMP_IF_FALSE] = &&DO_OP_JMP_IF_FALSE,
[OP_JMP_IF_TRUE] = &&DO_OP_JMP_IF_TRUE,
[OP_RETURN] = &&DO_OP_RETURN,
[OP_CLOSURE] = &&DO_OP_CLOSURE,
[OP_CLOSE_UPVAL] = &&DO_OP_CLOSE_UPVAL,
[OP_CLASS] = &&DO_OP_CLASS,
[OP_METHOD] = &&DO_OP_METHOD,
[OP_FIELD] = &&DO_OP_FIELD,
[OP_PRINT] = &&DO_OP_PRINT,
[OP_DEFER] = &&DO_OP_DEFER,
[OP_SYSTEM] = &&DO_OP_SYSTEM,
[OP_TO_STRING] = &&DO_OP_TO_STRING,
[OP_CALL] = &&DO_OP_CALL,
[OP_IMPORT] = &&DO_OP_IMPORT,
[OP_BUILD_LIST] = &&DO_OP_BUILD_LIST,
[OP_INIT_LIST] = &&DO_OP_INIT_LIST,
[OP_FILL_LIST] = &&DO_OP_FILL_LIST,
[OP_BUILD_MAP] = &&DO_OP_BUILD_MAP,
[OP_SLICE] = &&DO_OP_SLICE,
[OP_FOREACH] = &&DO_OP_FOREACH,
};
// ...
}
The double-ampersand (&&) is the GCC extension. It gets the memory address of a label defined further down in the function. We use C99 designated initializers (just like we did in our Pratt parser rule table) to map every OpCode enumeration directly to the memory address of the C code that handles it.
The DISPATCH() Macro
With the table of label addresses ready, we don’t need a while loop or a switch. We can literally command the CPU to go to the address associated with this OpCode.
We define the following macro to achieve this action:
#define DISPATCH() \
do { \
instruction = *frame->ip++; \
goto *dispatchTable[GET_OPCODE(instruction)]; \
} while (0)
What DISPATCH() does:
- It fetches the 32-bit Instruction at the current ip.
- It increments ip to point at the next instruction.
- It extracts the OpCode bits, looks up the corresponding label address in dispatchTable, and jumps directly to it using goto *dispatchTable[...].
In the codebase, we can wrap this in an #ifdef DEBUG_TRACE to inject disassembly printing when debugging is turned on, which is better for tracing execution.
#ifdef DEBUG_TRACE
#define DISPATCH() \
do { \
instruction = *frame->ip++; \
dasmInstruction(&frame->closure->func->chunk, \
(int)(frame->ip - frame->closure->func->chunk.code - 1)); \
goto *dispatchTable[GET_OPCODE(instruction)]; \
} while (0)
#else
#define DISPATCH() \
do { \
instruction = *frame->ip++; \
goto *dispatchTable[GET_OPCODE(instruction)]; \
} while (0)
#endif
The Threaded Handler
Because we are using goto, our execution engine is no longer a set of case blocks. It is a sequence of labels.
To kickstart the VM, we simply call the DISPATCH() macro once to load the first instruction and jump into the web of labels. From that point on, every label block ends by calling DISPATCH() again.
Here is an example:
// ...
static void* dispatchTable[] = {
[OP_MOVE] = &&DO_OP_MOVE,
};
// ...
DISPATCH();
DO_OP_MOVE:
{
R(GET_ARG_A(instruction)) = R(GET_ARG_B(instruction));
} DISPATCH();
Once the operation is complete, it hits DISPATCH(). The macro fetches the next instruction, decodes it, and goto jumps directly to the next block.
In a switch, every instruction jumps to the end of the block, then back up to the while header, evaluates the switch again, and jumps down to the next case.
With a computed goto, the dispatch happens at the end of every individual handler, so the CPU's branch predictor tracks the jump history of each specific instruction independently. For example, it learns that OP_LT is often followed by OP_JMP_IF_FALSE, and optimizes the pipeline accordingly.
Writing Arguments to Memory
We rely on a set of extraction macros (defined in instruction.h) to pull out the exact integers we need:
// Getters (note the parentheses around the i argument for macro hygiene)
#define GET_OPCODE(i) ((OpCode)(((i) >> POS_OP) & MASK_OP))
#define GET_ARG_A(i) ((int)(((i) >> POS_A) & MASK_A))
#define GET_ARG_B(i) ((int)(((i) >> POS_B) & MASK_B))
#define GET_ARG_C(i) ((int)(((i) >> POS_C) & MASK_C))
#define GET_ARG_Bx(i) ((int)(((i) >> POS_BX) & MASK_BX))
#define GET_ARG_sBx(i) (GET_ARG_Bx(i) - OFFSET_sBx)
Extracting the integers from an instruction is only half the battle. The VM needs to know what those numbers actually represent.
As we discussed before, these integers are indices into the current CallFrame's register window, which we resolve into actual memory addresses with the R(n) macro. But a function's bytecode also references static data: a string literal like "Hello", or a large number like 3.14159. These values don't live in registers; they live in the Chunk's constant pool (the constants.values array). To make accessing the constant pool just as seamless as accessing registers, we define a second crucial macro at the top of vm/vm.c:
#define K(n) (frame->closure->func->chunk.constants.values[(n)])
Let's break down this long macro: it looks at the current window (frame) and grabs the active function from the closure (->closure->func). From there it reaches the constant pool attached to that function (->chunk.constants.values[(n)]) and indexes into it.
By combining the extraction macros, our register mapping R(n), and our constant pool mapping K(n), we can write incredibly expressive execution blocks.
Consider the OP_LOADK instruction, which the compiler emits when it needs to load a static constant into a register. Here is how we write its entire handler:
DO_OP_LOADK:
{
R(GET_ARG_A(instruction)) = K(GET_ARG_Bx(instruction));
} DISPATCH();
This single line extracts the 18-bit Bx argument, uses it to fetch a Value from the constant pool, extracts the 8-bit A argument, and copies the value into that slot of the current frame's stack window.
Wiring Simple OpCodes
We now have a fast dispatch loop and the macros necessary to decode operands and map them to memory. The final step of building the core VM is writing the operational logic behind the goto labels.
Loading Primitive Values
We use OP_LOADK to fetch strings or large numbers from the constant pool. But for simple primitive values, like the frequently used null, true, and false, a round trip to the constant pool is a waste of memory and cache.
Instead, the compiler packs the primitive data directly into the operands of the instruction itself:
DO_OP_LOADBOOL:
{
R(GET_ARG_A(instruction)) = BOOL_VAL(GET_ARG_B(instruction));
if(GET_ARG_C(instruction)) frame->ip++; // skip next instruction if C != 0
} DISPATCH();
Here, the B operand isn't a register index; it's the actual boolean value (true or false). The BOOL_VAL() macro wraps it into our dynamic Value struct, and we drop it directly into Register A. If C is non-zero, the VM physically skips the next instruction in the bytecode array. This is a specialized optimization often used when compiling short-circuiting boolean logic (like a and b), letting the compiler pack a load and a jump into a single instruction.
In many virtual machines, compiling an expression like x = a < b makes the < operator evaluate the comparison and write true or false directly into the destination register x. But if you look closely at how VMs like the Lua VM handle comparison operators such as OP_LT (Less Than), they do not store a result anywhere. Instead, OP_LT acts as a Test and Skip:
DO_OP_LT:
{
Value b = R(GET_ARG_B(instruction));
Value c = R(GET_ARG_C(instruction));
int expect = GET_ARG_A(instruction);
if(!IS_NUM(b) || !IS_NUM(c)){
runtimeError(vm, "Operands must be numbers.");
return VM_RUNTIME_ERROR;
}
if((AS_NUM(b) < AS_NUM(c)) != expect){
frame->ip++;
}
} DISPATCH();
Instead of saving a boolean, OP_LT evaluates the condition. If the condition fails to match our expectation, it increments the Instruction Pointer (ip++), physically skipping whatever bytecode instruction comes next.
If OP_LOADBOOL had no skip feature, compiling x = a < b would force the compiler to emit extra jumps to route the logic:
- OP_LT 1, A, B: Expect true. If A < B is false, skip the next instruction.
- OP_JMP: Jump 2 slots (to the "load true" below).
- OP_LOADBOOL X, 0: Load false into X.
- OP_JMP: Jump 1 slot (over the "load true").
- OP_LOADBOOL X, 1: Load true into X.
Because OP_LOADBOOL can optionally act as a jump (by passing 1 to the C argument), we completely eliminate the need for OP_JMP instructions. The compiler can pack the evaluation of x = a < b into just three tightly bound instructions:
- OP_LT 1, A, B: Expect true. If A < B is false, skip the next instruction (OP_LOADBOOL X, 1, 1).
- OP_LOADBOOL X, 1, 1: Load true into X… and skip the next instruction (OP_LOADBOOL X, 0, 0).
- OP_LOADBOOL X, 0, 0: Load false into X.
Unlike OP_LOADBOOL, loading a null is more straightforward:
DO_OP_LOADNULL:
{
int a = GET_ARG_A(instruction);
int b = GET_ARG_B(instruction);
for(int i = 0; i <= b; i++){
R(a + i) = NULL_VAL;
}
} DISPATCH();
OP_LOADNULL is designed for bulk initialization: a is the starting register index, and b counts how many additional consecutive registers to set to null.
The Binary Macro
A massive portion of a virtual machine’s job is basic arithmetic: subtraction, multiplication, division, modulo.
To execute a subtraction (OP_SUB 0, 1, 2), the VM must:
- Fetch the Value from Register 1 (B).
- Fetch the Value from Register 2 (C).
- Verify that both values are valid numbers.
- Extract the raw C double from the structs.
- Perform the C math (b - c).
- Wrap the result back into a Value struct.
- Store it in Register 0 (A).
We can wrap this exact sequence into a macro defined right above the dispatch loop:
#define BI_OP(type, op) \
do { \
Value b = R(GET_ARG_B(instruction)); \
Value c = R(GET_ARG_C(instruction)); \
if(!IS_NUM(b) || !IS_NUM(c)){ \
runtimeError(vm, "Operands must be numbers."); \
return VM_RUNTIME_ERROR; \
} \
R(GET_ARG_A(instruction)) = type(AS_NUM(b) op AS_NUM(c)); \
} while(false)
By abstracting the runtime type checking and unwrapping, our actual opcode handlers become one-liners:
DO_OP_SUB: BI_OP(NUM_VAL, -); DISPATCH();
DO_OP_MUL: BI_OP(NUM_VAL, *); DISPATCH();
You might wonder why we didn’t include OP_ADD in the BI_OP list. In many dynamic languages, the + operator is heavily overloaded. If you add two numbers, it performs math. If you add two strings, it concatenates them.
Because OP_ADD must handle dynamic string allocation, it requires its own dedicated block:
DO_OP_ADD:
{
Value b = R(GET_ARG_B(instruction));
Value c = R(GET_ARG_C(instruction));
if(IS_NUM(b) && IS_NUM(c)){
double result = AS_NUM(b) + AS_NUM(c);
R(GET_ARG_A(instruction)) = NUM_VAL(result);
}else if(IS_STRING(b) || IS_STRING(c)){
// string operations ...
}else{
runtimeError(vm, "Operands must be two numbers or two strings.");
return VM_RUNTIME_ERROR;
}
} DISPATCH();
The VM checks the types dynamically at runtime. This is exactly what makes the language "dynamically typed": the compiler doesn't know what + does; it just emits OP_ADD. The Virtual Machine figures out the context on the fly.
Handling Errors
To actually prove our virtual machine works, we need a way to peer inside those registers and print the results, a way to gracefully stop the engine, and a safety net for when things go wrong.
In the early stages of building a language, the easiest way to inspect state is to hardcode a print instruction directly into the bytecode.
When the compiler parses a print statement, it emits OP_PRINT targeting the register holding the evaluated expression. Here is how the VM handles it in vm/vm.c:
DO_OP_PRINT:
{
int a = GET_ARG_A(instruction);
printValue(R(a));
printf("\n");
} DISPATCH();
Because our Value struct cleanly tags data as numbers, booleans, or nulls, the printValue() helper function in core/value.c can easily inspect the tag and route it to the correct C printf formatter.
void printValue(Value value){
if(IS_NULL(value)){
printf("null");
}else if(IS_BOOL(value)){
printf(AS_BOOL(value) ? "true" : "false");
}else if(IS_NUM(value)){
printf("%.14g", AS_NUM(value));
}else if(IS_OBJECT(value)){
printObject(value);
}else{
printf("Unknown value type");
}
}
Return
Our dispatch loop is a continuous goto web. It will blindly read memory and execute instructions forever unless we explicitly tell it to stop.
Every valid Chunk generated by our compiler must end with an OP_RETURN instruction. For a top-level script (ignoring function calls for a moment), hitting OP_RETURN means the script has finished executing.
While the full OP_RETURN implementation is quite complex (handling call frames, closures, and deferred statements), its core responsibility at the top level is simply to exit the loop and return a status code:
DO_OP_RETURN:
{
// ...
vm->frameCount--;
if(vm->frameCount == 0){
pop(vm);
return VM_OK;
}
// ...
} DISPATCH();
By dropping the frameCount to 0 and returning VM_OK, the C program breaks out of the run() function entirely, handing control back to the host application.
Runtime Errors
In C, if you try to divide by zero or access memory you shouldn’t, the operating system kills your program with a Segmentation Fault or Floating Point Exception.
A VM cannot allow this: even if the user's script is catastrophic, the VM itself must remain stable. We must catch illegal operations dynamically.
The runtimeError function in vm/vm.c is that safety net. It uses C's stdarg.h to accept format strings and produce highly specific error messages.
void runtimeError(VM* vm, const char* format, ...){
va_list args;
va_start(args, format);
vfprintf(stderr, format, args);
va_end(args);
fputs("\n", stderr);
/* Stack trace printing omitted for now
#ifdef DEBUG_TRACE
if(vm->frameCount > 0){
CallFrame* frame = &vm->frames[vm->frameCount - 1];
size_t instructionOffset = frame->ip - frame->closure->func->chunk.code -1;
int line = getLine(&frame->closure->func->chunk, instructionOffset);
const char* srcName = frame->closure->func->srcName != NULL
? frame->closure->func->srcName->chars
: "<script>";
fprintf(stderr, "Runtime error [%s, line %d]\n", srcName, line);
}else{
fprintf(stderr, "Runtime error [No stack trace available]\n");
}
#endif
*/
resetStack(vm);
}
The most crucial line in this function is the very last one: resetStack(vm). If a script crashes halfway, the global stack array is full of leftover garbage data, and stackTop is pointing somewhere in the middle of memory. By calling resetStack(vm), the VM instantly wipes the slate clean: it points stackTop back at stack[0], resetting the execution environment so the interpreter can safely run the next script.
Wrapping Up
In this article, we finally breathed life into that static Chunk data: we built the Virtual Machine. At this stage, the interpreter can execute a linear sequence of instructions, perform calculations, and print the results to the screen.
However, if you look closely at what this Virtual Machine can currently do, you will notice a harsh truth: it is not yet a programming language. Right now, it is just an over-engineered calculator.
A true programming language must be able to make decisions: it must execute different blocks of code based on dynamic conditions (if / else), and it must be able to repeat tasks (while / for). It will also gain functions, closures, OOP, and much more in the articles ahead.
