Building a Register VM Interpreter - Part 6: Single-Pass Pratt Parsing
This guide is part of a series on building a register-based VM interpreter from scratch. In this part, we build the parser: a single-pass engine that reads tokens and emits bytecode directly, with no AST in between.
In the previous parts of this series, we built the static infrastructure. We defined a Value system to represent dynamic types in memory, built a scanner to tokenize source code, and designed a bytecode format that packs executable instructions into chunks. We also built a disassembler for debugging, which gives us a peek into the compiled output and lets us verify our work.
However, a massive chasm currently separates our token stream from our bytecode. The interpreter needs an engine that reads and consumes those tokens, understands the structure and intent of the program, and translates them into the correct sequence of 32-bit instructions. This is the job of the Parser.
In many modern compilers, like those for Java, Python, or even modern C++, parsing is typically a heavy, multi-step process. The compiler scans the tokens and builds a massive, interconnected data structure called an Abstract Syntax Tree (AST). It then runs optimization passes over this tree, and finally traverses the tree nodes one by one to emit bytecode. An AST offers significant advantages for code analysis, transformation, and optimization. It enables precise, structure-aware transformations while preserving program correctness, supports language-agnostic code generation, and improves maintainability. However, generating an AST is an additional step in the compilation process, which adds overhead compared to simpler approaches, and writing an AST system is harder and more complex.
For this interpreter, we are taking a different route. To keep the language fast, memory-efficient, and of course cleanly implemented in C, we are building a single-pass compiler.
In a single-pass architecture, we do not build an AST. There are no intermediate tree nodes sitting in memory. Instead, we read a token, figure out what it means in its context, and emit the bytecode for it right then and there.
The single-pass strategy raises a problem: operator precedence. How does the interpreter parse an expression like 1 + 2 * 3 in a single pass? Without a tree to explicitly group the 2 * 3 into a single node, how does the compiler know to delay the addition and prioritize the multiplication?
To solve this problem elegantly, we can use a technique called Pratt Parsing (also known as Top-Down Operator Precedence). Invented by Vaughan Pratt in 1973, this approach ditches complex grammar rules in favor of a simple, table-driven architecture. In this algorithm, we assign a binding power to every operator. With that, a Pratt parser can navigate infinitely nested expressions, enforce precedence rules, and emit left-associative or right-associative bytecode, all through a single, tight loop.
It is elegant, and perfectly suited for this VM. Let’s dive into how to track the state of the parser and define these binding powers.
Parser State and Rule Table
Before we can start parsing a single expression, we need to set up the environment. A single-pass compiler is inherently stateful: it reads through the source code token by token, making decisions on the fly.
To manage this, we need a way to track our current position in the token stream. In compiler/compiler.h, we define a very lightweight Parser structure:
typedef struct {
    Token pre;     // Previous token
    Token cur;     // Current token
    bool hadError; // Flag to indicate a compilation error occurred
    bool panic;    // Flag to indicate we are in panic mode
} Parser;
Here is a trick: we maintain both a cur (current) and pre (previous) token. In Pratt parsing, we usually look at the cur token to decide which parsing function to call, and then we immediately consume it. By the time the parsing function actually runs, the token it needs to evaluate is sitting in the pre slot. The error flags (hadError and panic) allow us to gracefully catch syntax errors and synchronize the parser without crashing the VM.
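The two-token window can be sketched in isolation. This is a toy model, not the real compiler code: the token stream is a plain array and Token only carries text, but the sliding behavior is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Toy model of the pre/cur token window. Only the sliding behavior
// mirrors the article's Parser; everything else is simplified.
typedef struct { const char* text; } Token;

typedef struct {
    Token pre, cur;       // previous and current token
    const Token* stream;  // toy token stream (an array)
    size_t pos;           // next index to read
} MiniParser;

static void miniAdvance(MiniParser* p){
    p->pre = p->cur;               // cur slides into the pre slot
    p->cur = p->stream[p->pos++];  // pull the next token
}

static void miniInit(MiniParser* p, const Token* stream){
    memset(p, 0, sizeof(*p));
    p->stream = stream;
    miniAdvance(p);  // prime cur with the first token
}
```

After one more miniAdvance, pre holds the token a prefix handler would evaluate, while cur is what the dispatch loop inspects next.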
Binding Power
The entire mechanism of Pratt parsing revolves around binding power, which indicates how tightly an operator holds onto its neighboring values. In C, we can elegantly represent binding power using an enum.
In compiler/compiler.c, we define the Precedence levels from the lowest to highest:
typedef enum {
    PREC_NONE,
    PREC_ASSIGNMENT, // =
    PREC_TERNARY,    // ?:
    PREC_OR,         // or
    PREC_AND,        // and
    PREC_PIPE,       // |>
    PREC_EQUALITY,   // == !=
    PREC_COMPARISON, // < > <= >=
    PREC_TERM,       // + -
    PREC_FACTOR,     // * /
    PREC_UNARY,      // ! - (unary minus)
    PREC_CALL,       // . () []
    PREC_PRIMARY     // identifiers, literals, grouping
} Precedence;
Multiplication (*) is a PREC_FACTOR, while addition (+) is a PREC_TERM. Because PREC_FACTOR is further down the list, it has a larger integer value, meaning it binds more tightly than addition. When the parser encounters 1 + 2 * 3, this numerical difference is what tells the compiler to group the 2 and 3 together.
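Because Precedence is a plain C enum, "binds more tightly" is literally just integer comparison. A quick sanity check (redeclaring the enum so the snippet stands alone):

```c
#include <assert.h>

typedef enum {
    PREC_NONE, PREC_ASSIGNMENT, PREC_TERNARY, PREC_OR, PREC_AND, PREC_PIPE,
    PREC_EQUALITY, PREC_COMPARISON, PREC_TERM, PREC_FACTOR, PREC_UNARY,
    PREC_CALL, PREC_PRIMARY
} Precedence;

// Later enumerators get larger integer values, so a simple ">" asks
// "does a bind more tightly than b?"
static int bindsTighter(Precedence a, Precedence b){
    return a > b;
}
```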
Prefix and Infix
The core revelation of Vaughan Pratt’s algorithm is that a token’s meaning changes depending on where it appears.
For example, consider the minus sign (-):
- In the expression -5, the minus sign appears at the start of the expression. It acts as a prefix (unary) operator.
- In the expression 6 - 5, the minus sign appears between two numbers. It acts as an infix (binary) operator.
To implement this, we can associate every token type with two potential parsing functions: one for acting as a prefix, and another for acting as an infix.
We define this in compiler/compiler.c using a ParseRule struct:
typedef void (*ParseFunc)(Compiler* compiler, ExprDesc* expr, bool canAssign);

typedef struct {
    ParseFunc prefix;      // Function to handle prefix parsing
    ParseFunc infix;       // Function to handle infix parsing
    Precedence precedence; // Precedence of the operator
} ParseRule;
We will cover the ExprDesc struct and canAssign later. For now, just know that ParseFunc is a function pointer that knows how to parse a specific part of an expression.
Lookup Rule Table
Finally, we tie it all together in a massive array called rules[]. This array acts as a lookup table. Given any TokenType, we can instantly retrieve its prefix function, its infix function, and its precedence.
Using C99’s designated initializers, mapping this out is quite readable:
ParseRule rules[] = {
    [TOKEN_LEFT_PAREN] = {handleGrouping, handleCall, PREC_CALL},
    [TOKEN_RIGHT_PAREN] = {NULL, NULL, PREC_NONE},
    [TOKEN_LEFT_BRACE] = {handleMap, NULL, PREC_NONE},
    [TOKEN_RIGHT_BRACE] = {NULL, NULL, PREC_NONE},
    [TOKEN_LEFT_BRACKET] = {handleList, handleIndex, PREC_CALL},
    [TOKEN_COMMA] = {NULL, NULL, PREC_NONE},
    [TOKEN_SEMICOLON] = {NULL, NULL, PREC_NONE},
    [TOKEN_PLUS] = {NULL, handleBinary, PREC_TERM},
    [TOKEN_MINUS] = {handleUnary, handleBinary, PREC_TERM},
    [TOKEN_STAR] = {NULL, handleBinary, PREC_FACTOR},
    [TOKEN_SLASH] = {NULL, handleBinary, PREC_FACTOR},
    [TOKEN_PERCENT] = {NULL, handleBinary, PREC_FACTOR},
    [TOKEN_PLUS_EQUAL] = {NULL, NULL, PREC_NONE},
    [TOKEN_MINUS_EQUAL] = {NULL, NULL, PREC_NONE},
    [TOKEN_PLUS_PLUS] = {handlePrefix, handlePostfix, PREC_CALL},
    [TOKEN_MINUS_MINUS] = {handlePrefix, handlePostfix, PREC_CALL},
    [TOKEN_QUESTION] = {NULL, handleTernary, PREC_TERNARY},
    [TOKEN_NUMBER] = {handleNum, NULL, PREC_NONE},
    [TOKEN_IDENTIFIER] = {handleVar, NULL, PREC_NONE},
    [TOKEN_STRING_START] = {handleString, NULL, PREC_NONE},
    [TOKEN_STRING_END] = {NULL, NULL, PREC_NONE},
    [TOKEN_INTERPOLATION_START] = {NULL, NULL, PREC_NONE},
    [TOKEN_INTERPOLATION_END] = {NULL, NULL, PREC_NONE},
    [TOKEN_INTERPOLATION_CONTENT] = {NULL, NULL, PREC_NONE},
    [TOKEN_NULL] = {handleLiteral, NULL, PREC_NONE},
    [TOKEN_TRUE] = {handleLiteral, NULL, PREC_NONE},
    [TOKEN_FALSE] = {handleLiteral, NULL, PREC_NONE},
    [TOKEN_NOT] = {handleUnary, NULL, PREC_UNARY},
    [TOKEN_NOT_EQUAL] = {NULL, handleBinary, PREC_EQUALITY},
    [TOKEN_EQUAL] = {NULL, handleBinary, PREC_EQUALITY},
    [TOKEN_GREATER] = {NULL, handleBinary, PREC_COMPARISON},
    [TOKEN_LESS] = {NULL, handleBinary, PREC_COMPARISON},
    [TOKEN_GREATER_EQUAL] = {NULL, handleBinary, PREC_COMPARISON},
    [TOKEN_LESS_EQUAL] = {NULL, handleBinary, PREC_COMPARISON},
    [TOKEN_AND] = {NULL, handleAnd, PREC_AND},
    [TOKEN_OR] = {NULL, handleOr, PREC_OR},
    [TOKEN_IMPORT] = {handleImport, NULL, PREC_NONE},
    [TOKEN_DOT] = {NULL, handleDot, PREC_CALL},
    [TOKEN_THIS] = {handleThis, NULL, PREC_NONE},
    [TOKEN_PIPE] = {NULL, handlePipe, PREC_PIPE},
    [TOKEN_FUNC] = {funcExpr, NULL, PREC_NONE},
    [TOKEN_EOF] = {NULL, NULL, PREC_NONE},
};
Designated initializers let you initialize specific elements of an array or members of a structure by explicitly naming their index or field, rather than relying on positional order. The [TOKEN_NAME] syntax specifies the exact array index to initialize, so each entry is placed at the index defined by its corresponding token constant. This pattern is extremely common in compiler development.
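Designated initializers are easy to verify in isolation. One handy side effect, shown in this standalone sketch with made-up token names: entries you do not list are zero-initialized, which for a ParseRule conveniently means {NULL, NULL, PREC_NONE}.

```c
#include <assert.h>
#include <stddef.h>

// Standalone sketch of designated initializers; T_A/T_B/T_C are
// placeholder token names, not the interpreter's real TokenType.
typedef enum { T_A, T_B, T_C, T_COUNT } Tok;
typedef enum { P_NONE, P_TERM } Prec;
typedef void (*Fn)(void);

typedef struct { Fn prefix; Fn infix; Prec prec; } Rule;

static void dummy(void){}

static const Rule toyRules[T_COUNT] = {
    [T_B] = {dummy, NULL, P_TERM},  // only T_B is listed explicitly
};
```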
Take a look at the table. Consider TOKEN_NUMBER, for example:
[TOKEN_NUMBER] = {handleNum, NULL, PREC_NONE},
TOKEN_NUMBER only has a prefix function (handleNum) because a number cannot sit between two expressions to join them. The same goes for TOKEN_STAR: it only has an infix function (handleBinary) because we cannot start an expression with a multiplication sign. But TOKEN_MINUS has both. The parser dynamically chooses which one to call based on where it encounters the minus sign.
With our state tracked and our rules strictly defined, we are ready to build the engine that drives this data.
Parse Precedence
Now we have the rules defined and precedence mapped out; we need the engine to drive them. In a Pratt parser, almost every expression-parsing task is delegated to a single central function called parsePrecedence. Whenever the compiler needs to evaluate an expression, it calls this function and passes in the lowest precedence level.
Prefix Hook
Every valid expression must start with a prefix token. It could be a number, a minus sign for negation, a grouping parenthesis, or a variable name. It cannot start with an infix operator like * or +.
static void parsePrecedence(Compiler* compiler, ExprDesc* expr, Precedence precedence){
    initExpr(expr, EXPR_VOID, 0);
    advance(compiler);
    ParseFunc preRule = getRule(compiler->parser.pre.type)->prefix;
    if(preRule == NULL){
        errorAt(compiler, &compiler->parser.pre, "Expect expression");
        return;
    }
    bool canAssign = precedence <= PREC_ASSIGNMENT;
    preRule(compiler, expr, canAssign);
    // ...
}
When parsePrecedence starts, it immediately consumes the next token using advance(). It then looks up that token in the rule table and grabs its prefix function.
static void advance(Compiler* compiler){
    compiler->parser.pre = compiler->parser.cur;
    while(true){
        compiler->parser.cur = scan();
        if(compiler->parser.cur.type != TOKEN_ERROR){
            break;
        }
        errorAt(compiler, &compiler->parser.cur, "Unexpected token: %.*s");
    }
}
Then getRule is quite straightforward:
static inline ParseRule* getRule(TokenType type){return &rules[type];}
If the prefix rule is NULL, the parser instantly reports an “Expect expression” error. Otherwise, it executes the prefix rule, passing along the expression descriptor and the canAssign flag. This prefix rule does whatever it needs to do: if the token is a number, it loads the constant; if it is a parenthesis, it parses the grouped expression inside. Once preRule finishes, the left side of the expression has been parsed.
Infix Loop
So far, we have parsed the beginning of the expression. Next, we enter the infix loop:
static void parsePrecedence(Compiler* compiler, ExprDesc* expr, Precedence precedence){
    // ...
    bool canAssign = precedence <= PREC_ASSIGNMENT;
    preRule(compiler, expr, canAssign);
    while(precedence <= getRule(compiler->parser.cur.type)->precedence){
        advance(compiler);
        ParseFunc inRule = getRule(compiler->parser.pre.type)->infix;
        inRule(compiler, expr, canAssign);
    }
    // ...
}
This while loop is the key to handling operator precedence and left-associativity without needing an AST.
The loop asks a simple question: “Is the precedence of the next token greater than or equal to the level I am currently parsing?”
If the answer is yes, it calls advance to consume the infix token, looks up its infix rule, and executes it.
To see why this is brilliant, let’s trace the expression 1 + 2 * 3. The compiler starts by calling parsePrecedence. advance consumes 1, and the prefix rule handleNum runs. The loop then checks the next token, +, whose precedence is PREC_TERM. Since PREC_TERM is greater than or equal to PREC_ASSIGNMENT, the loop enters. advance consumes +, which calls handleBinary (the infix rule for +).
Here is the recursive trick: inside handleBinary, we need to parse the right-hand side. It recursively calls parsePrecedence, but passes a higher precedence: PREC_TERM + 1 (which is PREC_FACTOR).
static void handleBinary(Compiler* compiler, ExprDesc* expr, bool canAssign){
    TokenType type = compiler->parser.pre.type;
    ParseRule* rule = getRule(type);
    ExprDesc right;
    // parse the right-hand side, consuming only operators of higher precedence
    parsePrecedence(
        compiler,
        &right,
        (Precedence)(rule->precedence + 1)
    );
    // ...
}
While parsing 2, the inner parsePrecedence consumes 2 and runs handleNum. The inner loop checks the next token, *. Its precedence is PREC_FACTOR. Since PREC_FACTOR >= PREC_FACTOR (the current level), the loop enters. advance() consumes * and calls handleBinary for the multiplication, which parses 3. The inner loop then sees the end of the expression (or a lower-precedence token) and returns. The multiplication is grouped and emitted before the outer handleBinary finishes emitting the addition.
Because handleBinary passes precedence + 1 down to the next level, it strictly enforces that higher-precedence operators (like *) bind their operands before lower-precedence operators (like +) get a chance to finish.
This tight, elegant loop is the entire reason we can skip building an AST.
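To make the loop concrete, here is a toy Pratt evaluator, stripped down to single-digit numbers, + and *. It computes values directly instead of emitting bytecode, and none of these names are the interpreter's real API, but the control flow is the same parsePrecedence shape:

```c
#include <assert.h>
#include <ctype.h>

static const char* src;  // cursor into the toy source string

static int precOf(char c){
    switch(c){
        case '+': return 1;  // plays the role of PREC_TERM
        case '*': return 2;  // plays the role of PREC_FACTOR
        default:  return 0;  // PREC_NONE: stops the infix loop
    }
}

static int parsePrec(int prec){
    // prefix step: the only prefix "rule" here is a digit literal
    assert(isdigit((unsigned char)*src));
    int left = *src++ - '0';
    // infix loop: continue while the next operator binds tightly enough
    while(precOf(*src) >= prec){
        char op = *src++;
        // the +1 bump keeps the operators left-associative
        int right = parsePrec(precOf(op) + 1);
        left = (op == '+') ? left + right : left * right;
    }
    return left;
}

static int evalExpr(const char* s){
    src = s;
    return parsePrec(1);  // start at the lowest operator precedence
}
```

evalExpr("1+2*3") returns 7, because the inner call at precedence 2 swallows 2*3 before the outer addition completes.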
Rules in Action
Let’s look at how we implement these prefix rules in C, starting from the simplest case and moving to recursive ones.
Number Literals
When the parser encounters a TOKEN_NUMBER, parsePrecedence calls the prefix rule handleNum.
Here is the implementation in compiler/compiler.c:
static void handleNum(Compiler* compiler, ExprDesc* expr, bool canAssign){
    double value = strtod(compiler->parser.pre.head, NULL);
    int constIndex = makeConstant(compiler, NUM_VAL(value));
    initExpr(expr, EXPR_K, constIndex);
}
We use the standard library strtod to convert the token’s text into a C double.
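A detail worth noting: strtod stops at the first character that cannot be part of a number, so the token's head pointer can point into the middle of the source buffer and strtod will read only the number itself. A quick standalone check:

```c
#include <assert.h>
#include <stdlib.h>

// strtod reads a leading number and ignores the rest of the string,
// which is exactly what we need when the token text is a slice of the
// larger source buffer.
static double tokenToNum(const char* head){
    return strtod(head, NULL);
}
```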
Grouping Parentheses
If an expression starts with an opening parenthesis, like (1 + 2), the rule for TOKEN_LEFT_PAREN points to handleGrouping:
static void handleGrouping(Compiler* compiler, ExprDesc* expr, bool canAssign){
    expression(compiler, expr);
    consume(compiler, TOKEN_RIGHT_PAREN, "Expected ')' after expression");
}
That is the entire function. It relies on a helper function called expression(), which is simply a wrapper around parsePrecedence:
static void expression(Compiler* compiler, ExprDesc* expr){
    parsePrecedence(compiler, expr, PREC_ASSIGNMENT);
}
By calling expression(), the parser recursively drops back into the main parsePrecedence loop at the lowest precedence level (PREC_ASSIGNMENT).
It will continuously parse tokens (like 1, +, and 2) until it hits a token that has a lower precedence than PREC_ASSIGNMENT—which, in this case, is the closing parenthesis ). Once the inner expression finishes, handleGrouping simply calls consume() to swallow that closing parenthesis.
Delaying Unary
For unary operators like - (negation) and ! (logical NOT), when the parser sees one of these prefix tokens, it triggers handleUnary:
static void handleUnary(Compiler* compiler, ExprDesc* expr, bool canAssign){
    TokenType type = compiler->parser.pre.type;
    parsePrecedence(compiler, expr, PREC_UNARY);
    /*
    // Register allocation code, covered in a later part
    expr2NextReg(compiler, expr);
    freeExpr(compiler, expr);
    reserveReg(compiler, 1);
    */
    int targetReg = getFreeReg(compiler) - 1;
    // emit bytecode
    switch(type){
        case TOKEN_MINUS:
            emitABC(compiler, OP_NEG, targetReg, expr->data.loc.index, 0);
            break;
        case TOKEN_NOT:
            emitABC(compiler, OP_NOT, targetReg, expr->data.loc.index, 0);
            break;
        default: return; // Should not reach here
    }
    initExpr(expr, EXPR_REG, targetReg);
}
Take -1 as an example. We call parsePrecedence with PREC_UNARY, which guarantees we only consume tokens that bind at least as tightly as a unary operator. It consumes the number (via handleNum) and returns. In the switch statement, we emit the actual OP_NEG bytecode instruction, telling the VM to negate the value in the operand’s register and store it in targetReg.
Through these prefix rules, we can seamlessly parse the beginnings of any expression. But math rarely stops at a prefix.
Infix Rules
When the parser encounters a token like + or *, the left-hand side of the expression has already been completely parsed. To consume the operator and the right-hand side, the loop calls the token’s infix rule, handleBinary:
static void handleBinary(Compiler* compiler, ExprDesc* expr, bool canAssign){
    TokenType type = compiler->parser.pre.type;
    ParseRule* rule = getRule(type);
    ExprDesc right;
    // parse the right-hand side, consuming only operators of higher precedence
    parsePrecedence(compiler, &right, (Precedence)(rule->precedence + 1));
    switch(type){
        case TOKEN_PLUS:    emitBinaryOp(compiler, OP_ADD, expr, &right); break;
        case TOKEN_MINUS:   emitBinaryOp(compiler, OP_SUB, expr, &right); break;
        case TOKEN_STAR:    emitBinaryOp(compiler, OP_MUL, expr, &right); break;
        case TOKEN_SLASH:   emitBinaryOp(compiler, OP_DIV, expr, &right); break;
        case TOKEN_PERCENT: emitBinaryOp(compiler, OP_MOD, expr, &right); break;
        case TOKEN_EQUAL:
        case TOKEN_NOT_EQUAL:
        case TOKEN_GREATER:
        case TOKEN_LESS:
        case TOKEN_GREATER_EQUAL:
        case TOKEN_LESS_EQUAL: {
            OpCode op;
            int expectTrue = 1;
            switch(type){
                case TOKEN_EQUAL:         op = OP_EQ; break;
                case TOKEN_NOT_EQUAL:     op = OP_EQ; expectTrue = 0; break;
                case TOKEN_LESS:          op = OP_LT; break;
                case TOKEN_LESS_EQUAL:    op = OP_LE; break;
                case TOKEN_GREATER:       op = OP_LE; expectTrue = 0; break;
                case TOKEN_GREATER_EQUAL: op = OP_LT; expectTrue = 0; break;
                default:                  op = OP_EQ; break; // Should not reach here
            }
            // ...
            break;
        }
        default: return;
    }
}
This function identifies the operator, parses the right operand, and emits the instruction. The most important line in this entire parser is the recursive call to parsePrecedence. Notice that when handleBinary calls parsePrecedence to parse the right-hand side, it does not pass the operator’s current precedence; it passes rule->precedence + 1. Mathematical operators in programming are usually left-associative: if you write 1 + 2 + 3, it should be evaluated as (1 + 2) + 3. This +1 trick forces handleBinary to finish executing, emitting the OP_ADD for 1 + 2, before the outer loop consumes the second +:
- The parser reads 1.
- The loop in parsePrecedence enters because + has PREC_TERM.
- handleBinary is called. It recursively calls parsePrecedence with PREC_TERM + 1.
- This inner parsePrecedence reads the 2.
- The inner parser is now looking at the next token, which is the second +. It checks whether the precedence of + (PREC_TERM) is greater than or equal to the requested level PREC_FACTOR (PREC_TERM + 1). The answer is no.
- Because the condition fails, the inner parsePrecedence immediately stops and returns.
If we had not added the + 1, the inner loop would have swallowed the second +, resulting in right-associativity, 1 + (2 + 3), which would be mathematically incorrect for subtraction and division.
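The effect of the bump is easy to demonstrate with a toy subtraction parser (hypothetical names, digits and - only): passing bump = 1 reproduces the article's rule->precedence + 1 behavior, while bump = 0 shows the broken right-associative grouping.

```c
#include <assert.h>
#include <ctype.h>

static const char* s;  // cursor into the toy source string

// Parse digits separated by '-'. "bump" is 1 for the article's +1 trick,
// or 0 to show what goes wrong without it.
static int parseSub(int minPrec, int bump){
    assert(isdigit((unsigned char)*s));
    int left = *s++ - '0';
    while(*s == '-' && 1 >= minPrec){  // '-' has precedence 1 here
        s++;
        int right = parseSub(1 + bump, bump);
        left -= right;
    }
    return left;
}

static int evalSub(const char* src, int bump){
    s = src;
    return parseSub(1, bump);
}
```

With the bump, 9-3-2 evaluates left-associatively to (9-3)-2 = 4; without it, the inner call swallows the second -, producing 9-(3-2) = 8.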
Emitting
Once the parsePrecedence returns, the ExprDesc right contains the parsed right-hand side. Now we just need to tie the left and right sides together. We do this via a helper function, emitBinaryOp:
static void emitBinaryOp(Compiler* compiler, OpCode op, ExprDesc* left, ExprDesc* right){
    /*
    // allocating registers
    expr2NextReg(compiler, left);
    expr2NextReg(compiler, right);
    freeExpr(compiler, right);
    freeExpr(compiler, left);
    */
    int instructionIndex = emitABC(compiler, op, 0, left->data.loc.index, right->data.loc.index);
    left->type = EXPR_TBD;
    left->data.loc.index = instructionIndex;
}
From the compiler’s perspective, the job is done. It uses the macros we defined previously (like CREATE_ABC) to pack the OpCode and the register indices into a single 32-bit integer, and appends it to the Chunk’s bytecode array using emitABC.
Exactly how the Virtual Machine will read this 32-bit instruction and perform the math is a problem for the runtime engine, which we will build later.
The Assignment Guard
Consider the following expression:
a * b = c;
Syntactically and mathematically, this is nonsense. You cannot assign a value to the result of a multiplication.
Since we are doing a single-pass compilation, we don’t have a tree to inspect. If we were building an AST, catching this error would require traversing the generated tree and verifying that the left side of every = node is a valid assignment target (like a variable or property).
In a Pratt parser, we define a bool canAssign variable that is passed around to every parsing function. Let’s take a look back at the very top of parsePrecedence:
if(preRule == NULL){
    errorAt(compiler, &compiler->parser.pre, "Expect expression");
    return;
}
bool canAssign = precedence <= PREC_ASSIGNMENT;
preRule(compiler, expr, canAssign);
while(precedence <= getRule(compiler->parser.cur.type)->precedence){
    advance(compiler);
    ParseFunc inRule = getRule(compiler->parser.pre.type)->infix;
    inRule(compiler, expr, canAssign);
}
This boolean is only true if the current precedence level is low enough to allow assignment.
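The flag is pure arithmetic on the enum. A standalone sketch (redeclaring Precedence so it compiles on its own):

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    PREC_NONE, PREC_ASSIGNMENT, PREC_TERNARY, PREC_OR, PREC_AND, PREC_PIPE,
    PREC_EQUALITY, PREC_COMPARISON, PREC_TERM, PREC_FACTOR, PREC_UNARY,
    PREC_CALL, PREC_PRIMARY
} Precedence;

// canAssign is true only when parsing started at assignment level or
// below; any nested call with a tighter precedence turns it off.
static bool computeCanAssign(Precedence p){
    return p <= PREC_ASSIGNMENT;
}
```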
When we have a * b parsed, we reach the bottom of the outer parsePrecedence function:
if(canAssign){
    if(match(compiler, TOKEN_ASSIGN)){
        ExprDesc valExpr;
        expression(compiler, &valExpr);
        storeVar(compiler, expr, &valExpr);
        *expr = valExpr;
    }
    else if(match(compiler, TOKEN_PLUS_EQUAL)){
        // ...
When the parser hits the *, it dives into the inner loop (handleBinary) and calls parsePrecedence(PREC_FACTOR). In this inner loop, canAssign becomes false because PREC_FACTOR <= PREC_ASSIGNMENT is false. This prevents the parser from incorrectly grouping the expression as a * (b = c). The inner loop refuses to touch the = sign and simply returns the parsed b.
Because the outer loop was called with PREC_ASSIGNMENT, its canAssign flag is still true. It sees the = token, matches it, and tries to execute the assignment block. However, this does not cause a bug: the expr we pass to storeVar represents the result of a * b, which is not a variable. Inside storeVar, we simply check the expression type:
static void storeVar(Compiler* compiler, ExprDesc* var, ExprDesc* val){
    switch(var->type){
        // ...
        default:
            errorAt(compiler, &compiler->parser.pre, "Invalid assignment target.");
            break;
    }
}
This may be confusing, so in short: canAssign tracks the context, not the target. The flag does not look at the variable itself; it checks whether assignment is legal at the current precedence level. canAssign == false forces the expression into the correct shape (preventing a * (b = c)). Then storeVar enforces the semantic rules, rejecting a math result as an assignment target.
So when the parser meets = in the outer loop, canAssign must be true. If it were arbitrarily set to false, the compiler would not know how to parse the = token at all, leading to a confusing, generic “Unexpected token” error instead of “Invalid assignment target”.
The beauty of the canAssign flag combined with expression types is that we get the semantic validation of an AST compiler with the raw speed and simplicity of a single-pass parser.
Wrapping Up
With Vaughan Pratt’s algorithm, we have transformed a flat stream of tokens into deeply structured, precedence-aware operations using just a few dozen lines of C and a lookup table.
We can parse groupings, enforce left-associativity, and block invalid assignments—all without allocating a single AST node in memory.
But throughout this article, we have danced around ExprDesc, temporary slots, and more. These major topics will be covered in later parts on register allocation and expression descriptors.
