Article by Ayman Alheraki on January 11 2026 10:37 AM
This is a modular, scalable code generator for ForgeVM, focusing on direct translation from a high-level representation (AST → IR) to native assembly or machine code for x86-64 and ARM64.
```
[ Source Code (Any Language) ]
              ↓
[ ForgeVM Frontend Parser & AST Builder ]
              ↓
[ ForgeVM Intermediate Representation (IR) ]
              ↓
┌────────────┬────────────┬────────────┐
│   x86_64   │   ARM64    │   RISC-V   │  ← Backends
└────────────┴────────────┴────────────┘
              ↓
[ Assembly or Raw Machine Code ]
              ↓
[ Executable / Binary ]
```

You skip the virtual machine and intermediate bytecode, and directly generate native code via IR → backend.
This is a minimal and portable abstraction of instructions, general enough to represent logic across all CPUs.
- Easy to parse and serialize (JSON or binary)
- Directly translatable to native CPU instructions
- Not Turing-complete on its own: it represents only straight-line actions, with no loops or conditionals unless they are explicitly encoded
```json
[
  { "op": "mov",  "dest": "r1", "value": 42 },
  { "op": "mov",  "dest": "r2", "value": 10 },
  { "op": "add",  "dest": "r1", "src": "r2" },
  { "op": "call", "func": "print", "args": ["r1"] },
  { "op": "ret" }
]
```

| IR Instruction | Description |
|---|---|
| mov | Move constant or register value to another register |
| add / sub | Arithmetic operations |
| mul / div | Optional: arithmetic ops |
| call | Call external function |
| ret | Return from function |
| cmp, jmp, je, jne | Conditional execution (phase 2) |
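Before writing a backend, it helps to pin down how this IR looks in memory. The following is a minimal sketch, assuming hypothetical IRInstruction and IRProgram types (the same names used by the generator excerpt further below); it is one possible layout, not a fixed ForgeVM API.

```cpp
#include <string>
#include <vector>

// One IR instruction as loaded from the JSON form shown above.
// Fields that a given opcode does not use are simply left empty.
struct IRInstruction {
    std::string op;                 // "mov", "add", "call", "ret", ...
    std::string dest;               // destination virtual register, e.g. "r1"
    std::string src;                // source virtual register, e.g. "r2"
    long long   value = 0;          // immediate constant for "mov"
    bool        has_value = false;  // distinguishes "mov reg, imm" from "mov reg, reg"
    std::string func;               // callee name for "call"
    std::vector<std::string> args;  // argument registers for "call"
};

// A flat, straight-line program: exactly what the JSON array encodes.
struct IRProgram {
    std::vector<IRInstruction> instructions;
};
```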
| IR Reg | x86-64 Reg |
|---|---|
| r1 | rax |
| r2 | rbx |
| r3 | rcx |
| ... | r8–r15 |
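A sketch of this mapping in code, using a hypothetical map_reg_x86 helper; virtual registers beyond r3 are assumed to spill into r8–r15 as the table suggests.

```cpp
#include <stdexcept>
#include <string>

// Map a virtual IR register name ("r1", "r2", ...) to an x86-64 register name.
// r1..r3 map to rax/rbx/rcx; r4 and above are placed in r8..r15.
std::string map_reg_x86(const std::string& ir_reg) {
    if (ir_reg == "r1") return "rax";
    if (ir_reg == "r2") return "rbx";
    if (ir_reg == "r3") return "rcx";
    int n = std::stoi(ir_reg.substr(1));           // "r4" -> 4
    if (n >= 4 && n <= 11) return "r" + std::to_string(n + 4);  // r4 -> r8 ... r11 -> r15
    throw std::runtime_error("no x86-64 register for " + ir_reg);
}
```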
| IR | x86_64 ASM |
|---|---|
| mov r1, 42 | mov rax, 42 |
| mov r2, 10 | mov rbx, 10 |
| add r1, r2 | add rax, rbx |
| call print | call print (must be declared) |
| ret | ret |
```asm
section .text
global _start

_start:
    mov rax, 42
    mov rbx, 10
    add rax, rbx
    call print
    ret
```

Use nasm or gas to assemble this into an executable.
Or link against the C runtime if you use a main: entry point instead of _start.
| IR Reg | ARM64 Reg |
|---|---|
| r1 | x0 |
| r2 | x1 |
| r3 | x2 |
| ... | x3–x28 |
| IR | ARM64 ASM |
|---|---|
| mov r1, 42 | mov x0, #42 |
| mov r2, 10 | mov x1, #10 |
| add r1, r2 | add x0, x0, x1 |
| call print | bl print (branch with link) |
| ret | ret |
```asm
.global _start

_start:
    mov x0, #42
    mov x1, #10
    add x0, x0, x1
    bl print
    ret
```

The backend outputs this as a .s file.
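An ARM64 backend producing this text could mirror the x86 generator skeleton shown later in the article. The sketch below assumes the hypothetical IRProgram/IRInstruction types from earlier and a small helper that applies the register table above.

```cpp
#include <ostream>
#include <string>

// Map a virtual IR register ("r1", "r2", ...) to an AArch64 register name,
// following the r1 -> x0, r2 -> x1, ... convention from the table above.
static std::string map_reg_arm64(const std::string& ir_reg) {
    int n = std::stoi(ir_reg.substr(1));   // "r1" -> 1
    return "x" + std::to_string(n - 1);    // r1 -> x0, r2 -> x1, ...
}

// Emit AArch64 assembly text for a straight-line IR program.
// Sketch only: handles just the opcodes used in the running example.
void generate_arm64(const IRProgram& program, std::ostream& out) {
    out << ".global _start\n\n_start:\n";
    for (const auto& instr : program.instructions) {
        if (instr.op == "mov") {
            out << "    mov " << map_reg_arm64(instr.dest)
                << ", #" << instr.value << "\n";
        } else if (instr.op == "add") {
            out << "    add " << map_reg_arm64(instr.dest) << ", "
                << map_reg_arm64(instr.dest) << ", "
                << map_reg_arm64(instr.src) << "\n";
        } else if (instr.op == "call") {
            out << "    bl " << instr.func << "\n";
        } else if (instr.op == "ret") {
            out << "    ret\n";
        }
    }
}
```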
Assemble with:

- nasm for x86_64
- as or clang -c for ARM64

Then link with ld or clang.
For a JIT-style design, allocate RWX memory, write the machine code into it, and execute it (a minimal Linux sketch follows this list):

- On Linux: mmap + mprotect
- On Windows: VirtualAlloc
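A minimal sketch of the Linux path using only mmap and mprotect; the hard-coded bytes encode mov eax, 42 followed by ret on x86-64 and stand in for whatever bytes the backend would actually emit.

```cpp
#include <cstdint>
#include <cstring>
#include <sys/mman.h>

int main() {
    // x86-64 machine code for: mov eax, 42 ; ret
    const uint8_t code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    // Allocate a writable page, copy the code in, then flip it to executable.
    void* mem = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    std::memcpy(mem, code, sizeof(code));
    if (mprotect(mem, 4096, PROT_READ | PROT_EXEC) != 0) return 1;

    // Call the generated code as if it were a normal function.
    auto fn = reinterpret_cast<int (*)()>(mem);
    int result = fn();          // result == 42

    munmap(mem, 4096);
    return result == 42 ? 0 : 1;
}
```

On Windows, the same flow uses VirtualAlloc and VirtualProtect.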
```cpp
#include <ostream>

// Emit NASM-style x86-64 assembly for a straight-line IR program.
// NOTE: skeleton only -- it hard-codes rax/rbx instead of mapping
// instr.dest / instr.src through the register table above.
void generate_x86(const IRProgram& program, std::ostream& out) {
    out << "section .text\n";
    out << "global _start\n\n";
    out << "_start:\n";

    for (const auto& instr : program.instructions) {
        if (instr.op == "mov") {
            out << "    mov rax, " << instr.value << "\n";
        } else if (instr.op == "add") {
            out << "    add rax, rbx\n";
        } else if (instr.op == "call") {
            out << "    call " << instr.func << "\n";
        } else if (instr.op == "ret") {
            out << "    ret\n";
        }
    }
}
```

Next steps for the backend:

- Conditional branches: cmp, jmp, je, etc. (see the sketch after this list)
- Function calls & stack frame layout
- Calling convention support (System V ABI on Linux, Windows x64 ABI)
- Register allocation algorithm
- Live variable analysis for optimization
- Ensure instructions do not break calling conventions
- Analyze for register clobbering or invalid instructions
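One possible shape for the phase-2 branch support: translate cmp/jmp/je/jne IR entries into labeled assembly. The sketch assumes hypothetical target and label fields on the IR instruction and reuses the register-mapping helper sketched earlier.

```cpp
#include <ostream>
#include <string>

// Sketch of how conditional-execution opcodes could be emitted.
// Assumes hypothetical IR fields: dest/src for cmp operands,
// target for the jump destination label, and label for label definitions.
void emit_branch_x86(const IRInstruction& instr, std::ostream& out) {
    if (instr.op == "cmp") {
        out << "    cmp " << map_reg_x86(instr.dest) << ", "
            << map_reg_x86(instr.src) << "\n";
    } else if (instr.op == "jmp") {
        out << "    jmp " << instr.target << "\n";
    } else if (instr.op == "je") {
        out << "    je "  << instr.target << "\n";
    } else if (instr.op == "jne") {
        out << "    jne " << instr.target << "\n";
    } else if (instr.op == "label") {
        out << instr.label << ":\n";
    }
}
```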
```sh
forgevmc -arch x86 -in prog.ir -out prog.s
nasm -f elf64 prog.s -o prog.o
ld prog.o -o prog.out
```
| Purpose | Library |
|---|---|
| Emit machine code | AsmJit, Keystone |
| Parse JSON IR | RapidJSON, nlohmann/json |
| Assembler | NASM, GAS, Clang |
| Optional LLVM link | Use LLVM MC layer if needed |
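For the JSON IR path, nlohmann/json keeps the loader short. A sketch, assuming the IRProgram/IRInstruction layout sketched earlier; field names follow the sample IR above.

```cpp
#include <istream>
#include <string>
#include <utility>
#include <vector>
#include <nlohmann/json.hpp>

// Parse the JSON IR array shown earlier into the in-memory IRProgram sketch.
// Missing keys are treated as absent rather than as errors.
IRProgram parse_ir(std::istream& in) {
    nlohmann::json doc = nlohmann::json::parse(in);
    IRProgram program;
    for (const auto& item : doc) {
        IRInstruction instr;
        instr.op   = item.value("op", "");
        instr.dest = item.value("dest", "");
        instr.src  = item.value("src", "");
        instr.func = item.value("func", "");
        if (item.contains("value")) {
            instr.value = item["value"].get<long long>();
            instr.has_value = true;
        }
        if (item.contains("args")) {
            instr.args = item["args"].get<std::vector<std::string>>();
        }
        program.instructions.push_back(std::move(instr));
    }
    return program;
}
```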
Your ForgeVM code generator:
- Accepts high-level source or IR
- Targets native code directly (x86-64, ARM64)
- Avoids bytecode completely
- Enables modular backend expansion
- Long-term: can evolve into an optimizing native compiler
Alternatively, you can skip the IR entirely and go straight from the AST (Abstract Syntax Tree), or even the parser output, to native code (assembly or machine code). This approach is called direct code generation. It works well in cases like the following:
Tiny Language / DSL (Domain-Specific Language)
If you're designing a small language (e.g., configuration language, scripting for a game engine), you can generate native code directly from syntax trees.
No complex transformations or optimizations are needed.
Educational Compilers
In tutorials or university projects, an IR may be unnecessary; teaching can focus on the basics of parsing and code generation.
1:1 Language Translation (Source-to-Source)
Translators that map language A to language B (e.g., transpiling Pascal to C++) may not need an IR if the grammar and semantics align well.
Extremely Simple VMs
If you're writing a VM that executes instructions directly from a high-level language like Lisp or BASIC, you might interpret the AST or token stream directly.
Special-Purpose Ahead-of-Time Compilers
If you compile only a small subset of a language, or compile fixed templates (e.g., SQL query compilers), you can emit machine code directly.
However, skipping the IR has real costs:

No Cross-Architecture Abstraction
Without IR, you must generate a new backend for every CPU architecture directly from the AST or parser. That makes multi-targeting harder.
Limited Optimizations
You lose a centralized phase where optimizations like dead-code elimination, constant folding, inlining, or register allocation could happen.
Difficult to Reuse Logic Across Architectures
If you want to support both x86 and ARM, it’s harder without a neutral IR between the parser and the backend.
Less Debugging Flexibility
You cannot inspect or analyze intermediate steps, which hurts debugging and testing.
| Criteria | Skip IR? |
|---|---|
| Language size: Small | Yes |
| Performance: Low demand | Yes |
| Portability: Not required | Yes |
| You control all target platforms | Yes |
| You need optimization or architecture independence | No |
Let’s say you have a very small scripting language:
```
print(1 + 2)
```

You can generate x86-64 directly:
```asm
mov rax, 1
add rax, 2
call print
ret
```

You don’t need to convert 1 + 2 into IR like:
xxxxxxxxxx{ "op": "add", "src1": "const 1", "src2": "const 2" }Just generate the machine code or assembly on the fly.
You can skip IR if:
- Your language is small, simple, or static.
- You are targeting only one platform.
- You don’t need cross-architecture support or aggressive optimizations.
But for large, portable, or optimized languages, an IR is strongly recommended.