Article by Ayman Alheraki on January 11 2026 10:37 AM
github : https://github.com/ForgeLang/LearnSeries
In building a modular interpreter using modern C++20/23, one of the most important engineering tasks is to carefully define and manage dependencies between core components: the lexer, parser, and runtime. These three modules form the heart of any language engine, and poor coupling among them can result in architectural rigidity, untestable code, and difficult feature expansion.
This section describes how to architect dependencies between these components using clear boundaries, modern C++ idioms, and compiler-level guarantees. The goal is to achieve modular cohesion, minimal coupling, and maximum clarity, using idiomatic C++20/23 design principles.
In an interpreter or compiler, each phase of the pipeline must consume only what it needs, and produce outputs suitable for the next stage, without relying on runtime details of other phases. Poor separation typically manifests in:
Lexer calling parser logic (bad design)
Runtime logic embedded into AST nodes
Cyclical dependencies between semantic analysis and evaluation
Direct variable sharing or implicit globals
A clear dependency model eliminates these problems and enables unit testing, parallel development, and future extensions, such as JIT compilation, static analysis, or embedded REPLs.
The architecture follows a layered structure:
┌──────────────┐
│    Source    │
└──────┬───────┘
       ▼
┌──────────────┐
│    Lexer     │ → TokenStream
└──────┬───────┘
       ▼
┌──────────────┐
│    Parser    │ → AST
└──────┬───────┘
       ▼
┌──────────────┐
│  Semantics   │ → Typed AST / Checked AST
└──────┬───────┘
       ▼
┌──────────────┐
│   Runtime    │ → Evaluation, Built-ins, Stack Frames
└──────────────┘

Data flows downward through the diagram, and dependencies point the opposite way: each stage may depend only on the stages that come before it in the pipeline, never on a later one.
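The one-way flow can be made visible in the stage signatures themselves. The following is a minimal sketch, not the project's actual API: the names `lex`, `parse`, `evaluate`, and `run` are hypothetical, and the "language" is a toy in which a program is a list of whitespace-separated integers that the runtime sums. The Semantics stage is omitted for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-ins for each stage's output type.
struct Token { std::string lexeme; };
using TokenStream = std::vector<Token>;
struct Ast { std::vector<long> literals; };  // toy program: a list of integers
struct Value { long number = 0; };

// Lexer: sees only source text. Splits on whitespace.
TokenStream lex(std::string_view source) {
    TokenStream tokens;
    std::size_t i = 0;
    auto is_space = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    while (i < source.size()) {
        if (is_space(source[i])) { ++i; continue; }
        const std::size_t start = i;
        while (i < source.size() && !is_space(source[i])) ++i;
        tokens.push_back(Token{std::string(source.substr(start, i - start))});
    }
    return tokens;
}

// Parser: sees only tokens. std::stol throws on a non-numeric lexeme.
Ast parse(const TokenStream& tokens) {
    Ast ast;
    for (const Token& t : tokens) ast.literals.push_back(std::stol(t.lexeme));
    return ast;
}

// Runtime: sees only the AST. Sums the literals.
Value evaluate(const Ast& ast) {
    Value result;
    for (long n : ast.literals) result.number += n;
    return result;
}

// Driver: the only place where all stages are named together.
Value run(std::string_view source) { return evaluate(parse(lex(source))); }
```

No stage's parameter list mentions a type produced by a later stage, so each function can be compiled and tested against hand-built input.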
Lexer
Purpose:
Accepts a source buffer (std::string_view)
Produces a linear sequence of tokens (TokenStream)
Does not depend on AST, parser, or runtime
Dependencies:
core/SourceManager.hpp for span tracking
core/ErrorReporter.hpp for structured errors
Standard library only
Output:
Token structure with type, lexeme, source location
TokenStream: typically std::vector<Token> or an iterator view
Modern C++ Tools:
std::string_view, std::optional, std::variant for token values
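std::source_location (C++20) for diagnostics about the interpreter's own code (positions in the interpreted program come from the tracked source spans)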
constexpr lexers for testing and compile-time evaluation
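A lexer with these properties might be sketched as follows. This is an illustration under stated assumptions rather than the series' implementation: the `TokenType` members, the `offset` field (standing in for SourceManager span tracking), and the free function `tokenize` are all hypothetical, and error reporting is reduced to emitting an `Unknown` token.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

enum class TokenType { Number, Identifier, Plus, Minus, Star, Slash,
                       LParen, RParen, Unknown, EndOfFile };

struct Token {
    TokenType type;
    std::string_view lexeme;  // view into the source; the buffer must outlive the tokens
    std::size_t offset;       // byte offset of the lexeme in the source
};

using TokenStream = std::vector<Token>;

// Depends on the standard library only: no AST, parser, or runtime headers.
TokenStream tokenize(std::string_view src) {
    auto is_space = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    auto is_digit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
    auto is_ident = [](char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; };

    TokenStream out;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (is_space(c)) { ++i; continue; }
        const std::size_t start = i;
        TokenType type = TokenType::Unknown;
        if (is_digit(c)) {
            while (i < src.size() && is_digit(src[i])) ++i;
            type = TokenType::Number;
        } else if (is_ident(c)) {
            while (i < src.size() && is_ident(src[i])) ++i;
            type = TokenType::Identifier;
        } else {
            ++i;  // single-character token
            switch (c) {
                case '+': type = TokenType::Plus;   break;
                case '-': type = TokenType::Minus;  break;
                case '*': type = TokenType::Star;   break;
                case '/': type = TokenType::Slash;  break;
                case '(': type = TokenType::LParen; break;
                case ')': type = TokenType::RParen; break;
                default:  break;  // stays Unknown; a real lexer would report an error
            }
        }
        out.push_back(Token{type, src.substr(start, i - start), start});
    }
    out.push_back(Token{TokenType::EndOfFile, src.substr(src.size()), src.size()});
    return out;
}
```

Storing `std::string_view` lexemes avoids copying, at the cost of tying token lifetime to the source buffer. A lexer that must outlive its input would store `std::string` instead.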
Parser
Purpose:
Receives a stream of tokens from the lexer
Produces an Abstract Syntax Tree (AST)
Does not call or rely on runtime evaluation
Dependencies:
lexer/Token.hpp
core/SourceManager.hpp
core/ErrorReporter.hpp
Internal AST definitions (ast/Expr.hpp, ast/Stmt.hpp)
Output:
AST nodes represented using std::variant or algebraic types
For example:
using Expr = std::variant<BinaryExpr, LiteralExpr, CallExpr, VarExpr>;

Design Notes:
Use recursive descent with backtracking or lookahead
Report syntax errors gracefully via std::expected (C++23) or error accumulation
Modern C++ Tools:
std::visit for variant traversal, std::monostate for an explicit "empty" alternative
Concepts and constraints for AST transformations
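To make the parser's contract concrete, here is a small recursive descent sketch that turns tokens into a variant-based AST. Several things are assumptions made for illustration: the reduced `TokenType` set, the two-node AST, the `Parser` interface, and the choice of error accumulation (a vector of messages) over `std::expected` so that the code also builds as C++17. Because a `std::variant` cannot directly contain itself, child nodes are held through `std::unique_ptr`.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

enum class TokenType { Number, Plus, Star, End };
struct Token { TokenType type; std::string lexeme; };

struct Expr;
using ExprPtr = std::unique_ptr<Expr>;
struct LiteralExpr { long value; };
struct BinaryExpr { char op; ExprPtr lhs; ExprPtr rhs; };
struct Expr { std::variant<LiteralExpr, BinaryExpr> node; };

class Parser {
public:
    explicit Parser(std::vector<Token> tokens) : tokens_(std::move(tokens)) {}

    // expression := term ('+' term)*
    ExprPtr parse_expression() { return parse_chain(TokenType::Plus, '+', &Parser::parse_term); }

    // Messages collected while parsing; empty on success.
    const std::vector<std::string>& errors() const { return errors_; }

private:
    // term := primary ('*' primary)*
    ExprPtr parse_term() { return parse_chain(TokenType::Star, '*', &Parser::parse_primary); }

    // primary := Number
    ExprPtr parse_primary() {
        if (peek().type == TokenType::Number) {
            const long value = std::stol(tokens_[pos_++].lexeme);
            return std::make_unique<Expr>(Expr{LiteralExpr{value}});
        }
        errors_.push_back("expected a number at token " + std::to_string(pos_));
        return nullptr;
    }

    // Parses a left-associative chain: operand (op operand)*
    ExprPtr parse_chain(TokenType op_type, char op, ExprPtr (Parser::*operand)()) {
        ExprPtr lhs = (this->*operand)();
        while (lhs && peek().type == op_type) {
            ++pos_;  // consume the operator
            ExprPtr rhs = (this->*operand)();
            if (!rhs) return nullptr;
            lhs = std::make_unique<Expr>(Expr{BinaryExpr{op, std::move(lhs), std::move(rhs)}});
        }
        return lhs;
    }

    // One token of lookahead; reads past the end yield an End token.
    const Token& peek() const { return pos_ < tokens_.size() ? tokens_[pos_] : end_; }

    std::vector<Token> tokens_;
    std::vector<std::string> errors_;
    Token end_{TokenType::End, ""};
    std::size_t pos_ = 0;
};
```

Nothing here includes a runtime header or evaluates anything: the parser's only products are the tree and the list of error messages.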
Runtime
Purpose:
Receives a fully parsed and semantically valid AST
Executes the program in an environment model
Provides built-in functions, memory, control flow, stack
Dependencies:
AST definitions
Symbol table or semantic results
core/Environment.hpp, core/Value.hpp, and runtime data structures
Key Responsibilities:
Interprets AST using a visitor or evaluation engine
Manages a scoped environment (stack frames, variables, closures)
Provides built-in I/O and standard library functions
Output:
Result of program execution (Value)
Can be used for REPL or scripting interface
Modern C++ Tools:
std::variant to represent runtime values:
using Value = std::variant<IntValue, FloatValue, BoolValue, StringValue, FunctionValue>;
std::function or lambdas for native function calls
std::jthread, std::future, or coroutine-based async features for concurrency (optional)
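The scoped environment mentioned above can be sketched with a chain of scopes, each holding a pointer to its parent. This is a simplified illustration: `Value` here uses built-in types instead of the `IntValue`/`FloatValue` wrappers, and the `Environment` interface (`define`, `assign`, `lookup`) is hypothetical rather than the actual contents of core/Environment.hpp.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

// Simplified runtime value; function values are left out of this sketch.
using Value = std::variant<long, double, bool, std::string>;

class Environment {
public:
    explicit Environment(std::shared_ptr<Environment> parent = nullptr)
        : parent_(std::move(parent)) {}

    // Creates or overwrites a variable in this scope only (may shadow an outer one).
    void define(const std::string& name, Value value) {
        vars_.insert_or_assign(name, std::move(value));
    }

    // Updates the nearest existing binding; returns false if the name is unbound.
    bool assign(const std::string& name, Value value) {
        for (Environment* env = this; env != nullptr; env = env->parent_.get()) {
            auto it = env->vars_.find(name);
            if (it != env->vars_.end()) {
                it->second = std::move(value);
                return true;
            }
        }
        return false;
    }

    // Searches this scope, then each enclosing scope in turn.
    std::optional<Value> lookup(const std::string& name) const {
        for (const Environment* env = this; env != nullptr; env = env->parent_.get()) {
            auto it = env->vars_.find(name);
            if (it != env->vars_.end()) return it->second;
        }
        return std::nullopt;
    }

private:
    std::unordered_map<std::string, Value> vars_;
    std::shared_ptr<Environment> parent_;  // shared so closures can keep a scope alive
};
```

A function call would create a fresh `Environment` whose parent is the scope the function was defined in, which is also what lets closures capture variables.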
Let’s say we define libraries like this in CMakeLists.txt:
add_library(ForgeCore ...)
add_library(ForgeLexer ...)
add_library(ForgeParser ...)
add_library(ForgeRuntime ...)

# Dependencies
target_link_libraries(ForgeLexer PUBLIC ForgeCore)
target_link_libraries(ForgeParser PUBLIC ForgeLexer ForgeCore)
target_link_libraries(ForgeRuntime PUBLIC ForgeParser ForgeCore)

This ensures that:
Lexer depends only on Core
Parser depends only on Lexer and Core
Runtime links directly only to Parser and Core (Lexer is reachable transitively through the PUBLIC link)
No module introduces upward or cyclic dependencies
A key architectural boundary lies between:
Parser output (AST)
Runtime input (Value system)
The AST must never carry runtime values. This enforces purity and makes the AST reusable for:
Static analysis
Code formatting
Type checking
Compilation or transpilation
All runtime data is generated and stored after parsing, within the evaluation engine.
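One way to see the benefit is to write two unrelated consumers of the same tree. In the sketch below, which assumes a hypothetical two-node AST and a deliberately tiny `Value` type, a formatter and an evaluator both walk the AST with `std::visit`. Only the evaluator knows that `Value` exists, and the AST node types never mention it.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// --- AST (parser side): pure syntax, no runtime types ---
struct Expr;
using ExprPtr = std::unique_ptr<Expr>;
struct LiteralExpr { long value; };
struct BinaryExpr { char op; ExprPtr lhs; ExprPtr rhs; };  // op is '+' or '*'
struct Expr { std::variant<LiteralExpr, BinaryExpr> node; };

// Convenience constructors for building trees by hand.
ExprPtr lit(long v) { return std::make_unique<Expr>(Expr{LiteralExpr{v}}); }
ExprPtr bin(char op, ExprPtr l, ExprPtr r) {
    return std::make_unique<Expr>(Expr{BinaryExpr{op, std::move(l), std::move(r)}});
}

// --- Consumer 1: a code formatter; needs no runtime at all ---
std::string format(const Expr& e) {
    return std::visit([](const auto& n) -> std::string {
        using T = std::decay_t<decltype(n)>;
        if constexpr (std::is_same_v<T, LiteralExpr>) {
            return std::to_string(n.value);
        } else {
            return "(" + format(*n.lhs) + " " + n.op + " " + format(*n.rhs) + ")";
        }
    }, e.node);
}

// --- Consumer 2: the runtime; Value is defined here, not in the AST ---
struct Value { long number; };

Value evaluate(const Expr& e) {
    return std::visit([](const auto& n) -> Value {
        using T = std::decay_t<decltype(n)>;
        if constexpr (std::is_same_v<T, LiteralExpr>) {
            return Value{n.value};
        } else {
            const long l = evaluate(*n.lhs).number;
            const long r = evaluate(*n.rhs).number;
            return Value{n.op == '+' ? l + r : l * r};  // any op other than '+' is treated as '*'
        }
    }, e.node);
}
```

In a real project the two consumers would live in separate libraries, and the formatter would link against the AST without ever pulling in the runtime.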
To maintain decoupling between syntax and execution, a semantic layer can act as an intermediate verifier:
Ensures type correctness
Infers return types
Validates function signatures
Annotates AST nodes with resolved types or symbol bindings
This layer produces either:
A typed AST (decorated with type info)
Or a symbol table used by the runtime
It can optionally cache or transform parts of the AST for optimization.
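As an illustration of such a layer, the sketch below type-checks a tiny expression language and records each node's resolved type in a side table keyed by node address, leaving the AST itself untouched. The node types, the `Type` enum, `SemanticResult`, and `check` are all hypothetical names chosen for this example.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

enum class Type { Int, Bool, Error };

struct Expr;
using ExprPtr = std::unique_ptr<Expr>;
struct IntLiteral { long value; };
struct BoolLiteral { bool value; };
struct AddExpr { ExprPtr lhs; ExprPtr rhs; };
struct Expr { std::variant<IntLiteral, BoolLiteral, AddExpr> node; };

// Helper for building nodes by hand.
template <typename Node>
ExprPtr make_expr(Node node) { return std::make_unique<Expr>(Expr{std::move(node)}); }

// Output of the semantic pass: annotations live beside the AST, not inside it.
struct SemanticResult {
    std::unordered_map<const Expr*, Type> types;  // node -> resolved type
    std::vector<std::string> errors;
};

// Resolves the type of `e`, recording every visited node in `out.types`.
Type check(const Expr& e, SemanticResult& out) {
    const Type t = std::visit([&out](const auto& n) -> Type {
        using T = std::decay_t<decltype(n)>;
        if constexpr (std::is_same_v<T, IntLiteral>) {
            return Type::Int;
        } else if constexpr (std::is_same_v<T, BoolLiteral>) {
            return Type::Bool;
        } else {
            const Type l = check(*n.lhs, out);
            const Type r = check(*n.rhs, out);
            if (l == Type::Int && r == Type::Int) return Type::Int;
            // Report only at the innermost failing node to avoid cascading errors.
            if (l != Type::Error && r != Type::Error) {
                out.errors.push_back("operands of '+' must both be Int");
            }
            return Type::Error;
        }
    }, e.node);
    out.types[&e] = t;
    return t;
}
```

The runtime can then consult `SemanticResult` (or refuse to run when `errors` is non-empty) without the checker ever depending on runtime headers.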
To maintain a clean runtime interface:
Built-in functions (e.g., print, input, len) are registered explicitly
External native libraries can be loaded dynamically or statically linked
Runtime APIs are defined through interfaces and Value conversions
In C++20/23, this is aided by:
std::function<Value(const std::vector<Value>&)> for native function wrappers
std::span for passing slices safely
Optional use of modules to isolate standard library extensions
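Explicit registration can be sketched as a small table from names to `std::function` wrappers. The `BuiltinRegistry` class, its methods, and the `register_core_builtins` helper are hypothetical, and `Value` is again simplified, with `std::monostate` standing in for "no value". A misuse of `len` returns that empty value here, where a real runtime would more likely raise an error.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

using Value = std::variant<std::monostate, long, std::string>;
using NativeFn = std::function<Value(const std::vector<Value>&)>;

class BuiltinRegistry {
public:
    // Adds or replaces a native function under the given name.
    void register_function(std::string name, NativeFn fn) {
        table_.insert_or_assign(std::move(name), std::move(fn));
    }

    // Calls a registered function; returns std::nullopt if the name is unknown.
    std::optional<Value> call(const std::string& name, const std::vector<Value>& args) const {
        auto it = table_.find(name);
        if (it == table_.end()) return std::nullopt;
        return it->second(args);
    }

private:
    std::unordered_map<std::string, NativeFn> table_;
};

// Registers a `len` builtin: string length, or the empty value on misuse.
void register_core_builtins(BuiltinRegistry& registry) {
    registry.register_function("len", [](const std::vector<Value>& args) -> Value {
        if (args.size() != 1) return std::monostate{};
        if (const auto* s = std::get_if<std::string>(&args[0])) {
            return static_cast<long>(s->size());
        }
        return std::monostate{};
    });
}
```

Because the evaluator only sees `NativeFn`, built-ins can be added, replaced, or mocked in tests without touching the evaluation code.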
Modular dependency design enables:
Lexer unit tests using raw source strings
Parser tests using token streams
AST tests using manually constructed nodes
Runtime tests using evaluation contexts and mock environments
Example: test the parser without involving the runtime:
Lexer lexer(source);
Parser parser(lexer.tokenize());
AST ast = parser.parse_expression();
// Assertions on structure

Managing dependencies between the lexer, parser, and runtime is a foundational engineering task when building a modern interpreter. With modern C++20/23, developers can enforce type-safe boundaries, minimize coupling, and structure their interpreter as a collection of focused, testable units.
By keeping components strictly layered and avoiding runtime dependencies in syntax and analysis phases, we enable a clean, composable, and scalable language infrastructure that supports both growth and correctness.