Article by Ayman Alheraki on January 11 2026 10:37 AM

#9 Foundation and Architecture: Language Implementation Project Structure -> Dependency Management — Lexer, Parser, Runtime

GitHub: https://github.com/ForgeLang/LearnSeries

In building a modular interpreter using modern C++20/23, one of the most important engineering tasks is to carefully define and manage dependencies between core components: the lexer, parser, and runtime. These three modules form the heart of any language engine, and tight coupling among them leads to architectural rigidity, untestable code, and difficult feature expansion.

This section describes how to architect dependencies between these components using clear boundaries, modern C++ idioms, and compiler-level guarantees. The goal is to achieve modular cohesion, minimal coupling, and maximum clarity, using idiomatic C++20/23 design principles.

1. Why Dependency Management Matters in a Language Project

In an interpreter or compiler, each phase of the pipeline must consume only what it needs, and produce outputs suitable for the next stage, without relying on runtime details of other phases. Poor separation typically manifests in:

  • Lexer calling parser logic (bad design)

  • Runtime logic embedded into AST nodes

  • Cyclical dependencies between semantic analysis and evaluation

  • Direct variable sharing or implicit globals

A clear dependency model eliminates these problems and enables unit testing, parallel development, and future extensions, such as JIT compilation, static analysis, or embedded REPLs.

2. High-Level Component Boundaries

The architecture follows a layered structure. From top to bottom: Runtime, Parser (with the AST definitions), Lexer, and Core.

Only downward dependencies are allowed. Each layer depends only on the layer(s) below it, never above.

3. Lexer: Input Tokenization Layer

Purpose:

  • Accepts a source buffer (std::string_view)

  • Produces a linear sequence of tokens (TokenStream)

  • Does not depend on AST, parser, or runtime

Dependencies:

  • core/SourceManager.hpp for span tracking

  • core/ErrorReporter.hpp for structured errors

  • Standard library only

Output:

  • Token structure with type, lexeme, source location

  • TokenStream: typically std::vector<Token> or an iterator view

Modern C++ Tools:

  • std::string_view, std::optional, std::variant for token values

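  • std::source_location (C++20) for internal diagnostics; it records positions in the interpreter's own C++ code, so positions in the user's program still come from SourceManager spans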

  • constexpr lexers for testing and compile-time evaluation
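The lexer contract described above can be sketched as follows. This is a minimal illustration, not the project's actual lexer: the TokenKind values, the Token layout, and the tokenize function are assumptions made for the example, and a plain byte offset stands in for the span tracking that core/SourceManager.hpp would provide.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical token kinds for a tiny arithmetic language.
enum class TokenKind { Number, Identifier, Plus, Minus, Star, Slash, Unknown, EndOfFile };

// A token records its kind, its exact source text, and where it starts.
struct Token {
    TokenKind kind;
    std::string_view lexeme;  // view into the source buffer, not a copy
    std::size_t offset;       // byte offset of the first character
};

// Classifies a single-character operator; anything else is Unknown.
TokenKind classify_operator(char c) {
    switch (c) {
        case '+': return TokenKind::Plus;
        case '-': return TokenKind::Minus;
        case '*': return TokenKind::Star;
        case '/': return TokenKind::Slash;
        default:  return TokenKind::Unknown;
    }
}

// Turns a source buffer into a flat token sequence ending in EndOfFile.
// Lexemes point into `source`, so the buffer must outlive the tokens.
std::vector<Token> tokenize(std::string_view source) {
    const auto is_digit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
    const auto is_alpha = [](char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_'; };
    const auto is_space = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };

    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < source.size()) {
        if (is_space(source[i])) { ++i; continue; }
        const std::size_t start = i;
        TokenKind kind;
        if (is_digit(source[i])) {
            while (i < source.size() && is_digit(source[i])) ++i;
            kind = TokenKind::Number;
        } else if (is_alpha(source[i])) {
            while (i < source.size() && (is_alpha(source[i]) || is_digit(source[i]))) ++i;
            kind = TokenKind::Identifier;
        } else {
            kind = classify_operator(source[i]);
            ++i;
        }
        assert(i > start);  // every iteration must consume input
        tokens.push_back({kind, source.substr(start, i - start), start});
    }
    tokens.push_back({TokenKind::EndOfFile, source.substr(source.size()), source.size()});
    return tokens;
}
```

Nothing in this sketch mentions AST or runtime types, which is what keeps the lexer testable from raw strings alone.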

4. Parser: Syntax Construction Layer

Purpose:

  • Receives a stream of tokens from the lexer

  • Produces an Abstract Syntax Tree (AST)

  • Does not call or rely on runtime evaluation

Dependencies:

  • lexer/Token.hpp

  • core/SourceManager.hpp

  • core/ErrorReporter.hpp

  • Internal AST definitions (ast/Expr.hpp, ast/Stmt.hpp)

Output:

  • AST nodes represented using std::variant or algebraic types

  • For example:
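One possible shape for variant-based AST nodes is sketched here. The node names (NumberLit, Identifier, Binary) and the make_binary helper are assumptions for illustration; the actual definitions in ast/Expr.hpp may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <variant>

struct Binary;  // defined below; forward-declared so Expr can refer to it

struct NumberLit { double literal; };     // constant as written in the source
struct Identifier { std::string name; };

// An expression is exactly one of these alternatives. The recursive
// alternative is held through unique_ptr so the variant has a fixed size.
using Expr = std::variant<NumberLit, Identifier, std::unique_ptr<Binary>>;

struct Binary {
    char op;   // '+', '-', '*' or '/'
    Expr lhs;
    Expr rhs;
};

// Convenience helper for building binary nodes.
Expr make_binary(char op, Expr lhs, Expr rhs) {
    assert(op == '+' || op == '-' || op == '*' || op == '/');
    return std::make_unique<Binary>(Binary{op, std::move(lhs), std::move(rhs)});
}
```

Because the set of alternatives is closed, every std::visit over Expr must handle each node kind, and the compiler rejects a visitor that misses one.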

Design Notes:

  • Use recursive descent with backtracking or lookahead

  • Report syntax errors gracefully via std::expected or error accumulation
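The design notes above can be illustrated with a small recursive descent parser that uses one token of lookahead and accumulates errors instead of stopping at the first one. Everything here is a sketch under stated assumptions: the two-rule grammar, the Token and Expr stand-ins, and the Parser class are hypothetical. std::expected (C++23) is an alternative way to return errors; this sketch uses accumulation so it also compiles as C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Stand-ins for lexer/Token.hpp and ast/Expr.hpp (hypothetical shapes).
enum class TokenKind { Number, Plus, Star, EndOfFile };
struct Token { TokenKind kind; double number = 0.0; };

struct Binary;
struct NumberLit { double literal; };
using Expr = std::variant<NumberLit, std::unique_ptr<Binary>>;
struct Binary { char op; Expr lhs; Expr rhs; };

// Grammar:  expr := term ('+' term)*      term := NUMBER ('*' NUMBER)*
class Parser {
public:
    explicit Parser(std::vector<Token> tokens) : tokens_(std::move(tokens)) {
        // Guarantee a terminating token so peek() is always in range.
        if (tokens_.empty() || tokens_.back().kind != TokenKind::EndOfFile) {
            tokens_.push_back({TokenKind::EndOfFile});
        }
    }

    // Parses one expression and reports leftover input as an error.
    Expr parse() {
        Expr result = parse_expr();
        if (peek().kind != TokenKind::EndOfFile) error("unexpected trailing token");
        return result;
    }

    const std::vector<std::string>& errors() const { return errors_; }

private:
    const Token& peek() const {
        assert(pos_ < tokens_.size());
        return tokens_[pos_];
    }
    void advance() {
        if (peek().kind != TokenKind::EndOfFile) ++pos_;  // never step past the end
    }
    bool match(TokenKind kind) {  // one token of lookahead
        if (peek().kind != kind) return false;
        advance();
        return true;
    }
    void error(const std::string& message) {
        errors_.push_back(message + " at token " + std::to_string(pos_));
    }

    Expr parse_expr() {
        Expr lhs = parse_term();
        while (match(TokenKind::Plus)) {
            Expr rhs = parse_term();
            lhs = binary('+', std::move(lhs), std::move(rhs));
        }
        return lhs;
    }
    Expr parse_term() {
        Expr lhs = parse_number();
        while (match(TokenKind::Star)) {
            Expr rhs = parse_number();
            lhs = binary('*', std::move(lhs), std::move(rhs));
        }
        return lhs;
    }
    Expr parse_number() {
        if (peek().kind == TokenKind::Number) {
            const double literal = peek().number;
            advance();
            return NumberLit{literal};
        }
        error("expected a number");
        return NumberLit{0.0};  // placeholder node so parsing can continue
    }
    static Expr binary(char op, Expr lhs, Expr rhs) {
        return std::make_unique<Binary>(Binary{op, std::move(lhs), std::move(rhs)});
    }

    std::vector<Token> tokens_;
    std::size_t pos_ = 0;
    std::vector<std::string> errors_;
};
```

The parser only includes token and AST definitions. It never evaluates anything, so a caller decides what to do with the tree and the collected errors.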

Modern C++ Tools:

  • std::visit for variant traversal, with std::monostate as an explicit "empty" alternative

  • Concepts and constraints for AST transformations
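As a sketch of variant traversal, the following renders an expression tree back to text with std::visit. The node types and the render function are hypothetical. std::monostate is placed first in the variant so that a default-constructed Expr is a well-defined empty node.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <variant>

struct Binary;
struct NumberLit { int literal; };
struct Identifier { std::string name; };
using Expr = std::variant<std::monostate, NumberLit, Identifier, std::unique_ptr<Binary>>;
struct Binary { char op; Expr lhs; Expr rhs; };

std::string render(const Expr& expr);

// One overload per alternative; std::visit selects the matching one.
struct Printer {
    std::string operator()(std::monostate) const { return "<empty>"; }
    std::string operator()(const NumberLit& n) const { return std::to_string(n.literal); }
    std::string operator()(const Identifier& id) const { return id.name; }
    std::string operator()(const std::unique_ptr<Binary>& b) const {
        assert(b != nullptr);
        return "(" + render(b->lhs) + " " + b->op + " " + render(b->rhs) + ")";
    }
};

// Renders an expression as fully parenthesized text.
std::string render(const Expr& expr) { return std::visit(Printer{}, expr); }
```

A formatter like this consumes the AST without touching any runtime type, which is the kind of reuse the layering is meant to allow.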

5. Runtime: Execution Layer

Purpose:

  • Receives a fully parsed and semantically valid AST

  • Executes the program in an environment model

  • Provides built-in functions, memory, control flow, stack

Dependencies:

  • AST definitions

  • Symbol table or semantic results

  • core/Environment.hpp, core/Value.hpp, and runtime data structures

Key Responsibilities:

  • Interprets AST using a visitor or evaluation engine

  • Manages a scoped environment (stack frames, variables, closures)

  • Provides built-in I/O and standard library functions
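A minimal sketch of these responsibilities is shown below: a visitor-based evaluator over a variant AST, with a scoped environment that falls back to its parent scope. The Environment class, the Evaluator visitor, and the choice to represent failures as a nil value are assumptions for illustration only.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

// AST side (would come from ast/Expr.hpp; shapes here are hypothetical).
struct Binary;
struct NumberLit { double literal; };
struct Identifier { std::string name; };
using Expr = std::variant<NumberLit, Identifier, std::unique_ptr<Binary>>;
struct Binary { char op; Expr lhs; Expr rhs; };

// Runtime side: monostate plays the role of nil.
using Value = std::variant<std::monostate, double>;

// A scope maps names to values and falls back to its enclosing scope.
class Environment {
public:
    explicit Environment(const Environment* parent = nullptr) : parent_(parent) {}
    void define(std::string name, Value value) { vars_[std::move(name)] = std::move(value); }
    std::optional<Value> lookup(const std::string& name) const {
        if (auto it = vars_.find(name); it != vars_.end()) return it->second;
        if (parent_ != nullptr) return parent_->lookup(name);
        return std::nullopt;
    }
private:
    const Environment* parent_;
    std::unordered_map<std::string, Value> vars_;
};

Value eval(const Expr& expr, const Environment& env);

// Values are created here, during evaluation, and never stored in the AST.
struct Evaluator {
    const Environment& env;
    Value operator()(const NumberLit& n) const { return n.literal; }
    Value operator()(const Identifier& id) const {
        return env.lookup(id.name).value_or(Value{});  // unknown names evaluate to nil
    }
    Value operator()(const std::unique_ptr<Binary>& b) const {
        assert(b != nullptr);
        const Value l = eval(b->lhs, env);
        const Value r = eval(b->rhs, env);
        const double* x = std::get_if<double>(&l);
        const double* y = std::get_if<double>(&r);
        if (x == nullptr || y == nullptr) return Value{};  // nil propagates
        switch (b->op) {
            case '+': return *x + *y;
            case '-': return *x - *y;
            case '*': return *x * *y;
            case '/': return *x / *y;
            default:  return Value{};
        }
    }
};

Value eval(const Expr& expr, const Environment& env) {
    return std::visit(Evaluator{env}, expr);
}
```

The evaluator takes the AST by const reference, so the same tree can be evaluated repeatedly in different environments, as a REPL would do.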

Output:

  • Result of program execution (Value)

  • Can be used for REPL or scripting interface

Modern C++ Tools:

  • std::variant to represent runtime values

  • std::function or lambdas for native function calls

  • std::jthread, std::future, or coroutine-based async features for concurrency (optional)
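The first two tools in this list can be sketched together: a std::variant for runtime values and a std::function signature for native calls. The chosen alternatives (nil, number, boolean, string) and the names type_name and native_len are assumptions, not the project's actual definitions.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical runtime value: nil, number, boolean, or string.
using Value = std::variant<std::monostate, double, bool, std::string>;

// Signature for functions implemented in C++ and callable from scripts.
using NativeFn = std::function<Value(const std::vector<Value>&)>;

// Names the dynamic type of a value, e.g. for runtime error messages.
std::string type_name(const Value& value) {
    struct Namer {
        std::string operator()(std::monostate) const { return "nil"; }
        std::string operator()(double) const { return "number"; }
        std::string operator()(bool) const { return "bool"; }
        std::string operator()(const std::string&) const { return "string"; }
    };
    return std::visit(Namer{}, value);
}

// Example native function: length of a string argument, nil on misuse.
Value native_len(const std::vector<Value>& args) {
    if (args.size() == 1) {
        if (const auto* text = std::get_if<std::string>(&args[0])) {
            return static_cast<double>(text->size());
        }
    }
    return std::monostate{};
}
```

When a variant holds both bool and std::string, it is safer to wrap string literals in std::string explicitly. With some standard library versions, a bare string literal can select the bool alternative.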

6. Example of Dependency Direction (CMake and Code)

Let’s say we define libraries like this in CMakeLists.txt:
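A sketch of such a setup, with hypothetical target and file names, could look like the following. Only the link direction is the point here. The lexer links against core alone, matching its dependency list in section 3, and no target links against a layer above it.

```cmake
# Hypothetical target and file names; only the link direction matters.
add_library(core    STATIC core/SourceManager.cpp core/ErrorReporter.cpp)
add_library(lexer   STATIC lexer/Lexer.cpp)
add_library(parser  STATIC parser/Parser.cpp)
add_library(runtime STATIC runtime/Interpreter.cpp)

# Each target links only against layers below it.
target_link_libraries(lexer   PUBLIC core)
target_link_libraries(parser  PUBLIC lexer core)
target_link_libraries(runtime PUBLIC parser core)
```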

This ensures that:

  • Lexer depends only on Core and is independent of Parser and Runtime

  • Parser depends only on Lexer and Core

  • Runtime depends only on Parser and Core

  • No module introduces upward or cyclic dependencies

7. AST and Value Boundary

A key architectural boundary lies between:

  • Parser output (AST)

  • Runtime input (Value system)

The AST must never carry runtime values. This enforces purity and makes the AST reusable for:

  • Static analysis

  • Code formatting

  • Type checking

  • Compilation or transpilation

All runtime data is generated and stored after parsing, within the evaluation engine.
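One way to turn this rule into a compile-time check is a small trait that tests whether a type appears among a variant's alternatives. This is a sketch under assumptions: Value, Expr, and holds_type are hypothetical, and the check only inspects the variant's top-level alternatives, not the members of each node. In a real project the stronger guarantee is that AST headers never include the Value header at all, so a check like this would live in a test that sees both.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical runtime value type (would live in the runtime module).
struct Value { std::variant<std::monostate, double, std::string> data; };

// Hypothetical AST alternatives (would live in the ast module).
struct NumberLit { double literal; };
struct StringLit { std::string text; };
using Expr = std::variant<NumberLit, StringLit>;

// True when T is one of the alternatives of the given variant type.
template <typename T, typename Variant>
struct holds_type;

template <typename T, typename... Alts>
struct holds_type<T, std::variant<Alts...>>
    : std::bool_constant<(std::is_same_v<T, Alts> || ...)> {};

// The build fails if someone adds Value as an AST alternative.
static_assert(!holds_type<Value, Expr>::value, "AST nodes must not store runtime values");
static_assert(holds_type<NumberLit, Expr>::value, "sanity check for the trait itself");
```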

8. Semantic Analysis as an Optional Intermediate Layer

To maintain decoupling between syntax and execution, a semantic layer can act as an intermediate verifier:

  • Ensures type correctness

  • Infers return types

  • Validates function signatures

  • Annotates AST nodes with resolved types or symbol bindings

This layer produces either:

  • A typed AST (decorated with type info)

  • Or a symbol table used by the runtime

It can optionally cache or transform parts of the AST for optimization.
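A minimal sketch of such a layer is a type checker that walks the AST, consults a symbol table, and accumulates errors without evaluating anything. The Type enum, the SymbolTable alias, and the TypeChecker class are assumptions for illustration; a fuller version would also record the inferred type for each node.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <variant>
#include <vector>

struct Binary;
struct NumberLit { double literal; };
struct BoolLit { bool literal; };
struct Identifier { std::string name; };
using Expr = std::variant<NumberLit, BoolLit, Identifier, std::unique_ptr<Binary>>;
struct Binary { char op; Expr lhs; Expr rhs; };

enum class Type { Number, Bool, Error };

// Symbol table: declared name -> static type.
using SymbolTable = std::unordered_map<std::string, Type>;

class TypeChecker {
public:
    // The table must outlive the checker; only a reference is stored.
    explicit TypeChecker(const SymbolTable& symbols) : symbols_(symbols) {}

    // Infers the static type of an expression, recording any errors found.
    Type check(const Expr& expr) { return std::visit(*this, expr); }

    Type operator()(const NumberLit&) { return Type::Number; }
    Type operator()(const BoolLit&) { return Type::Bool; }
    Type operator()(const Identifier& id) {
        if (auto it = symbols_.find(id.name); it != symbols_.end()) return it->second;
        errors_.push_back("undefined name: " + id.name);
        return Type::Error;
    }
    Type operator()(const std::unique_ptr<Binary>& b) {
        assert(b != nullptr);
        const Type l = check(b->lhs);
        const Type r = check(b->rhs);
        if (l == Type::Error || r == Type::Error) return Type::Error;  // already reported
        if (l != Type::Number || r != Type::Number) {
            errors_.push_back(std::string("operator '") + b->op + "' needs numeric operands");
            return Type::Error;
        }
        return Type::Number;
    }

    const std::vector<std::string>& errors() const { return errors_; }

private:
    const SymbolTable& symbols_;
    std::vector<std::string> errors_;
};
```

The checker depends on the AST and a symbol table only, so it can run before the runtime exists and reject programs that would otherwise fail during evaluation.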

9. Runtime Extensions and Isolation

To maintain a clean runtime interface:

  • Built-in functions (e.g., print, input, len) are registered explicitly

  • External native libraries can be loaded dynamically or statically linked

  • Runtime APIs are defined through interfaces and Value conversions

In C++20/23, this is aided by:

  • std::function<Value(const std::vector<Value>&)> for native function wrappers

  • std::span for passing slices safely

  • Optional use of modules to isolate standard library extensions
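Explicit registration can be sketched with a small registry keyed by name, using the native function signature from the list above. BuiltinRegistry, register_core_builtins, and the len and concat builtins shown here are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

using Value = std::variant<std::monostate, double, std::string>;
using NativeFn = std::function<Value(const std::vector<Value>&)>;

class BuiltinRegistry {
public:
    // Returns false if the name is already taken, so accidental overrides are visible.
    bool add(std::string name, NativeFn fn) {
        assert(fn != nullptr);  // an empty std::function would throw when called
        return table_.emplace(std::move(name), std::move(fn)).second;
    }
    // Looks up and invokes a builtin; nullopt means "no such function".
    std::optional<Value> call(const std::string& name, const std::vector<Value>& args) const {
        const auto it = table_.find(name);
        if (it == table_.end()) return std::nullopt;
        return it->second(args);
    }
private:
    std::unordered_map<std::string, NativeFn> table_;
};

// Registers a small, hypothetical set of builtins in one visible place.
void register_core_builtins(BuiltinRegistry& registry) {
    registry.add("len", [](const std::vector<Value>& args) -> Value {
        if (args.size() == 1) {
            if (const auto* text = std::get_if<std::string>(&args[0])) {
                return static_cast<double>(text->size());
            }
        }
        return std::monostate{};  // nil on misuse
    });
    registry.add("concat", [](const std::vector<Value>& args) -> Value {
        std::string out;
        for (const Value& arg : args) {
            const auto* text = std::get_if<std::string>(&arg);
            if (text == nullptr) return std::monostate{};
            out += *text;
        }
        return out;
    });
}
```

Because the registry is an ordinary object rather than a global, a test can build one containing only the builtins it needs.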

10. Testing Each Component in Isolation

Modular dependency design enables:

  • Lexer unit tests using raw source strings

  • Parser tests using token streams

  • AST tests using manually constructed nodes

  • Runtime tests using evaluation contexts and mock environments

Example: Test just the parser:
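A sketch of such a test follows. The tokens are written by hand, so no lexer code runs, and the assertions inspect only the shape of the tree, so no runtime code runs either. The parse_difference function is a deliberately tiny stand-in for the parser under test, and all type names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <variant>
#include <vector>

// Stand-ins for lexer/Token.hpp and ast/Expr.hpp (hypothetical shapes).
enum class TokenKind { Number, Minus, EndOfFile };
struct Token { TokenKind kind; double number = 0.0; };

struct Binary;
struct NumberLit { double literal; };
using Expr = std::variant<NumberLit, std::unique_ptr<Binary>>;
struct Binary { char op; Expr lhs; Expr rhs; };

// Stand-in parser under test: NUMBER ('-' NUMBER)*, left-associative.
// Assumes well-formed input ending in EndOfFile, to keep the sketch short.
Expr parse_difference(const std::vector<Token>& tokens) {
    std::size_t pos = 0;
    Expr lhs = NumberLit{tokens.at(pos++).number};
    while (tokens.at(pos).kind == TokenKind::Minus) {
        ++pos;
        Expr rhs = NumberLit{tokens.at(pos++).number};
        lhs = std::make_unique<Binary>(Binary{'-', std::move(lhs), std::move(rhs)});
    }
    return lhs;
}

// Parser-only test: checks that 8 - 3 - 2 parses as ((8 - 3) - 2).
void test_subtraction_is_left_associative() {
    const std::vector<Token> tokens = {
        {TokenKind::Number, 8.0}, {TokenKind::Minus},
        {TokenKind::Number, 3.0}, {TokenKind::Minus},
        {TokenKind::Number, 2.0}, {TokenKind::EndOfFile}};

    const Expr ast = parse_difference(tokens);

    const auto& outer = std::get<std::unique_ptr<Binary>>(ast);
    assert(outer->op == '-');
    assert(std::get<NumberLit>(outer->rhs).literal == 2.0);

    const auto& inner = std::get<std::unique_ptr<Binary>>(outer->lhs);
    assert(std::get<NumberLit>(inner->lhs).literal == 8.0);
    assert(std::get<NumberLit>(inner->rhs).literal == 3.0);
}
```

In a real test suite this function would be registered with whatever test framework the project uses; plain assert is used here only to keep the sketch free of external dependencies.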

Conclusion

Managing dependencies between the lexer, parser, and runtime is a foundational engineering principle when building a modern interpreter. With modern C++20/23, developers can enforce type-safe boundaries, minimize coupling, and structure their interpreter as a collection of focused, testable units.

By keeping components strictly layered and avoiding runtime dependencies in syntax and analysis phases, we enable a clean, composable, and scalable language infrastructure that supports both growth and correctness.
