Article by Ayman Alheraki on January 11 2026 10:37 AM
In the design and implementation of a new programming language, managing the source files written in that language is as important as implementing its components. These source files—used for testing, demonstration, validation, and bootstrapping—should follow a clear structure and naming convention that reflects the language’s evolution, helps automate test suites, and supports maintainability as the language expands.
This section explains how to organize, name, and integrate target language files, using extensions such as .lang for programs and .test for validation scripts, into the interpreter infrastructure. The approach follows modern software practices from the past five years and complements an interpreter architecture built on C++20/23.
When building a new language, you will be writing dozens to hundreds of test programs, examples, and behavioral specifications in your target language. These files serve multiple purposes:
Unit testing the interpreter or compiler
Behavioral documentation for language features
Regression tests to prevent breaking existing features
End-user examples for tutorials and learning
Interactive REPL experiments or integration scripts
A disciplined file structure with dedicated extensions allows these files to be managed, loaded, interpreted, and tested automatically.
To differentiate target-language files from C++ implementation files, use dedicated extensions:
.lang — for source files written in the new language
Used for examples, programs, benchmarks, REPL inputs
.test — for test scripts containing expected results
Paired with .lang files for output-based assertions
.fail — for negative tests that must produce compile-time or runtime errors
Used to verify robustness and error diagnostics
This naming convention makes it possible to auto-discover and categorize files by purpose.
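As a minimal sketch of how extension-based categorization might look in C++17, the function below maps a path to one of three categories. The enum FileKind and the function classify are hypothetical names introduced for illustration; they are not part of any existing tool.

```cpp
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical categories mirroring the three extensions described above.
enum class FileKind { Program, OutputTest, NegativeTest };

// Map a path to its category by extension. Anything else (C++ sources,
// build artifacts, notes) yields std::nullopt so callers can skip it.
std::optional<FileKind> classify(const fs::path& file) {
    const std::string ext = file.extension().string();
    if (ext == ".lang") return FileKind::Program;
    if (ext == ".test") return FileKind::OutputTest;
    if (ext == ".fail") return FileKind::NegativeTest;
    return std::nullopt;
}
```

Returning std::optional rather than adding an "unknown" enumerator keeps unrelated files out of every switch over FileKind; that is a design preference, not a requirement of the convention.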
The following is a clean and scalable directory structure:
    /ForgeLang
    │
    ├── /examples          → Simple programs for demonstration
    │   ├── hello_world.lang
    │   ├── factorial.lang
    │   └── file_io.lang
    │
    ├── /tests             → Formal test cases
    │   ├── /passing
    │   │   ├── basic_arithmetic.test
    │   │   ├── recursion.test
    │   │   └── variables.test
    │   ├── /failing
    │   │   ├── missing_semicolon.fail
    │   │   └── type_mismatch.fail
    │   └── /edge_cases
    │       ├── shadowing.test
    │       └── large_loop.test
    │
    ├── /benchmarks        → Performance and stress examples
    │   ├── compute_pi.lang
    │   └── fib_iterative.lang

Each test or example can be read independently by tools or manually executed using the interpreter executable.
A .lang file is a complete or partial program written in your custom C-style language. During early development, programs will likely demonstrate:
Arithmetic and logical expressions
Function definitions and calls
Variable binding and scoping
Conditionals and loops
Example — factorial.lang:
    fn factorial(n: int) -> int {
        if n <= 1 {
            return 1;
        }
        return n * factorial(n - 1);
    }

    fn main() -> int {
        let value: int = 5;
        print("Result: ", factorial(value));
        return 0;
    }

Conventions:
Use consistent indentation and spacing
Use comments (//) for expected behavior or notes
Keep filenames and function names descriptive and lowercase with underscores
A .test file serves both as a program and an assertion document. It includes embedded expectations. A simple format could be:
    // INPUT
    fn main() -> int {
        print("Hello, world!");
        return 0;
    }

    // EXPECT
    Hello, world!

A test harness in C++ parses this file, extracts the input program and the expected output, runs the interpreter, and compares actual output with expectations.
This format allows test automation without a separate database or metadata schema.
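The parsing step of such a harness could be sketched as follows. This assumes the markers // INPUT and // EXPECT each occupy a whole line, as in the example above; the struct TestCase and the function parse_test are hypothetical names chosen for illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical in-memory form of one .test file.
struct TestCase {
    std::string input;     // program text found under "// INPUT"
    std::string expected;  // expected output found under "// EXPECT"
};

// Split the file content into its two sections, line by line.
// Lines that appear before the first marker are ignored.
TestCase parse_test(const std::string& content) {
    TestCase tc;
    std::string* current = nullptr;  // section currently being filled
    std::istringstream in(content);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line == "// INPUT")  { current = &tc.input;    continue; }
        if (line == "// EXPECT") { current = &tc.expected; continue; }
        if (current) {
            *current += line;
            *current += '\n';
        }
    }
    return tc;
}
```

Each section keeps a trailing newline per line, so a harness built on this sketch would probably want to normalize trailing whitespace on both sides before comparing expected and actual output.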
Features to support:
// EXPECT: for single-line outputs
// ERROR: for failure cases
Optional // EXIT: for expected return code
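One way the three inline directives might be collected is sketched below. It assumes each directive starts at the beginning of a line and that // EXIT: carries a decimal integer; Expectations, after_prefix, and collect_expectations are hypothetical names, not an established format.

```cpp
#include <charconv>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the inline directives found in one file.
struct Expectations {
    std::vector<std::string> output_lines;  // one entry per "// EXPECT: ..."
    std::vector<std::string> errors;        // one entry per "// ERROR: ..."
    std::optional<int> exit_code;           // from "// EXIT: n", if present
};

// If `line` starts with `prefix`, return the rest with leading blanks removed.
std::optional<std::string> after_prefix(const std::string& line,
                                        const std::string& prefix) {
    if (line.compare(0, prefix.size(), prefix) != 0) return std::nullopt;
    const auto first = line.find_first_not_of(" \t", prefix.size());
    return first == std::string::npos ? std::string{} : line.substr(first);
}

// Scan the source once and record every directive line.
Expectations collect_expectations(const std::string& source) {
    Expectations found;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        if (auto out = after_prefix(line, "// EXPECT:")) {
            found.output_lines.push_back(*out);
        } else if (auto err = after_prefix(line, "// ERROR:")) {
            found.errors.push_back(*err);
        } else if (auto code = after_prefix(line, "// EXIT:")) {
            int value = 0;
            const char* begin = code->data();
            const char* end = begin + code->size();
            // Only accept the directive if it begins with a valid integer.
            if (std::from_chars(begin, end, value).ec == std::errc{}) {
                found.exit_code = value;
            }
        }
    }
    return found;
}
```

Because the directives are ordinary comments, the same file can still be passed to the interpreter unchanged, provided the language treats // as a line comment.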
Negative tests ensure that invalid programs are rejected. These are useful for validating:
Type system enforcement
Syntax checking
Name resolution
Semantic constraints
Example — missing_semicolon.fail:
    fn main() -> int {
        let x: int = 10
        return x;
    }

    // ERROR
    Expected ';' after variable declaration

Interpreter test runners must capture the diagnostic and match it with the expected // ERROR section.
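A possible way to split a .fail file and check a captured diagnostic is sketched below. It assumes the expected message follows a // ERROR marker and that a substring match is sufficient, since real diagnostics usually add a file name and position. NegativeCase, parse_fail, and diagnostic_matches are hypothetical names.

```cpp
#include <string>
#include <string_view>

// Hypothetical in-memory form of one .fail file.
struct NegativeCase {
    std::string program;         // everything before the "// ERROR" marker
    std::string expected_error;  // trimmed text after the marker
};

// Strip leading and trailing whitespace, including newlines.
std::string_view trim(std::string_view s) {
    const auto first = s.find_first_not_of(" \t\r\n");
    if (first == std::string_view::npos) return {};
    const auto last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// Split at the first occurrence of the marker. This simple search would also
// match the marker inside a string literal, which a stricter parser should avoid.
NegativeCase parse_fail(const std::string& content) {
    const std::string marker = "// ERROR";
    NegativeCase nc;
    const auto pos = content.find(marker);
    if (pos == std::string::npos) {
        nc.program = content;
        return nc;
    }
    nc.program = content.substr(0, pos);
    nc.expected_error =
        std::string(trim(std::string_view(content).substr(pos + marker.size())));
    return nc;
}

// True when the captured diagnostic contains the expected message.
// An empty expectation never matches, so a malformed test cannot pass silently.
bool diagnostic_matches(std::string_view actual, std::string_view expected) {
    expected = trim(expected);
    return !expected.empty() && actual.find(expected) != std::string_view::npos;
}
```

Substring matching keeps the tests stable when the diagnostic's prefix (path, line, column) changes; an exact comparison would be stricter but more brittle.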
To support test file automation in C++:
Create a small test runner tool (forge-test) in C++ that:
Scans the test directories for .test and .fail files
Parses sections into input/output pairs
Executes the interpreter
Captures stdout/stderr and return codes
Compares against expectations
Use std::filesystem (standard since C++17) to traverse directories
Use std::ifstream, std::ostringstream for file content management
Use std::regex or simple line parsing for matching // markers
This keeps testing infrastructure entirely in modern C++ without requiring Python or other scripting languages.
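The discovery and file-loading parts of such a runner might look like the sketch below, which uses only std::filesystem, std::ifstream, and std::ostringstream. The function names read_file and discover_tests are hypothetical, and launching the interpreter and capturing its output is deliberately left out because that step is platform-specific.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Read a whole file into a string. Returns an empty string if the file
// cannot be opened; a real runner would likely report that as an error.
std::string read_file(const fs::path& file) {
    std::ifstream in(file, std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Collect every .test and .fail file below `root`, in sorted order so that
// runs are reproducible regardless of directory iteration order.
std::vector<fs::path> discover_tests(const fs::path& root) {
    std::vector<fs::path> found;
    if (!fs::exists(root)) return found;
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
        if (!entry.is_regular_file()) continue;
        const std::string ext = entry.path().extension().string();
        if (ext == ".test" || ext == ".fail") found.push_back(entry.path());
    }
    std::sort(found.begin(), found.end());
    return found;
}
```

A runner would then loop over the result of discover_tests, load each file with read_file, split it into program and expectations, and compare those against whatever the interpreter produced.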
Well-organized target language files provide:
Documentation: Each file acts as a living specification of a language feature
Validation: Ensures language changes do not regress existing behavior
Automation: Enables integration with CI/CD and test runners
Onboarding: New developers and contributors can explore language behavior easily
Moreover, the separation of .lang, .test, and .fail files enables modular testing and filtering.
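As an illustration of such filtering, the sketch below selects tests by suite and extension. It assumes the suite name is simply the parent directory (passing, failing, edge_cases) from the layout shown earlier; suite_of and select_tests are hypothetical helper names.

```cpp
#include <algorithm>
#include <filesystem>
#include <iterator>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Name of the directory that directly contains the file, e.g. "passing".
std::string suite_of(const fs::path& file) {
    return file.parent_path().filename().string();
}

// Keep only the files that belong to `suite` and carry `extension`.
std::vector<fs::path> select_tests(const std::vector<fs::path>& files,
                                   const std::string& suite,
                                   const std::string& extension) {
    std::vector<fs::path> selected;
    std::copy_if(files.begin(), files.end(), std::back_inserter(selected),
                 [&](const fs::path& p) {
                     return suite_of(p) == suite &&
                            p.extension().string() == extension;
                 });
    return selected;
}
```

With helpers like these, a command-line option on the runner could restrict a run to, for example, only the edge_cases suite during local development.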
As the language matures, the file system can expand to support:
.mod.lang — for module or package system definitions
.repl.test — for interactive session scripts
.compile.test — for checking bytecode or intermediate output
.doc.lang — for showcasing documentation-driven development
Eventually, a package manager or module loader can use this same file structure for official standard library distribution.
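One detail worth knowing before adopting compound extensions such as .repl.test: std::filesystem::path::extension() returns only the last component (.test). The sketch below shows one way to recover a two-part extension; full_extension is a hypothetical helper, and it deliberately looks at no more than two components.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Return a two-part extension such as ".repl.test" when present,
// otherwise the plain extension such as ".lang".
std::string full_extension(const fs::path& file) {
    const fs::path name = file.filename();
    const std::string last = name.extension().string();          // e.g. ".test"
    const std::string inner = name.stem().extension().string();  // e.g. ".repl" or ""
    return inner + last;
}
```

A side effect of this helper is that a descriptive file name containing a dot, such as v1.basic.test, would be read as having the extension .basic.test, so a project adopting compound extensions would probably want to avoid dots elsewhere in file names.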
Organizing target language files with dedicated extensions and a logical directory layout ensures that your interpreter can scale in both features and testing capabilities. Using .lang, .test, and .fail allows automated tooling, behavioral validation, and integration with the interpreter’s evolution.
This strategy reflects modern C++ development practices where infrastructure, source content, and testing are first-class citizens, maintained in parallel with implementation.