#10 Foundation and Architecture Language Implementation Project Structure

Article by Ayman Alheraki on January 11 2026 10:37 AM

#10 Foundation and Architecture: Language Implementation Project Structure -> Organizing Target Language Files (.lang, .test)

In the design and implementation of a new programming language, managing the source files written in that language is as important as implementing its components. These source files—used for testing, demonstration, validation, and bootstrapping—should follow a clear structure and naming convention that reflects the language’s evolution, helps automate test suites, and supports maintainability as the language expands.

This section explains how to organize, name, and integrate target language files into the interpreter infrastructure, using extensions such as .lang for programs, .test for validation scripts, and .fail for negative tests. The approach follows common practice in modern software projects and complements an interpreter written in C++20/23.

1. Purpose of Organizing Language Files

When building a new language, you will be writing dozens to hundreds of test programs, examples, and behavioral specifications in your target language. These files serve multiple purposes:

Unit testing the interpreter or compiler
Behavioral documentation for language features
Regression tests to prevent breaking existing features
End-user examples for tutorials and learning
Interactive REPL experiments or integration scripts

A disciplined file structure with dedicated extensions allows these files to be managed, loaded, interpreted, and tested automatically.

2. Suggested File Extensions

To differentiate target-language files from C++ implementation files:

.lang — for source files written in the new language
- Used for examples, programs, benchmarks, REPL inputs
.test — for test scripts containing expected results
- Paired with .lang files for output-based assertions
.fail — for negative tests that must produce compile-time or runtime errors
- Used to verify robustness and error diagnostics

This naming convention makes it possible to auto-discover and categorize files by purpose.
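
As a minimal sketch of how such auto-discovery might categorize a file, the following C++17 function maps a path to a category by its final extension. The FileKind enum and the classify function are hypothetical names introduced here for illustration; they are not part of any existing tooling described in this article.

```cpp
#include <filesystem>
#include <string>

// Hypothetical categories, one per extension from the list above.
enum class FileKind { Program, Test, NegativeTest, Unknown };

// Map a path to its category by looking only at the final extension.
FileKind classify(const std::filesystem::path& p) {
    const std::string ext = p.extension().string();
    if (ext == ".lang") return FileKind::Program;
    if (ext == ".test") return FileKind::Test;
    if (ext == ".fail") return FileKind::NegativeTest;
    return FileKind::Unknown;  // e.g. C++ implementation files
}
```

For instance, classify("tests/failing/type_mismatch.fail") yields FileKind::NegativeTest. Because only the final extension is inspected, a compound name such as session.repl.test is still treated as a test.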

3. Directory Layout for Target Language Files

The following is a clean and scalable directory structure:


/ForgeLang
│
├── /examples               → Simple programs for demonstration
│   ├── hello_world.lang
│   ├── factorial.lang
│   └── file_io.lang
│
├── /tests                  → Formal test cases
│   ├── /passing
│   │   ├── basic_arithmetic.test
│   │   ├── recursion.test
│   │   └── variables.test
│   ├── /failing
│   │   ├── missing_semicolon.fail
│   │   └── type_mismatch.fail
│   └── /edge_cases
│       ├── shadowing.test
│       └── large_loop.test
│
└── /benchmarks             → Performance and stress examples
    ├── compute_pi.lang
    └── fib_iterative.lang

Each test or example can be read independently by tools or manually executed using the interpreter executable.
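
To illustrate how a tool could read this layout, here is a sketch that walks a root directory with std::filesystem, collects every .lang, .test, and .fail file, and loads a file's contents into memory. The function names discover_language_files and read_file are assumptions made for this example.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Collect every regular file under `root` whose extension is .lang, .test,
// or .fail. Results are sorted so that runs are reproducible.
std::vector<fs::path> discover_language_files(const fs::path& root) {
    std::vector<fs::path> found;
    if (!fs::is_directory(root)) return found;  // missing root: nothing to do
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
        if (!entry.is_regular_file()) continue;
        const std::string ext = entry.path().extension().string();
        if (ext == ".lang" || ext == ".test" || ext == ".fail") {
            found.push_back(entry.path());
        }
    }
    std::sort(found.begin(), found.end());
    return found;
}

// Read a whole file into a string; yields an empty string if the file
// cannot be opened.
std::string read_file(const fs::path& path) {
    std::ifstream in(path, std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

Calling discover_language_files("ForgeLang") on the tree above would return the example, test, and benchmark files in sorted path order, skipping anything with another extension.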

4. .lang File Design Conventions

A .lang file is a complete or partial program written in your custom C-style language. During early development, programs will likely demonstrate:

Arithmetic and logical expressions
Function definitions and calls
Variable binding and scoping
Conditionals and loops

Example — factorial.lang:


fn factorial(n: int) -> int {
    if n <= 1 {
        return 1;
    }
    return n * factorial(n - 1);
}

fn main() -> int {
    let value: int = 5;
    print("Result: ", factorial(value));
    return 0;
}

Conventions:

Use consistent indentation and spacing
Use comments (//) for expected behavior or notes
Keep filenames and function names descriptive and lowercase with underscores

5. .test File Format and Structure

A .test file serves both as a program and an assertion document. It includes embedded expectations. A simple format could be:


// INPUT
fn main() -> int {
    print("Hello, world!");
    return 0;
}

// EXPECT
Hello, world!

A test harness in C++ parses this file, extracts the input program and the expected output, runs the interpreter, and compares actual output with expectations.

This format allows test automation without a separate database or metadata schema.

Features to support:

// EXPECT: for single-line outputs
// ERROR: for failure cases
Optional // EXIT: for expected return code
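
The format above could be parsed with simple line scanning. The following is a sketch under stated assumptions: a marker must occupy its own line, either bare (// EXPECT) or with inline text after a colon (// EXPECT: 42), and any text before the first marker counts as program source. The TestCase struct and the function names are hypothetical.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical in-memory form of a parsed .test or .fail file.
struct TestCase {
    std::string input;             // program source
    std::string expect;            // expected standard output
    std::string error;             // expected diagnostic text
    std::optional<int> exit_code;  // set only by "// EXIT: N"
};

// If `line` is exactly `marker`, or starts with `marker` plus ':', return the
// text after the colon with leading spaces removed; otherwise nullopt.
std::optional<std::string> match_marker(const std::string& line,
                                        const std::string& marker) {
    if (line == marker) return std::string{};
    const std::string prefix = marker + ":";
    if (line.rfind(prefix, 0) != 0) return std::nullopt;
    std::string rest = line.substr(prefix.size());
    rest.erase(0, rest.find_first_not_of(' '));
    return rest;
}

// Drop blank lines at the end of a section, keeping one final newline.
void strip_trailing_blank_lines(std::string& s) {
    while (s.size() >= 2 && s.compare(s.size() - 2, 2, "\n\n") == 0) s.pop_back();
    if (s == "\n") s.clear();
}

TestCase parse_test_file(const std::string& text) {
    enum class Section { Input, Expect, Error };
    Section current = Section::Input;  // text before any marker is source
    TestCase tc;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (match_marker(line, "// INPUT")) { current = Section::Input; continue; }
        if (auto code = match_marker(line, "// EXIT")) {
            // std::stoi throws on a malformed number; a real runner should report it.
            if (!code->empty()) tc.exit_code = std::stoi(*code);
            continue;
        }
        if (auto text_after = match_marker(line, "// EXPECT")) {
            current = Section::Expect;
            if (!text_after->empty()) tc.expect += *text_after + "\n";
            continue;
        }
        if (auto text_after = match_marker(line, "// ERROR")) {
            current = Section::Error;
            if (!text_after->empty()) tc.error += *text_after + "\n";
            continue;
        }
        switch (current) {
            case Section::Input:  tc.input  += line + "\n"; break;
            case Section::Expect: tc.expect += line + "\n"; break;
            case Section::Error:  tc.error  += line + "\n"; break;
        }
    }
    strip_trailing_blank_lines(tc.input);
    strip_trailing_blank_lines(tc.expect);
    strip_trailing_blank_lines(tc.error);
    return tc;
}
```

Requiring an exact marker line keeps ordinary comments in the program (for example one beginning with // EXPECTED) from being mistaken for section markers.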

6. .fail File Format

Negative tests ensure that invalid programs are rejected. These are useful for validating:

Type system enforcement
Syntax checking
Name resolution
Semantic constraints

Example — missing_semicolon.fail:


fn main() -> int {
    let x: int = 10
    return x;
}

// ERROR
Expected ';' after variable declaration

Interpreter test runners must capture the diagnostic and match it against the expected // ERROR section.
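
One possible matching policy is sketched below. It assumes that the interpreter may prefix diagnostics with location details (a format such as file:line:column: error: is a guess, not something this article specifies), so it checks that the expected text appears somewhere in the captured output instead of demanding an exact match. All function names are hypothetical.

```cpp
#include <string_view>

// Remove leading and trailing whitespace, including newlines.
std::string_view trim(std::string_view s) {
    const auto first = s.find_first_not_of(" \t\r\n");
    if (first == std::string_view::npos) return {};
    const auto last = s.find_last_not_of(" \t\r\n");
    return s.substr(first, last - first + 1);
}

// True if the trimmed expected message occurs anywhere in the actual output.
bool diagnostic_matches(std::string_view actual, std::string_view expected) {
    const std::string_view wanted = trim(expected);
    if (wanted.empty()) return false;  // an empty expectation would match anything
    return actual.find(wanted) != std::string_view::npos;
}

// A negative test passes only if the interpreter failed AND reported the
// expected reason.
bool negative_test_passes(int exit_code, std::string_view stderr_text,
                          std::string_view expected_error) {
    return exit_code != 0 && diagnostic_matches(stderr_text, expected_error);
}
```

Substring matching tolerates changes to diagnostic prefixes while still failing the test when the wrong error is reported, or when the invalid program is accepted.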

7. Integration with the Interpreter

To support test file automation in C++:

Create a small test runner tool (forge-test) in C++ that:
- Scans the test directories for .test and .fail files
- Parses sections into input/output pairs
- Executes the interpreter
- Captures stdout/stderr and return codes
- Compares against expectations
Use std::filesystem (available since C++17) to traverse directories
Use std::ifstream, std::ostringstream for file content management
Use std::regex or simple line parsing for matching // markers

This keeps the testing infrastructure entirely in modern C++, without requiring Python or other scripting languages.
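
The execute, capture, and compare steps might look like the sketch below. Standard C++ has no process API, so this example assumes a POSIX system and uses popen and pclose (on Windows the counterparts are _popen and _pclose). For brevity it folds stderr into stdout with a shell redirect, which a fuller runner would likely keep separate. RunResult, run_command, and outputs_match are hypothetical names.

```cpp
#include <array>
#include <cstdio>
#include <string>
#include <sys/wait.h>  // POSIX: WIFEXITED, WEXITSTATUS

// Captured result of one interpreter invocation.
struct RunResult {
    std::string output;  // combined stdout and stderr
    int exit_code = -1;  // -1 if the process could not be run or exited abnormally
};

// Run a shell command line and capture its output and exit code.
RunResult run_command(const std::string& command) {
    RunResult result;
    FILE* pipe = popen((command + " 2>&1").c_str(), "r");
    if (!pipe) return result;
    std::array<char, 4096> buffer{};
    std::size_t n = 0;
    while ((n = std::fread(buffer.data(), 1, buffer.size(), pipe)) > 0) {
        result.output.append(buffer.data(), n);
    }
    const int status = pclose(pipe);
    if (status != -1 && WIFEXITED(status)) result.exit_code = WEXITSTATUS(status);
    return result;
}

// Remove whitespace and newlines from the end of a string.
std::string strip_trailing_whitespace(std::string s) {
    const auto last = s.find_last_not_of(" \t\r\n");
    s.erase(last == std::string::npos ? 0 : last + 1);
    return s;
}

// Compare actual and expected output, ignoring trailing whitespace so that a
// missing final newline does not fail a test.
bool outputs_match(const std::string& actual, const std::string& expected) {
    return strip_trailing_whitespace(actual) == strip_trailing_whitespace(expected);
}
```

A runner would write the extracted program to a temporary .lang file, call run_command with the interpreter's command line for that file, and then check the result with outputs_match and the expected exit code.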

8. Benefits of Organized Language Files

Well-organized target language files provide:

Documentation: Each file acts as a living specification of a language feature
Validation: Ensures language changes do not regress existing behavior
Automation: Enables integration with CI/CD and test runners
Onboarding: New developers and contributors can explore language behavior easily

Moreover, the separation of .lang, .test, and .fail files enables modular testing and filtering.

9. Future Expansion

As the language matures, the file system can expand to support:

.mod.lang — for module or package system definitions
.repl.test — for interactive session scripts
.compile.test — for checking bytecode or intermediate output
.doc.lang — for showcasing documentation-driven development

Eventually, a package manager or module loader can use this same file structure for official standard library distribution.

Conclusion

Organizing target language files with dedicated extensions and a logical directory layout helps your interpreter grow in both features and test coverage. Using .lang, .test, and .fail enables automated tooling and behavioral validation that keep pace with the interpreter as it evolves.

This strategy reflects modern C++ development practices where infrastructure, source content, and testing are first-class citizens, maintained in parallel with implementation.