Article by Ayman Alheraki on January 11 2026 10:37 AM

#10 Foundation and Architecture: Language Implementation Project Structure -> Organizing Target Language Files (.lang, .test)

In the design and implementation of a new programming language, managing the source files written in that language is as important as implementing its components. These source files—used for testing, demonstration, validation, and bootstrapping—should follow a clear structure and naming convention that reflects the language’s evolution, helps automate test suites, and supports maintainability as the language expands.

This section explains how to organize, name, and integrate target language files into the interpreter infrastructure, using extensions such as .lang for programs and .test for validation scripts. The approach follows current software engineering practice and complements an interpreter implemented in C++20/23.

1. Purpose of Organizing Language Files

When building a new language, you will be writing dozens to hundreds of test programs, examples, and behavioral specifications in your target language. These files serve multiple purposes:

  • Unit testing the interpreter or compiler

  • Behavioral documentation for language features

  • Regression tests to prevent breaking existing features

  • End-user examples for tutorials and learning

  • Interactive REPL experiments or integration scripts

A disciplined file structure with dedicated extensions allows these files to be managed, loaded, interpreted, and tested automatically.

2. Suggested File Extensions

To differentiate target-language files from C++ implementation files:

  • .lang — for source files written in the new language

    • Used for examples, programs, benchmarks, REPL inputs

  • .test — for test scripts containing expected results

    • Paired with .lang files for output-based assertions

  • .fail — for negative tests that must produce compile-time or runtime errors

    • Used to verify robustness and error diagnostics

This naming convention makes it possible to auto-discover and categorize files by purpose.
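As a minimal sketch of categorizing by extension, the function below maps a path to one of the three purposes described above. The FileKind enum and the classify name are assumptions made for illustration; only std::filesystem::path::extension() is a standard library call.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical categories; the names are illustrative, not from the article.
enum class FileKind { Program, Test, NegativeTest, Other };

// Map a path to a category using only its final extension.
FileKind classify(const fs::path& p) {
    const std::string ext = p.extension().string();
    if (ext == ".lang") return FileKind::Program;
    if (ext == ".test") return FileKind::Test;
    if (ext == ".fail") return FileKind::NegativeTest;
    return FileKind::Other;  // e.g. C++ implementation files
}
```

Because extension() returns only the last dotted component, a name such as session.repl.test would also be classified as a test under this sketch.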

3. Directory Layout for Target Language Files

The following is a clean and scalable directory structure:

Each test or example can be read independently by tools or manually executed using the interpreter executable.
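To illustrate how a tool might read such a tree, here is a hedged sketch that walks a root directory and groups target-language files by extension. The directory names created by make_demo_tree (examples, tests, tests/negative) are assumptions chosen for the demonstration, not the article's layout, and discover is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Build a small, purely illustrative layout under `root` (assumed names).
void make_demo_tree(const fs::path& root) {
    fs::create_directories(root / "examples");
    fs::create_directories(root / "tests" / "negative");
    std::ofstream(root / "examples" / "factorial.lang") << "// demo\n";
    std::ofstream(root / "tests" / "arithmetic.test") << "// demo\n";
    std::ofstream(root / "tests" / "negative" / "missing_semicolon.fail") << "// demo\n";
    std::ofstream(root / "README.md") << "ignored by discovery\n";
}

// Walk `root` recursively and group regular files by extension.
// Only .lang, .test and .fail files are kept; everything else is skipped.
std::map<std::string, std::vector<fs::path>> discover(const fs::path& root) {
    std::map<std::string, std::vector<fs::path>> found;
    if (!fs::is_directory(root)) return found;
    for (const fs::directory_entry& entry : fs::recursive_directory_iterator(root)) {
        if (!entry.is_regular_file()) continue;
        const std::string ext = entry.path().extension().string();
        if (ext == ".lang" || ext == ".test" || ext == ".fail") {
            found[ext].push_back(entry.path());
        }
    }
    // Directory iteration order is unspecified, so sort for stable test runs.
    for (auto& kv : found) std::sort(kv.second.begin(), kv.second.end());
    return found;
}
```

Sorting the results is a design choice: it keeps test order reproducible across platforms, which makes failures easier to compare between runs.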

4. .lang File Design Conventions

A .lang file is a complete or partial program written in your custom C-style language. During early development, programs will likely demonstrate:

  • Arithmetic and logical expressions

  • Function definitions and calls

  • Variable binding and scoping

  • Conditionals and loops

Example — factorial.lang:

Conventions:

  • Use consistent indentation and spacing

  • Use comments (//) for expected behavior or notes

  • Keep filenames and function names descriptive and lowercase with underscores
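The naming convention can be checked mechanically. The sketch below accepts a file stem made of lowercase ASCII letters, digits, and underscores; allowing digits and requiring a leading letter are assumptions that go beyond what the convention states, and is_conventional_stem is a hypothetical name.

```cpp
#include <cassert>
#include <string_view>

// True when `stem` is non-empty, starts with a lowercase ASCII letter, and
// contains only lowercase ASCII letters, digits and underscores.
// Digits and the leading-letter rule are assumptions of this sketch.
bool is_conventional_stem(std::string_view stem) {
    if (stem.empty() || stem.front() < 'a' || stem.front() > 'z') return false;
    for (char c : stem) {
        const bool ok = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_';
        if (!ok) return false;
    }
    return true;
}
```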


5. .test File Format and Structure

A .test file serves both as a program and an assertion document. It includes embedded expectations. A simple format could be:

A test harness in C++ parses this file, extracts the input program and the expected output, runs the interpreter, and compares actual output with expectations.

This format allows test automation without a separate database or metadata schema.

Features to support:

  • // EXPECT: for single-line outputs

  • // ERROR: for failure cases

  • Optional // EXIT: for expected return code
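A minimal sketch of the marker extraction, using simple line parsing rather than std::regex. It assumes that a marker occupies a whole line (optionally indented) and that every other line belongs to the program; the Expectations struct and the parse_test, trim, and after_marker names are hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Expectations extracted from one test file (hypothetical shape).
struct Expectations {
    std::vector<std::string> output_lines;  // from "// EXPECT:"
    std::vector<std::string> error_lines;   // from "// ERROR:"
    std::optional<int> exit_code;           // from "// EXIT:"
    std::string program;                    // every non-marker line, verbatim
};

// Remove leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// If `line` starts with `marker`, return the trimmed remainder.
std::optional<std::string> after_marker(const std::string& line, const std::string& marker) {
    if (line.compare(0, marker.size(), marker) != 0) return std::nullopt;
    return trim(line.substr(marker.size()));
}

// Split the file text into program lines and expectations.
Expectations parse_test(const std::string& text) {
    Expectations exp;
    std::istringstream in(text);
    std::string raw;
    while (std::getline(in, raw)) {
        const std::string line = trim(raw);
        if (auto out = after_marker(line, "// EXPECT:")) {
            exp.output_lines.push_back(*out);
        } else if (auto err = after_marker(line, "// ERROR:")) {
            exp.error_lines.push_back(*err);
        } else if (auto code = after_marker(line, "// EXIT:")) {
            try {
                exp.exit_code = std::stoi(*code);
            } catch (const std::exception&) {
                // Malformed number: leave the exit code unset.
            }
        } else {
            exp.program += raw + "\n";
        }
    }
    return exp;
}
```

Keeping the markers inside ordinary // comments means the same file can still be fed to the interpreter unchanged, provided the target language treats // as a line comment.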

6. .fail File Format

Negative tests ensure that invalid programs are rejected. These are useful for validating:

  • Type system enforcement

  • Syntax checking

  • Name resolution

  • Semantic constraints

Example — missing_semicolon.fail:

Interpreter test runners must capture the diagnostic and match it with the expected // ERROR section.
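One way to express that check is sketched below. It assumes the interpreter signals failure with a non-zero exit code and that an expected // ERROR fragment only needs to appear as a substring of the captured diagnostic; both are assumptions, as are the RunResult struct and the negative_test_passes name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Captured result of one interpreter run (hypothetical shape).
struct RunResult {
    int exit_code;
    std::string out;  // captured stdout
    std::string err;  // captured stderr (diagnostics)
};

// A negative test passes only if the run failed and every expected
// fragment appears somewhere in the captured diagnostic text.
// With no fragments, only the failure itself is checked.
bool negative_test_passes(const RunResult& r,
                          const std::vector<std::string>& expected_errors) {
    if (r.exit_code == 0) return false;  // program was wrongly accepted
    for (const std::string& fragment : expected_errors) {
        if (r.err.find(fragment) == std::string::npos) return false;
    }
    return true;
}
```

Substring matching tolerates changes to line numbers or message prefixes; a stricter runner could compare whole diagnostic lines instead.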

7. Integration with the Interpreter

To support test file automation in C++:

  • Create a small test runner tool (forge-test) in C++ that:

    • Scans the test directories for .test and .fail files

    • Parses sections into input/output pairs

    • Executes the interpreter

    • Captures stdout/stderr and return codes

    • Compares against expectations

  • Use std::filesystem (available since C++17) to traverse directories

  • Use std::ifstream and std::ostringstream to read and buffer file contents

  • Use std::regex or simple line parsing for matching // markers

This keeps the testing infrastructure entirely in modern C++, without requiring Python or other scripting languages.
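The comparison core of such a runner might look like the sketch below. Launching the real interpreter process is platform-specific and is left out; an Interpreter callback stands in for it. All type and function names here (RunResult, TestCase, Interpreter, run_case, run_all) are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Captured result of one interpreter run (hypothetical shape).
struct RunResult {
    int exit_code = 0;
    std::string out;  // captured stdout
    std::string err;  // captured stderr
};

// One positive test case: program text plus its expectations.
struct TestCase {
    std::string name;
    std::string program;
    std::vector<std::string> expected_lines;
    std::optional<int> expected_exit;
};

// Stand-in for launching the real interpreter process.
using Interpreter = std::function<RunResult(const std::string& program)>;

// Split captured output into lines, dropping a trailing '\r' on each.
std::vector<std::string> split_lines(const std::string& text) {
    std::vector<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        lines.push_back(line);
    }
    return lines;
}

// Run one case and report whether stdout and the exit code match.
bool run_case(const TestCase& tc, const Interpreter& interp) {
    const RunResult r = interp(tc.program);
    if (tc.expected_exit && r.exit_code != *tc.expected_exit) return false;
    return split_lines(r.out) == tc.expected_lines;
}

// Run every case and return the names of the failures.
std::vector<std::string> run_all(const std::vector<TestCase>& cases,
                                 const Interpreter& interp) {
    std::vector<std::string> failed;
    for (const TestCase& tc : cases) {
        if (!run_case(tc, interp)) failed.push_back(tc.name);
    }
    return failed;
}
```

Injecting the interpreter as a callback lets the comparison logic be tested without spawning a process; a real runner would supply a callback that executes the interpreter binary and captures its streams.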

8. Benefits of Organized Language Files

Well-organized target language files provide:

  • Documentation: Each file acts as a living specification of a language feature

  • Validation: Ensures language changes do not regress existing behavior

  • Automation: Enables integration with CI/CD and test runners

  • Onboarding: New developers and contributors can explore language behavior easily

Moreover, the separation of .lang, .test, and .fail files enables modular testing and filtering.

9. Future Expansion

As the language matures, the file system can expand to support:

  • .mod.lang — for module or package system definitions

  • .repl.test — for interactive session scripts

  • .compile.test — for checking bytecode or intermediate output

  • .doc.lang — for showcasing documentation-driven development

Eventually, a package manager or module loader can use this same file structure for official standard library distribution.

Conclusion

Organizing target language files with dedicated extensions and a logical directory layout ensures that your interpreter can scale in both features and testing capabilities. Using .lang, .test, and .fail allows automated tooling, behavioral validation, and integration with the interpreter’s evolution.

This strategy reflects modern C++ development practices where infrastructure, source content, and testing are first-class citizens, maintained in parallel with implementation.
