Architecture of an Assembler: Data Structures in Assembler Design

Article by Ayman Alheraki on January 11 2026 10:37 AM

Architecture of an Assembler: Data Structures in Assembler Design -> Relocation Table

The relocation table is a fundamental data structure in assembler and linker design, responsible for recording references to symbols or addresses that cannot be resolved at assembly time. These relocations are deferred until the linking phase or even runtime (in dynamic linking). In the context of x86-64 assembly, this includes addresses for functions, global variables, jump targets in external modules, and addresses needing adjustment based on the final layout of code and data.

Relocation supports modularity and separation of compilation units, enabling linking across object files, shared libraries, and dynamically loaded modules.

1. Purpose of the Relocation Table

The relocation table tracks all addresses in the assembled code that:

Reference symbols not yet defined at the current point in assembly.
Depend on final address layout (e.g., section offsets, alignment padding).
Involve absolute or PC-relative addressing to symbols in other files.
Require patching after linking to insert actual addresses or offsets.

Each entry in the relocation table provides the necessary metadata for the linker to resolve the pending address and modify the binary at the appropriate location.

2. Relocation Entry Format

A relocation table is typically a list or segment of structured records. Each relocation entry includes:

Field	Description
`offset`	Location in the code or data section where relocation is needed.
`symbol`	The target symbol being referenced (can be external or internal).
`reloc_type`	Type of relocation (e.g., absolute, PC-relative, GOT, PLT).
`addend`	Optional constant added to the symbol address.
`section_index`	Section to which the relocation applies.
`size`	Size of the relocated field (e.g., 32-bit, 64-bit).
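
To make the field list concrete, here is a minimal sketch of how an assembler might model one entry in memory. The class names, the enum members, and the example values are assumptions for illustration; real object formats encode these fields as fixed-width binary records rather than as objects.

```python
from dataclasses import dataclass
from enum import Enum


class RelocType(Enum):
    """Illustrative relocation kinds; real formats assign numeric codes."""
    ABSOLUTE = "absolute"
    PC_RELATIVE = "pc_relative"
    GOT = "got"
    PLT = "plt"


@dataclass(frozen=True)
class RelocationEntry:
    offset: int             # byte offset of the field to patch, within its section
    symbol: str             # name of the referenced symbol
    reloc_type: RelocType   # how the linker computes the patched value
    addend: int = 0         # constant added to the symbol address
    section_index: int = 0  # index of the section containing the field
    size: int = 4           # width of the patched field, in bytes


# Hypothetical entry: a 32-bit call displacement at offset 0x11 of section 1.
entry = RelocationEntry(0x11, "printf", RelocType.PLT, addend=-4, section_index=1)
print(entry)
```

Making the record immutable (`frozen=True`) is one possible design choice: entries are created once during code generation and only read afterwards.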

In many formats like ELF (Executable and Linkable Format), relocation entries come in two flavors:

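Relocation without addend (Rel) – The addend is implicit: it is read from the bytes already stored at the relocated location in the section.
Relocation with addend (Rela) – The addend is stored explicitly in the relocation entry itself.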

On x86-64, the System V ABI specifies Rela-style entries, so ELF assemblers for this target emit explicit addends. This keeps the addend at full width and decouples it from the bytes stored in the section.
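
The difference between the two flavors shows up directly in the ELF64 record layout. The sketch below packs a Rela record with Python's `struct` module, assuming a little-endian target; the helper names `r_info` and `pack_rela` and the sample values are illustrative, not part of any toolchain API.

```python
import struct

# ELF64 relocation records, little-endian.
ELF64_REL = struct.Struct("<QQ")    # r_offset, r_info
ELF64_RELA = struct.Struct("<QQq")  # r_offset, r_info, r_addend (signed)


def r_info(sym_index: int, reloc_type: int) -> int:
    """ELF64 stores the symbol index in the high 32 bits, the type in the low 32."""
    return (sym_index << 32) | (reloc_type & 0xFFFFFFFF)


def pack_rela(offset: int, sym_index: int, reloc_type: int, addend: int) -> bytes:
    """Serialize one Rela record."""
    return ELF64_RELA.pack(offset, r_info(sym_index, reloc_type), addend)


R_X86_64_PC32 = 2  # type code from the x86-64 psABI

raw = pack_rela(offset=0x11, sym_index=5, reloc_type=R_X86_64_PC32, addend=-4)
offset, info, addend = ELF64_RELA.unpack(raw)
print(len(raw), hex(offset), info >> 32, info & 0xFFFFFFFF, addend)
```

A Rel record is 16 bytes and a Rela record is 24 bytes; the extra 8 bytes hold the explicit addend.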

3. Common Relocation Types (x86-64 Specific)

Assemblers targeting x86-64 architecture need to support several relocation types, each with specific semantics. Some commonly used types include:

R_X86_64_64: 64-bit absolute address.
R_X86_64_PC32: 32-bit PC-relative offset.
R_X86_64_PLT32: 32-bit PC-relative call to a symbol through the Procedure Linkage Table.
R_X86_64_GOTPCREL: 32-bit signed PC-relative offset to the symbol's entry in the Global Offset Table.
R_X86_64_32 / R_X86_64_32S: 32-bit absolute address, zero-extended (R_X86_64_32) or sign-extended (R_X86_64_32S) to 64 bits.

Assemblers must correctly emit the appropriate type based on instruction format and symbol visibility. For example, a CALL to an external function often requires a PLT32 relocation, while a reference to a global variable uses GOTPCREL when Position-Independent Code (PIC) is enabled.
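
The semantics of these types can be summarized by the value the linker writes into the relocated field. The following sketch uses the conventional psABI notation (S for the symbol address, A for the addend, P for the address of the field being patched). The function name `reloc_value` and its parameters `plt_entry` and `got_entry` are assumptions for illustration, and the sketch ignores range checking.

```python
from typing import Optional

# Relocation type codes from the x86-64 System V psABI.
R_X86_64_64 = 1
R_X86_64_PC32 = 2
R_X86_64_PLT32 = 4
R_X86_64_GOTPCREL = 9
R_X86_64_32 = 10
R_X86_64_32S = 11


def reloc_value(rtype: int, S: int, A: int, P: int,
                plt_entry: Optional[int] = None,
                got_entry: Optional[int] = None) -> int:
    """Return the value to store in the relocated field."""
    if rtype in (R_X86_64_64, R_X86_64_32, R_X86_64_32S):
        return S + A                      # absolute address
    if rtype == R_X86_64_PC32:
        return S + A - P                  # PC-relative
    if rtype == R_X86_64_PLT32:
        # Use the PLT entry if the linker created one, else the symbol itself.
        target = S if plt_entry is None else plt_entry
        return target + A - P
    if rtype == R_X86_64_GOTPCREL:
        if got_entry is None:
            raise ValueError("GOTPCREL needs the address of the GOT entry")
        return got_entry + A - P          # PC-relative offset to the GOT slot
    raise ValueError(f"unsupported relocation type {rtype}")


# Hypothetical call at 0x401000: the rel32 field sits at 0x401001.
disp = reloc_value(R_X86_64_PLT32, S=0x401020, A=-4, P=0x401001)
print(hex(disp))
```

In this example the call instruction ends at 0x401005, and adding the computed displacement to that address gives the target 0x401020.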

4. Relocation in Position-Independent Code (PIC)

In modern systems, especially those using shared libraries or ASLR (Address Space Layout Randomization), code must be position-independent. This implies:

Avoiding absolute addresses embedded in the code.
Using relative offsets or table lookups (GOT/PLT).
Emitting relocations that the dynamic linker can resolve at load time.

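Relocation entries in such cases must reflect this model by using types like GOTPCREL, PLT32, or platform-specific variants. A compiler typically selects these under a flag such as -fPIC; in hand-written assembly the choice is usually expressed through operand syntax (for example, foo@PLT or foo@GOTPCREL(%rip) in GNU assembler syntax). Either way, the assembler must ensure that the generated code and relocation table match the target environment.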
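
As a rough illustration of how the PIC setting changes the relocation that gets emitted, the sketch below encodes one simplified, hypothetical selection policy. The function `choose_reloc` and its rules are assumptions; real toolchains consider more factors, such as symbol visibility and the code model.

```python
def choose_reloc(ref_kind: str, symbol_is_local: bool, pic: bool) -> str:
    """Pick a relocation type under a simplified, hypothetical policy.

    ref_kind is "call" for a call/jump target or "data" for a variable access.
    """
    if ref_kind == "call":
        # Local targets can be reached directly; others may go through the PLT.
        return "R_X86_64_PC32" if symbol_is_local else "R_X86_64_PLT32"
    if ref_kind == "data":
        if not pic:
            return "R_X86_64_32S"       # absolute address embedded in the code
        if symbol_is_local:
            return "R_X86_64_PC32"      # RIP-relative, no table lookup
        return "R_X86_64_GOTPCREL"      # load the address from the GOT
    raise ValueError(f"unknown reference kind: {ref_kind!r}")


for kind, local, pic in [("call", False, True), ("data", False, True),
                         ("data", True, True), ("data", False, False)]:
    print(kind, local, pic, "->", choose_reloc(kind, local, pic))
```

Only the non-PIC data case embeds an absolute address; every PIC case uses a relative offset or a table lookup.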

5. Relocation Table Generation Process

The relocation table is populated during code generation when the assembler detects unresolved symbol references. The steps include:

Encounter Reference: The assembler processes an instruction or directive that refers to a symbol.
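Check Symbol Definition: If the symbol is undefined, external, or defined somewhere whose final address depends on layout (for example, in another section), a relocation is required.
Emit Placeholder: The assembler emits a placeholder in the binary code, typically zeros for Rela-style formats or the addend itself for Rel-style formats.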
Create Relocation Entry: A relocation entry is created with all required metadata.
Append to Relocation Table: The entry is stored in the appropriate relocation section (e.g., .rela.text for code references).
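
The steps above can be sketched for a single instruction, a near `call` with a 32-bit displacement (opcode E8). The `Section` class, the `emit_call` helper, and the tuple layout of the relocation records are assumptions for illustration, and the sketch assumes a Rela-style format, so the placeholder is zero.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    data: bytearray = field(default_factory=bytearray)
    relocs: list = field(default_factory=list)  # (offset, symbol, type, addend)


def emit_call(section: Section, target: str, defined: dict) -> None:
    """Emit `call rel32`; record a relocation if the target is unresolved.

    `defined` maps symbol names to offsets within this same section.
    """
    section.data.append(0xE8)                 # opcode for call rel32
    field_offset = len(section.data)          # where the displacement lives
    if target in defined:
        # Same-section target: the displacement is known now.
        next_ip = field_offset + 4
        disp = defined[target] - next_ip
        section.data += disp.to_bytes(4, "little", signed=True)
    else:
        section.data += bytes(4)              # zero placeholder
        # Addend -4: the CPU measures the displacement from the end of the
        # 4-byte field, while the relocation is computed from its start.
        section.relocs.append((field_offset, target, "R_X86_64_PLT32", -4))


text = Section(".text")
emit_call(text, "puts", defined={})            # external: needs a relocation
emit_call(text, "start", defined={"start": 0}) # local: resolved immediately
print(text.data.hex(), text.relocs)
```

Only the first call produces a table entry; the second is patched in place because both ends of the reference live in the same section.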

6. Section-Specific Relocation Tables

Relocations are generally segmented by the section they apply to. For instance:

.rela.text – Relocations within the .text section (code).
.rela.data – Relocations within the .data section (initialized data).
.rela.rodata – Read-only data section references.

This separation simplifies linker processing: each relocation table applies to exactly one section, so the linker can resolve a section's references independently of the others.
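
A small sketch of this grouping, assuming the assembler has collected relocations in one flat list during code generation. The function name `group_by_section` and the record layout are hypothetical; the output naming follows the ELF convention of prefixing the section name with `.rela`.

```python
from collections import defaultdict


def group_by_section(relocs):
    """Split a flat list of relocations into per-section tables.

    Each input record is (section_name, offset, symbol, reloc_type, addend).
    """
    tables = defaultdict(list)
    for section, offset, symbol, rtype, addend in relocs:
        tables[".rela" + section].append((offset, symbol, rtype, addend))
    for entries in tables.values():
        entries.sort()                      # order entries by offset
    return dict(tables)


flat = [
    (".text", 0x21, "printf", "R_X86_64_PLT32", -4),
    (".data", 0x08, "handler", "R_X86_64_64", 0),
    (".text", 0x05, "counter", "R_X86_64_PC32", -4),
]
for name, entries in group_by_section(flat).items():
    print(name, entries)
```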

7. Optimizations and Trends Since 2020

Modern assembler designs have adopted several improvements in handling relocation data:

Compact relocation entries: Eliminate redundancy by using symbolic encoding schemes.
Lazy relocation emission: Defer emission until final pass to avoid unnecessary entries.
Cross-reference folding: Combine multiple relocations to the same symbol into a single dynamic resolution.
ELF alternative formats: Use packed or compressed relocation tables in specialized toolchains.
Relocation caching: For Just-In-Time (JIT) compilers, cache relocation templates for reuse across code blocks.

These optimizations increase speed and reduce binary size, especially in environments with thousands of small relocatable references (e.g., dynamic dispatch tables or template-heavy code in C++).
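
As one illustration, lazy relocation emission might look like the sketch below: references are queued during code generation, and a final pass patches those that turn out to be resolvable within the same section, emitting table entries only for the rest. The `finalize` function and its record layout are assumptions, and the sketch handles only 32-bit PC-relative fields and symbols that cannot be overridden at link time.

```python
def finalize(code: bytearray, pending, local_symbols):
    """Resolve what can be resolved; return relocations for the rest.

    pending: list of (field_offset, symbol, addend) for 32-bit PC-relative fields.
    local_symbols: symbol name -> offset within this same section.
    """
    relocs = []
    for field_offset, symbol, addend in pending:
        if symbol in local_symbols:
            # Both ends are in this section, so S + A - P is already known.
            value = local_symbols[symbol] + addend - field_offset
            code[field_offset:field_offset + 4] = value.to_bytes(
                4, "little", signed=True)
        else:
            relocs.append((field_offset, symbol, "R_X86_64_PC32", addend))
    return relocs


# jmp done; 3 x nop; done: call ext
code = bytearray(b"\xE9" + bytes(4) + b"\x90" * 3 + b"\xE8" + bytes(4))
pending = [(1, "done", -4), (9, "ext", -4)]
relocs = finalize(code, pending, local_symbols={"done": 8})
print(code.hex(), relocs)
```

Here the jump to `done` is patched directly and never reaches the relocation table, while the call to `ext` still requires an entry.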

8. Relocation Table and Linking Interface

The relocation table is integral to interoperability with linkers. It acts as the handshake mechanism through which the assembler communicates unresolved references to the linker. The assembler must ensure:

Symbol names are consistent and match external definitions.
Relocation types comply with the target format (ELF, COFF, Mach-O).
Addends and placeholder encoding are correct and match linker expectations.
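
The last point matters because the linker ultimately writes the computed value into a fixed-width field and must reject values that do not fit. A minimal sketch of that patching step, assuming a little-endian target; the `patch` helper and its error message are illustrative and not taken from any particular linker.

```python
def patch(buf: bytearray, offset: int, value: int, size: int, signed: bool) -> None:
    """Write `value` into buf[offset:offset+size], rejecting values that overflow."""
    try:
        encoded = value.to_bytes(size, "little", signed=signed)
    except OverflowError as exc:
        raise ValueError(
            f"relocation at offset {offset:#x} does not fit in {size} bytes"
        ) from exc
    buf[offset:offset + size] = encoded


section = bytearray(8)
patch(section, 0, -4, 4, signed=True)            # fits a sign-extended field
patch(section, 4, 0x80000000, 4, signed=False)   # fits a zero-extended field
print(section.hex())

try:
    patch(section, 0, 0x80000000, 4, signed=True)  # too large for a signed field
except ValueError as err:
    print("rejected:", err)
```

The same value, 0x80000000, is accepted by an unsigned 32-bit field and rejected by a signed one, which is the distinction between R_X86_64_32 and R_X86_64_32S.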

Assemblers also record each symbol's binding (local, global, weak) and visibility in the symbol table. The linker consults this information when resolving relocations, and link-time optimization (LTO) can take advantage of it.

9. Summary

The relocation table is an indispensable part of modern assembler architecture, enabling modular code generation, delayed address resolution, and dynamic linking. Its design must be flexible, precise, and efficient, especially in the context of complex x86-64 instruction encoding, position-independent code requirements, and interoperability with modern linkers and loaders. For an assembler to function correctly and reliably in contemporary software systems, the relocation table must be meticulously constructed and fully aligned with evolving platform standards.