
Article by Ayman Alheraki on January 11 2026 10:37 AM

x86-64 Instruction Set Architecture (ISA)

x86-64 Instruction Set Architecture (ISA) - Register Set Overview

 

Introduction to GPRs in x86-64

The general-purpose registers (GPRs) are the most frequently used registers in the x86-64 architecture. They serve as temporary storage for data, operands for arithmetic and logical instructions, base or index pointers for memory access, and support for calling conventions and return values. The x86-64 design, first introduced by AMD as AMD64, expands upon the earlier 32-bit x86 (IA-32) architecture by doubling the register width from 32 bits to 64 bits and increasing the number of general-purpose registers from 8 to 16.

This expansion addresses many of the limitations in register availability that affected compiler performance, calling conventions, and optimization strategies in 32-bit x86. With the broader set of GPRs, x86-64 systems support more efficient register allocation and reduce the need for memory access during function execution.

1. The Complete Set of x86-64 General-Purpose Registers

In 64-bit mode, the following 16 general-purpose registers are available:

  • RAX, RBX, RCX, RDX

  • RSI, RDI, RBP, RSP

  • R8, R9, R10, R11, R12, R13, R14, R15

Each of these registers can be accessed in multiple sizes, depending on the operation:

Register   64-bit    32-bit     16-bit     8-bit (low)   8-bit (high)
RAX        RAX       EAX        AX         AL            AH
RBX        RBX       EBX        BX         BL            BH
RCX        RCX       ECX        CX         CL            CH
RDX        RDX       EDX        DX         DL            DH
RSI        RSI       ESI        SI         SIL           -
RDI        RDI       EDI        DI         DIL           -
RBP        RBP       EBP        BP         BPL           -
RSP        RSP       ESP        SP         SPL           -
R8–R15     R8–R15    R8D–R15D   R8W–R15W   R8B–R15B      -

 

Registers R8 through R15 are only available in 64-bit mode and require special instruction encoding (REX prefix).
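The naming scheme in the table above can be sketched as a lookup table that an assembler front end might build once at startup. This is a minimal illustration, not a prescribed design: the names `REGISTERS`, `build_register_table`, and `needs_rex` are hypothetical. The sketch also assumes two details the table does not spell out: the hardware numbering order of the legacy registers (AX, CX, DX, BX, SP, BP, SI, DI), and the fact that SPL, BPL, SIL, and DIL require a REX prefix just as R8–R15 do.

```python
# Hypothetical register table for an assembler front end (all names illustrative).
# Legacy registers listed in hardware encoding order, numbers 0-7.
LEGACY = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]

def build_register_table():
    """Map each register name to (encoding number 0-15, width in bits, is_high_byte)."""
    table = {}
    for num, base in enumerate(LEGACY):
        table["R" + base] = (num, 64, False)
        table["E" + base] = (num, 32, False)
        table[base] = (num, 16, False)
    # AL, CL, DL, BL use encodings 0-3; AH, CH, DH, BH reuse encodings 4-7
    # and are only reachable when no REX prefix is present.
    for num, letter in enumerate("ACDB"):
        table[letter + "L"] = (num, 8, False)
        table[letter + "H"] = (num + 4, 8, True)
    # With a REX prefix, encodings 4-7 select these low bytes instead.
    for num, name in zip(range(4, 8), ["SPL", "BPL", "SIL", "DIL"]):
        table[name] = (num, 8, False)
    # Extended registers and their D/W/B sub-register names.
    for num in range(8, 16):
        for suffix, width in (("", 64), ("D", 32), ("W", 16), ("B", 8)):
            table[f"R{num}{suffix}"] = (num, width, False)
    return table

REGISTERS = build_register_table()

def needs_rex(name):
    """True if merely naming this register forces a REX prefix."""
    upper = name.upper()
    num, _width, _high = REGISTERS[upper]
    return num >= 8 or upper in ("SPL", "BPL", "SIL", "DIL")

print(REGISTERS["R10W"], needs_rex("R10W"))
print(REGISTERS["EAX"], needs_rex("EAX"))
```

Keeping the encoding number, width, and high-byte flag together lets later stages pick prefixes and ModR/M bits without parsing the name again.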

3. Functional Roles and Conventional Uses

Although GPRs are general-purpose by name, specific conventions and operating system ABIs assign functional roles to many of them.

Arithmetic and Data Transfer

  • RAX is the default accumulator. It's used implicitly in instructions like MUL, DIV, CPUID, and system calls (as a return value register).

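  • RCX and RDX have implicit roles as well: CL holds variable shift and rotate counts, RDX holds the upper half of MUL results and of DIV dividends, and RCX is the counter for LOOP and REP-prefixed instructions.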

  • RBX is typically callee-saved and preserved across function calls.

Stack and Frame Management

  • RSP (Stack Pointer) holds the current top of the stack. It's modified by PUSH, POP, CALL, RET, and other stack instructions.

  • RBP (Base Pointer) is traditionally used to anchor the current stack frame, especially in functions that manipulate local variables.

Data Movement and String Operations

  • RSI (Source Index) and RDI (Destination Index) are used in string operations (MOVSB, STOSB, etc.) and memory transfers. In the System V AMD64 ABI they also carry the first two integer arguments (RDI first, then RSI).

Calling Convention and Function Arguments

Depending on the OS and compiler ABI, registers are assigned to handle function arguments and return values.

  • System V AMD64 ABI (Linux/macOS/BSD):

    • Arguments: RDI, RSI, RDX, RCX, R8, R9

    • Return Values: RAX (secondary: RDX)

    • Caller-saved: RAX, RCX, RDX, RSI, RDI, R8–R11

    • Callee-saved: RBX, RBP, R12–R15

  • Microsoft x64 ABI (Windows):

    • Arguments: RCX, RDX, R8, R9

    • Return Value: RAX

    • Caller-saved: RAX, RCX, RDX, R8–R11

    • Callee-saved: RBX, RBP, RDI, RSI, R12–R15

Understanding and implementing these rules is critical when writing or generating calling routines in your assembler.
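As a rough sketch of how a code generator could consult these rules, the two conventions can be stored as data and queried. The names `ABIS`, `assign_integer_args`, and `registers_to_save` are hypothetical. The sketch covers integer arguments only and ignores floating-point arguments, stack alignment, and the Windows shadow space.

```python
# Hypothetical ABI description tables; only the integer-register rules are modelled.
ABIS = {
    "sysv": {
        "args": ["RDI", "RSI", "RDX", "RCX", "R8", "R9"],
        "ret": "RAX",
        "callee_saved": {"RBX", "RBP", "R12", "R13", "R14", "R15"},
    },
    "win64": {
        "args": ["RCX", "RDX", "R8", "R9"],
        "ret": "RAX",
        "callee_saved": {"RBX", "RBP", "RDI", "RSI", "R12", "R13", "R14", "R15"},
    },
}

def assign_integer_args(abi, count):
    """Return a location for each integer argument: a register name or 'stack'."""
    regs = ABIS[abi]["args"]
    return [regs[i] if i < len(regs) else "stack" for i in range(count)]

def registers_to_save(abi, clobbered):
    """Callee-saved registers that a function body overwrites and must preserve."""
    return sorted(ABIS[abi]["callee_saved"] & set(clobbered))

print(assign_integer_args("sysv", 7))
print(registers_to_save("win64", ["RSI", "RAX", "R12"]))
```

The same function body can need different prologue code on each platform: RSI is callee-saved under the Microsoft convention but not under System V.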

4. Register Encoding and the REX Prefix

The introduction of the REX prefix in x86-64 was essential to enable access to the additional registers (R8–R15) and to support 64-bit operand sizes. The REX prefix is a single-byte prefix with the binary format:

0100WRXB

Where:

  • W: Set to 1 to enable 64-bit operand size

  • R: Extends the reg field in ModR/M byte

  • X: Extends the index field in the SIB byte

  • B: Extends the r/m field in the ModR/M byte, the base field in the SIB byte, or the register field embedded in the opcode

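Example: To encode an instruction like MOV R8, RAX, the assembler must emit a REX prefix with REX.W = 1 (64-bit operand size) and, when using the 89 /r form where the destination sits in the r/m field, REX.B = 1 to select R8. The resulting bytes are 49 89 C0.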
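The bit layout can be made concrete with a small encoder. What follows is a minimal sketch, assuming the register-direct form of MOV r/m64, r64 (opcode 89 /r) and registers given as encoding numbers 0–15; the helper names `rex_prefix`, `modrm`, and `encode_mov_rm64_r64` are illustrative, not part of any existing tool.

```python
def rex_prefix(w=0, r=0, x=0, b=0):
    """Build a REX byte: fixed high nibble 0100 followed by the W, R, X, B bits."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

def modrm(mod, reg, rm):
    """Pack the ModR/M byte; only the low three bits of reg and rm fit here."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)

def encode_mov_rm64_r64(dst, src):
    """Encode MOV dst, src for two 64-bit registers (opcode 0x89 /r).

    dst goes in the ModR/M r/m field (extended by REX.B),
    src goes in the ModR/M reg field (extended by REX.R).
    """
    rex = rex_prefix(w=1, r=src >> 3, b=dst >> 3)
    return bytes([rex, 0x89, modrm(0b11, src, dst)])

RAX, RBX, R8 = 0, 3, 8  # encoding numbers

print(encode_mov_rm64_r64(R8, RAX).hex(" "))   # MOV R8, RAX
print(encode_mov_rm64_r64(RAX, R8).hex(" "))   # MOV RAX, R8
print(encode_mov_rm64_r64(RAX, RBX).hex(" "))  # MOV RAX, RBX
```

Swapping the operands moves the fourth register bit from REX.B to REX.R while the ModR/M byte stays the same, which is why the assembler has to track which field each operand occupies.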

Proper management of the REX prefix is one of the central responsibilities of a modern x86-64 assembler, especially for minimizing instruction size and supporting compact code generation.

5. Considerations in Register Usage for Assembler Design

An assembler targeting x86-64 must make several decisions related to GPR usage, including:

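  • Instruction Size Minimization: Using legacy registers (e.g., RAX–RDX) where possible avoids the REX.R, REX.X, and REX.B extension bits. The prefix can be dropped entirely only when the operand size is not 64-bit (for example, when a 32-bit form such as EAX suffices), since 64-bit operations generally still need REX.W.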

  • Optimization for ABI Compatibility: Functions must preserve callee-saved registers and follow the correct argument-passing convention.

  • Operand Width Precision: The assembler must select the appropriate suffix or opcode form to match operand widths (8, 16, 32, or 64-bit).

  • Fallback Handling: Where registers are exhausted, spilling to the stack must be managed cleanly.

  • ModR/M and SIB Byte Emission: Many instructions rely on complex encoding involving register identifiers—understanding the mapping is crucial for correct instruction emission.
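The size and width points above can be illustrated with a register-to-register ADD. This sketch assumes the ADD r/m, r forms (opcode 00 /r for 8-bit, 01 /r otherwise), the 66 operand-size prefix for 16-bit operands, and registers passed as encoding numbers. The function name `encode_add_reg_reg` is hypothetical, and the ambiguous 8-bit encodings 4–7 (AH–BH versus SPL–DIL) are rejected rather than handled.

```python
def encode_add_reg_reg(dst, src, width):
    """Encode ADD dst, src in register-direct form.

    dst, src: register encoding numbers 0-15; width: 8, 16, 32 or 64 bits.
    """
    if width not in (8, 16, 32, 64):
        raise ValueError("unsupported operand width")
    if width == 8 and (4 <= dst <= 7 or 4 <= src <= 7):
        # Encodings 4-7 mean AH..BH without REX and SPL..DIL with it;
        # this sketch does not model that distinction.
        raise ValueError("8-bit registers 4-7 are not handled in this sketch")
    out = bytearray()
    if width == 16:
        out.append(0x66)  # operand-size override selects 16-bit operands
    rex = 0x40 | (8 if width == 64 else 0) | ((src >> 3) << 2) | (dst >> 3)
    if rex != 0x40:
        out.append(rex)   # emit REX only when at least one bit is set
    out.append(0x00 if width == 8 else 0x01)
    out.append(0xC0 | ((src & 7) << 3) | (dst & 7))
    return bytes(out)

# ADD EAX, EBX needs no prefix; the 64-bit and R8D variants each cost one byte more.
for args in [(0, 3, 32), (0, 3, 64), (0, 3, 16), (8, 0, 32)]:
    code = encode_add_reg_reg(*args)
    print(args, code.hex(" "), len(code), "bytes")
```

In this sketch the REX byte is emitted only when one of its bits is set, so the 32-bit legacy-register form is the shortest of the four.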

6. Integration of GPRs in Instruction Semantics

Each instruction in x86-64 may interact with GPRs in one or more ways:

  • Explicit Operands: Instructions directly specify registers (e.g., ADD RAX, RBX).

  • Implicit Operands: Instructions like MUL or IDIV implicitly use RAX, RDX, or others without needing to specify them.

  • Special Registers: Some GPRs are required by specific instructions or behavior, such as RSP in RET, CALL, and PUSH/POP.

The assembler must detect and interpret these behaviors during parsing and code generation to ensure semantic correctness and binary integrity.
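One way to track these interactions is a small table of implicit operands consulted alongside the parsed explicit ones. The sketch below is an assumption-laden illustration: the table `IMPLICIT` and the function `registers_touched` are hypothetical names, only a handful of mnemonics are listed, and the entries describe the 64-bit operand forms only.

```python
# Hypothetical table of implicit register use for a few 64-bit instruction forms.
_STACK = {"reads": {"RSP"}, "writes": {"RSP"}}
_DIVIDE = {"reads": {"RAX", "RDX"}, "writes": {"RAX", "RDX"}}

IMPLICIT = {
    "MUL": {"reads": {"RAX"}, "writes": {"RAX", "RDX"}},  # RDX:RAX = RAX * operand
    "DIV": _DIVIDE,    # divides RDX:RAX; quotient to RAX, remainder to RDX
    "IDIV": _DIVIDE,
    "PUSH": _STACK,
    "POP": _STACK,
    "CALL": _STACK,
    "RET": _STACK,
}

_NONE = {"reads": set(), "writes": set()}

def registers_touched(mnemonic, explicit_operands):
    """Union of explicit register operands and the implicit ones for the mnemonic."""
    entry = IMPLICIT.get(mnemonic.upper(), _NONE)
    return set(explicit_operands) | entry["reads"] | entry["writes"]

print(sorted(registers_touched("add", ["RAX", "RBX"])))
print(sorted(registers_touched("mul", ["RCX"])))
print(sorted(registers_touched("push", ["R12"])))
```

A table like this lets later passes, such as clobber checks around calls, see that MUL RCX also overwrites RDX even though RDX never appears in the source line.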

7. Summary

General-purpose registers are the cornerstone of instruction-level computation in x86-64. Their expansion to 64-bit width and increased count in the modern architecture resolved significant limitations of earlier x86 designs. A deep understanding of GPR structure, naming, roles, and encoding is critical for building a custom assembler. Their tight integration into the calling convention, stack management, and instruction encoding schemes means that the assembler must handle GPRs with precise awareness of both hardware and software constraints.

This understanding sets the stage for implementing parsing logic, register allocation, operand sizing, and encoding in subsequent chapters of assembler design.

 
