Full Instruction Encoding Examples

Article by Ayman Alheraki on January 11 2026 10:37 AM

1. Introduction

Instruction encoding lies at the heart of assembler design, converting human-readable mnemonics and operands into precise binary machine code executed by the CPU. This appendix presents comprehensive examples of full instruction encodings for the x86-64 architecture, illustrating the layered complexity of opcode prefixes, opcode bytes, ModR/M bytes, SIB bytes, displacement, and immediate fields.

The examples emphasize common instruction forms, highlighting key aspects such as REX prefix usage, operand size overrides, and addressing modes relevant to the 64-bit instruction set.

2. Anatomy of an x86-64 Instruction Encoding

An x86-64 instruction may contain the following components, ordered sequentially:

Legacy prefixes (optional): Segment overrides, operand size override (0x66), address size override (0x67), LOCK, REP/REPE/REPNE.
REX prefix (optional): A one-byte prefix (0x40–0x4F) enabling 64-bit operand size and extended registers.
Opcode bytes: One or more bytes defining the instruction.
ModR/M byte (optional): Specifies operand addressing modes and register codes.
SIB byte (optional): Specifies scale-index-base addressing.
Displacement (optional): 8- or 32-bit signed value for memory addressing. (16-bit displacements belong to 16-bit addressing, which is not available in 64-bit mode.)
Immediate (optional): Literal constant operand.
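The ordering above can be sketched as a small helper that concatenates already-encoded fields. This is only an illustration: the function name `assemble` and its parameters are hypothetical, and the caller is assumed to supply correct field values.

```python
def assemble(opcode, legacy=b"", rex=None, modrm=None, sib=None, disp=b"", imm=b""):
    """Concatenate instruction fields in the order listed above.

    Hypothetical helper: every argument is assumed to be pre-encoded.
    """
    out = bytearray(legacy)      # legacy prefixes come first
    if rex is not None:
        out.append(rex)          # REX sits directly before the opcode
    out += opcode                # one or more opcode bytes
    if modrm is not None:
        out.append(modrm)
    if sib is not None:
        out.append(sib)
    out += disp                  # displacement bytes, already little-endian
    out += imm                   # immediate bytes, already little-endian
    return bytes(out)


# lea rdx, [rax + rbx*4 + 0x10]
lea = assemble(b"\x8d", rex=0x48, modrm=0x54, sib=0x98, disp=b"\x10")
# mov ax, bx (0x66 operand-size override, no REX)
mov16 = assemble(b"\x89", legacy=b"\x66", modrm=0xD8)
```

Keeping the fields separate until the final concatenation makes it easy to check each one against a manual breakdown.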

3. Instruction Encoding Examples

Example 1: `MOV RAX, RBX`

Assembly: mov rax, rbx
Description: Move the 64-bit value from RBX to RAX.
Encoding steps:
- No legacy prefix needed.
- REX prefix with W=1 selects 64-bit operand size; R and B stay 0 because neither register is R8–R15.
- Opcode: 0x89 (MOV r/m64, r64).
- ModR/M byte encodes register-to-register addressing.
Encoding bytes (hex): 48 89 D8
Breakdown:
- 48: REX prefix (01001000) — W=1 (64-bit), R=0, X=0, B=0
- 89: Opcode for MOV r/m64, r64
- D8: ModR/M byte (11 011 000) — Mod=11 (register), Reg=011 (RBX), R/M=000 (RAX)
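The breakdown above can be reproduced with a minimal sketch. The table `REG64` and the function names are assumptions made for illustration, and only the eight legacy registers are handled, so REX.R and REX.B are always 0.

```python
# Register numbers for the eight non-extended 64-bit registers.
REG64 = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
         "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}


def modrm(mod, reg, rm):
    """Pack the three ModR/M fields into one byte."""
    return (mod << 6) | ((reg & 7) << 3) | (rm & 7)


def encode_mov_rr64(dst, src):
    """Encode MOV r/m64, r64 (opcode 0x89) for two registers."""
    rex = 0x48  # 0100 W=1 R=0 X=0 B=0
    # With opcode 0x89 the source goes in Reg and the destination in R/M.
    return bytes([rex, 0x89, modrm(0b11, REG64[src], REG64[dst])])
```

For instance, `encode_mov_rr64("rax", "rbx")` yields the bytes `48 89 D8` shown in the example.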

Example 2: `ADD BYTE PTR [RCX+4], 0x12`

Assembly: add byte ptr [rcx+4], 0x12
Description: Add immediate 8-bit value 0x12 to the byte at memory address RCX + 4.
Encoding steps:
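- No operand-size prefix: the byte operand size is selected by the opcode itself.
- No REX prefix needed: the operand is 8 bits wide and RCX is not an extended register.
- Opcode: 0x80 /0 (ADD r/m8, imm8); the ModR/M Reg field carries the opcode extension 000 that selects ADD.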
- ModR/M byte for memory operand with displacement.
- 8-bit displacement follows.
Encoding bytes (hex): 80 41 04 12
Breakdown:
- 80: Opcode for ADD r/m8, imm8
- 41: ModR/M byte (01 000 001) — Mod=01 (8-bit disp), Reg=000 (ADD), R/M=001 (RCX)
- 04: 8-bit displacement (4)
- 12: Immediate 8-bit value (0x12)
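A minimal sketch of this form follows. The function name is hypothetical, the base register is passed as its number (0 to 7), and the displacement is assumed to fit in a signed byte.

```python
import struct


def encode_add_m8_imm8(base, disp8, imm8):
    """Encode ADD BYTE PTR [base + disp8], imm8 (opcode 0x80 /0).

    Sketch only: base is a register number 0-7 other than 4, because
    R/M=100 would mean "SIB byte follows" instead of RSP.
    """
    if base == 4:
        raise ValueError("RSP as base needs a SIB byte, not handled here")
    modrm = (0b01 << 6) | (0b000 << 3) | base   # Mod=01, Reg=/0 (ADD)
    return (bytes([0x80, modrm])
            + struct.pack("<b", disp8)          # signed 8-bit displacement
            + struct.pack("<B", imm8 & 0xFF))   # 8-bit immediate
```

Calling `encode_add_m8_imm8(1, 4, 0x12)` reproduces `80 41 04 12`.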

Example 3: `LEA RDX, [RAX + RBX*4 + 0x10]`

Assembly: lea rdx, [rax + rbx*4 + 0x10]
Description: Compute the address RAX + RBX*4 + 0x10 and store it in RDX; no memory is accessed.
Encoding steps:
- REX prefix required with W=1 for the 64-bit destination; R, X, and B stay 0 because no extended registers are used.
- Opcode: 0x8D (LEA r64, m).
- ModR/M and SIB bytes specify addressing.
- 8-bit displacement follows (0x10 fits in a signed byte, so Mod=01).
Encoding bytes (hex): 48 8D 54 98 10
Breakdown:
- 48: REX prefix (W=1)
- 8D: LEA opcode
- 54: ModR/M byte (01 010 100) — Mod=01 (8-bit disp), Reg=010 (RDX), R/M=100 (SIB follows)
- 98: SIB byte (10 011 000) — Scale=2 (×4), Index=011 (RBX), Base=000 (RAX)
- 10: 8-bit displacement (0x10)
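The same bytes can be derived with a short sketch. The function name and argument order are assumptions, registers are given as numbers 0 to 7, and only the 8-bit displacement form is covered.

```python
import struct

SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}


def encode_lea_sib_disp8(dst, base, index, scale, disp8):
    """Encode LEA dst, [base + index*scale + disp8] for registers 0-7."""
    if index == 4:
        raise ValueError("RSP cannot be used as an index register")
    rex = 0x48                                   # W=1, R=X=B=0
    modrm = (0b01 << 6) | (dst << 3) | 0b100     # R/M=100: SIB follows
    sib = (SCALE_BITS[scale] << 6) | (index << 3) | base
    return bytes([rex, 0x8D, modrm, sib]) + struct.pack("<b", disp8)
```

With RDX=2, RAX=0, and RBX=3, `encode_lea_sib_disp8(2, 0, 3, 4, 0x10)` gives `48 8D 54 98 10`.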

Example 4: `JMP 0x12345678`

Assembly: jmp 0x12345678
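Description: Near jump to a target given as a signed 32-bit displacement relative to the address of the next instruction. The bytes below assume the displacement itself is 0x12345678; when the operand is an absolute target address, the assembler computes rel32 = target − (address of the JMP + 5).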
Encoding steps:
- Opcode: 0xE9 (JMP rel32).
- 32-bit relative displacement follows.
Encoding bytes (hex): E9 78 56 34 12
Breakdown:
- E9: JMP opcode with 32-bit relative displacement
- 78 56 34 12: 32-bit little-endian displacement (0x12345678)
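Because the displacement is relative to the end of the 5-byte instruction, an assembler has to know where the JMP itself is placed. A minimal sketch, with a hypothetical function name and both addresses supplied by the caller:

```python
import struct


def encode_jmp_rel32(instr_addr, target):
    """Encode JMP rel32 (opcode 0xE9) located at instr_addr."""
    rel = target - (instr_addr + 5)   # 5 = 1 opcode byte + 4 displacement bytes
    if not -2**31 <= rel < 2**31:
        raise ValueError("target out of rel32 range")
    return b"\xE9" + struct.pack("<i", rel)   # little-endian signed 32-bit
```

A jump placed at 0x1000 that targets itself encodes as `E9 FB FF FF FF`, since the displacement is −5.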

4. Notes on Prefix Usage

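The REX prefix is required to access the extended registers (R8–R15) and the byte registers SPL, BPL, SIL, and DIL. REX.W=1 selects 64-bit operand size for most instructions; a few, such as near JMP, CALL, PUSH, and POP, default to 64 bits in 64-bit mode.
Legacy prefixes such as 0x66 (16-bit operand-size override) must come before the REX prefix, and the REX prefix must immediately precede the opcode. When REX.W=1, the 0x66 operand-size override has no effect on operand size.
The assembler must emit prefixes in this order and omit those that have no effect, to avoid illegal or redundant sequences.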
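The prefix rules can be illustrated with a sketch that emits 0x66 and REX only when needed. The names `rex` and `encode_mov_rr` are hypothetical; registers are numbered 0 to 15, and only 16-, 32-, and 64-bit register-to-register MOV (opcode 0x89) is covered.

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX byte: 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def encode_mov_rr(dst, src, size=64):
    """Encode MOV dst, src between registers 0-15 using opcode 0x89."""
    if size not in (16, 32, 64):
        raise ValueError("only 16-, 32- and 64-bit operands are handled")
    out = bytearray()
    if size == 16:
        out.append(0x66)              # legacy prefix goes before REX
    w = 1 if size == 64 else 0
    r, b = src >> 3, dst >> 3         # high register bits move into REX.R / REX.B
    if w or r or b:
        out.append(rex(w, r, 0, b))   # emit REX only when some bit is set
    out += bytes([0x89, 0xC0 | ((src & 7) << 3) | (dst & 7)])
    return bytes(out)
```

For example, `mov r8, rax` becomes `49 89 C0`, while `mov eax, ebx` needs no prefix at all and encodes as `89 D8`.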

5. Complex Addressing Modes

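Scaled-index addressing requires a correctly computed SIB byte, laid out as scale (bits 7–6), index (bits 5–3), and base (bits 2–0):

Scale field: 00=×1, 01=×2, 10=×4, 11=×8
Index and Base fields: register numbers or special values. Index=100 means no index (RSP cannot be an index register); Base=101 with Mod=00 means no base register, and a 32-bit displacement follows.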

Assemblers must implement logic to encode SIB bytes correctly for complex addressing, considering displacement sizes.
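A sketch of the SIB calculation, including the "no index" case that arises when RSP is the base, might look like this. The helper names are assumptions, and registers are limited to numbers 0 to 7 so that REX.X and REX.B are not involved.

```python
SCALE_BITS = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}
NO_INDEX = 0b100   # index field value meaning "no index register"


def needs_sib(base, index=None):
    """A SIB byte is needed for any index, or when the base is RSP (4)."""
    return index is not None or base == 4


def sib_byte(base, index=None, scale=1):
    """Pack scale/index/base into a SIB byte for registers 0-7.

    Note: base=5 (RBP) is only valid here with Mod=01 or Mod=10;
    with Mod=00 that value means "no base, disp32 follows".
    """
    if index is None:
        return (NO_INDEX << 3) | base            # scale bits are ignored
    if index == 4:
        raise ValueError("RSP cannot be used as an index register")
    return (SCALE_BITS[scale] << 6) | (index << 3) | base
```

`sib_byte(0, 3, 4)` returns 0x98, matching Example 3, and `sib_byte(4)` returns 0x24, the SIB byte used for plain `[rsp + disp]` operands.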

6. Immediate and Displacement Encoding

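Immediate sizes depend on the instruction: 8, 16, or 32 bits in most cases. A full 64-bit immediate is available only with MOV r64, imm64; other 64-bit instructions take a 32-bit immediate that is sign-extended to 64 bits.
Displacements are 8 or 32 bits in 64-bit mode; the assembler should choose the 8-bit form whenever the value fits in a signed byte, to shorten the instruction.
Displacements are sign-extended to the address size, and all multi-byte fields are stored in little-endian order.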
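Displacement-size selection can be sketched as follows. The function name and return shape are assumptions; `base_low3` is the low three bits of the base register number, needed because a zero displacement cannot be dropped when those bits are 101 (RBP or R13).

```python
import struct


def encode_disp(value, base_low3):
    """Return (mod_bits, displacement_bytes) using the shortest form."""
    if value == 0 and base_low3 != 0b101:
        return 0b00, b""                          # no displacement at all
    if -128 <= value <= 127:
        return 0b01, struct.pack("<b", value)     # disp8, sign-extended by the CPU
    if -2**31 <= value < 2**31:
        return 0b10, struct.pack("<i", value)     # disp32, little-endian
    raise ValueError("displacement does not fit in 32 bits")
```

For instance, a displacement of 0x1000 selects Mod=10 and the bytes `00 10 00 00`, while −129 no longer fits in a byte and becomes `7F FF FF FF`.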

7. Instruction Length and Alignment

Instruction lengths are variable, from 1 to 15 bytes; the shortest encoding is generally preferred to reduce code size.
Some instructions allow multiple valid encodings; assemblers usually pick the shortest form, but may choose a longer one when it suits other constraints, such as padding code to an alignment boundary.
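Choosing among several valid encodings can be sketched for ADD r32, imm, which has a sign-extended imm8 form (0x83 /0), a general imm32 form (0x81 /0), and a short EAX-only form (0x05). The function name is hypothetical and registers are limited to numbers 0 to 7.

```python
import struct


def encode_add_r32_imm(reg, imm):
    """Return the shortest encoding of ADD r32, imm for registers 0-7."""
    if not -2**31 <= imm < 2**32:
        raise ValueError("immediate does not fit in 32 bits")
    imm &= 0xFFFFFFFF
    if imm >= 2**31:
        imm -= 2**32                   # normalise to a signed 32-bit value
    modrm = 0xC0 | reg                 # Mod=11, Reg=/0 (ADD), R/M=reg
    imm32 = struct.pack("<i", imm)
    candidates = [bytes([0x81, modrm]) + imm32]             # 6 bytes
    if -128 <= imm <= 127:
        candidates.append(bytes([0x83, modrm]) + struct.pack("<b", imm))  # 3 bytes
    if reg == 0:
        candidates.append(b"\x05" + imm32)                  # 5 bytes, EAX only
    return min(candidates, key=len)
```

`add ecx, 5` comes out as `83 C1 05`, whereas `add eax, 0x1000` uses the EAX short form `05 00 10 00 00`. Collecting all candidates and taking the minimum keeps the selection policy in one place, so a different policy (for example, preferring a longer form for padding) only changes the final line.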

8. Summary

This appendix has presented detailed real-world examples of instruction encodings in x86-64, highlighting how mnemonic instructions translate to machine code byte sequences. Mastery of these examples aids assembler developers in implementing accurate encoding routines, essential for producing correct and optimized machine code.