Article by Ayman Alheraki on May 12 2026 12:59 PM
Example instruction:
    mov r10, [r8 + r9*4 + 16]
Machine code (conceptual layout):
    REX  OPCODE  ModR/M  SIB  DISP
    4F   8B      54      88   10
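The byte layout above can be reproduced by packing the fields by hand. The following is a minimal Python sketch, not a general assembler: `encode_mov_load` is a hypothetical helper that covers only the `mov r64, [base + index*scale + disp8]` form (opcode 8B with MOD = 01 and a SIB byte), and registers are passed as hardware numbers 0 to 15 (rax = 0, rcx = 1, ..., r15 = 15).

```python
def encode_mov_load(dst: int, base: int, index: int, scale: int, disp8: int) -> bytes:
    """Encode `mov r64, [base + index*scale + disp8]` (hypothetical helper).

    Only the MOD=01 + SIB form of opcode 8B is handled.
    """
    if index == 4:
        raise ValueError("rsp (register 4) cannot be used as an index")
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    if not -128 <= disp8 <= 127:
        raise ValueError("displacement does not fit in 8 bits")

    scale_bits = {1: 0b00, 2: 0b01, 4: 0b10, 8: 0b11}[scale]

    # REX = 0100WRXB; W is always 1 here (64-bit operand size).
    # R, X and B carry bit 3 of the destination, index and base numbers.
    rex = 0x48 | ((dst >> 3) << 2) | ((index >> 3) << 1) | (base >> 3)

    # ModR/M = MOD(01) | REG(low 3 bits of dst) | R/M(100 = SIB follows)
    modrm = (0b01 << 6) | ((dst & 7) << 3) | 0b100

    # SIB = SCALE | INDEX(low 3 bits) | BASE(low 3 bits)
    sib = (scale_bits << 6) | ((index & 7) << 3) | (base & 7)

    return bytes([rex, 0x8B, modrm, sib, disp8 & 0xFF])


# r10 = 10, r8 = 8, r9 = 9
print(encode_mov_load(10, 8, 9, 4, 16).hex(" "))  # 4f 8b 54 88 10
```

Because r10, r9 and r8 are all extended registers, REX.R, REX.X and REX.B must all be set, which gives a prefix of 4F. Calling the same helper with base 0 (rax) and index 1 (rcx) produces 4C 8B 54 88 10, which is `mov r10, [rax + rcx*4 + 16]`.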
ROOT: RAW INSTRUCTION BYTES
│
├── PREFIX PARSE STAGE
│   │
│   ├── Legacy Prefixes (optional)
│   │   ├─ F0 (LOCK)
│   │   ├─ F2 (REPNE)
│   │   └─ F3 (REP)
│   │
│   ├── Segment Override (optional)
│   │   └─ CS / SS / DS / ES / FS / GS
│   │
│   ├── Operand-size override (66h)
│   │
│   └── Address-size override (67h)
│
├── REX PREFIX (x86-64 ONLY)
│   │
│   ├── Format: 0100WRXB
│   │
│   ├── W → 64-bit operand size
│   ├── R → extends ModR/M REG field
│   ├── X → extends SIB INDEX field
│   ├── B → extends ModR/M R/M or SIB BASE
│   │
│   └── OUTPUT:
│       extended register namespace (R8–R15 enabled)
│
├── OPCODE DECODING
│   │
│   ├── Single-byte opcode (e.g. 8B)
│   ├── Two-byte opcode (0F xx)
│   ├── Three-byte opcode (0F 38 / 0F 3A)
│   │
│   └── CLASSIFICATION:
│       mov, add, sub, lea, etc.
│
├── ModR/M BYTE PARSE
│   │
│   ├── Format:
│   │   7 6 | 5 4 3 | 2 1 0
│   │   MOD |  REG  |  R/M
│   │
│   ├── MOD FIELD
│   │   00 → memory, no displacement
│   │        (except R/M = 101 or SIB BASE = 101, which take a 32-bit displacement)
│   │   01 → memory + 8-bit displacement
│   │   10 → memory + 32-bit displacement
│   │   11 → register-direct
│   │
│   ├── REG FIELD
│   │   → register operand (destination or source, depending on the opcode)
│   │     or opcode extension
│   │
│   └── R/M FIELD
│       → register OR SIB trigger
│
├── SIB BYTE (IF R/M = 100 AND MOD ≠ 11)
│   │
│   ├── Format:
│   │   SCALE | INDEX | BASE
│   │
│   ├── SCALE:
│   │   00 → *1
│   │   01 → *2
│   │   10 → *4
│   │   11 → *8
│   │
│   ├── INDEX:
│   │   register index (extended by REX.X)
│   │
│   └── BASE:
│       base register (extended by REX.B)
│
├── DISPLACEMENT (optional)
│   │
│   ├── 8-bit signed
│   ├── 32-bit signed
│   └── used for memory offset
│
├── IMMEDIATE (optional)
│   │
│   ├── 8-bit
│   ├── 16-bit
│   ├── 32-bit
│   └── 64-bit (MOV r64, imm64 only)
│
└── FINAL SEMANTIC DECODE
    │
    ├── Register resolution:
    │   FINAL = (REX bit << 3) + 3-bit field
    │
    ├── Effective Address Computation:
    │   [BASE + INDEX * SCALE + DISP]
    │
    └── Instruction Execution Mapping:
        CPU micro-ops generated
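The bit-field formats in the tree above (REX as 0100WRXB, ModR/M as MOD | REG | R/M, SIB as SCALE | INDEX | BASE) come down to a few shifts and masks. A minimal Python sketch follows; the tuple types and the `split_*` function names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Rex(NamedTuple):
    w: int  # 1 = 64-bit operand size
    r: int  # extends ModR/M REG
    x: int  # extends SIB INDEX
    b: int  # extends ModR/M R/M or SIB BASE


class ModRM(NamedTuple):
    mod: int  # bits 7..6
    reg: int  # bits 5..3
    rm: int   # bits 2..0


class Sib(NamedTuple):
    scale: int  # bits 7..6, the multiplier is 1 << scale
    index: int  # bits 5..3
    base: int   # bits 2..0


def split_rex(byte: int) -> Rex:
    """Split a REX prefix (0100WRXB) into its four flag bits."""
    if byte >> 4 != 0b0100:
        raise ValueError(f"{byte:#04x} is not a REX prefix")
    return Rex((byte >> 3) & 1, (byte >> 2) & 1, (byte >> 1) & 1, byte & 1)


def split_modrm(byte: int) -> ModRM:
    """Split a ModR/M byte into MOD, REG and R/M."""
    return ModRM(byte >> 6, (byte >> 3) & 7, byte & 7)


def split_sib(byte: int) -> Sib:
    """Split a SIB byte into SCALE, INDEX and BASE."""
    return Sib(byte >> 6, (byte >> 3) & 7, byte & 7)


print(split_rex(0x4F))    # Rex(w=1, r=1, x=1, b=1)
print(split_modrm(0x54))  # ModRM(mod=1, reg=2, rm=4)
print(split_sib(0x88))    # Sib(scale=2, index=1, base=0)
```

The three-bit fields returned here are still "raw": they only become register numbers once the matching REX bit is prepended.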
Instruction:
    mov r10, [r8 + r9*4 + 16]
    4F 8B 54 88 10
Instruction
│
├── Prefixes
│   └── REX = 4F (0100 1111)
│       ├── W = 1 (64-bit operand size)
│       ├── R = 1 (extends REG → R10)
│       ├── X = 1 (extends INDEX → R9)
│       └── B = 1 (extends BASE → R8)
│
├── Opcode
│   └── 8B
│       → MOV r64, r/m64
│
├── ModR/M = 54 (01 010 100)
│   ├── MOD = 01 → memory + 8-bit displacement
│   ├── REG = 010 → R10 (with REX.R = 1)
│   └── R/M = 100 → SIB required
│
├── SIB = 88 (10 001 000)
│   ├── SCALE = 10 → *4
│   ├── INDEX = 001 → R9 (with REX.X = 1)
│   └── BASE = 000 → R8 (with REX.B = 1)
│
├── Displacement = 10h (16)
│
└── Effective Address
    → [r8 + r9*4 + 16]
REGISTER FIELD RESOLUTION
│
├── Input: 3-bit register field
│
├── Check REX prefix
│   │
│   ├── REX.R → extends REG field
│   ├── REX.X → extends INDEX field
│   └── REX.B → extends BASE / R/M field
│
└── Final register index:
    (REX_bit << 3) | reg_bits
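The resolution rule `(REX_bit << 3) | reg_bits` can be checked directly against the 64-bit register numbering. A minimal Python sketch, where `REG64` and `resolve` are hypothetical names introduced here:

```python
# 64-bit general-purpose registers in hardware-number order (0..15).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + [
    f"r{n}" for n in range(8, 16)
]


def resolve(rex_bit: int, field: int) -> str:
    """Combine one REX extension bit with a 3-bit field into a register name."""
    return REG64[((rex_bit & 1) << 3) | (field & 7)]


# Fields from the example, with their REX bits set:
print(resolve(1, 0b010))  # r10 (REG with REX.R)
print(resolve(1, 0b001))  # r9  (INDEX with REX.X)
print(resolve(1, 0b000))  # r8  (BASE with REX.B)

# The same 3-bit fields with the REX bits clear:
print(resolve(0, 0b001))  # rcx
print(resolve(0, 0b000))  # rax
```

The same three-bit pattern therefore names two different registers, and only the REX bit tells them apart.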
EFFECTIVE ADDRESS CALCULATION
│
├── BASE REGISTER
│   └── from ModR/M R/M or SIB.BASE
│
├── INDEX REGISTER
│   └── from SIB.INDEX
│
├── SCALE FACTOR
│   └── 1 / 2 / 4 / 8
│
├── DISPLACEMENT
│   └── signed constant offset (8-bit or 32-bit, sign-extended)
│
└── FINAL ADDRESS:
    BASE + INDEX * SCALE + DISP
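To make the formula BASE + INDEX * SCALE + DISP concrete, here is a minimal Python sketch of the arithmetic for 64-bit addressing. The function names and the sample register values are assumptions made for illustration; the displacement is passed as the raw encoded byte or dword and sign-extended before the addition.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def effective_address(base: int, index: int, scale_bits: int,
                      disp: int, disp_bits: int = 8) -> int:
    """Compute base + index * 2**scale_bits + sign-extended disp, modulo 2**64."""
    return (base + (index << scale_bits) + sign_extend(disp, disp_bits)) & MASK64


# Sample values (assumed): r8 = 0x1000, r9 = 3, SCALE = 10b (*4), disp8 = 10h
print(hex(effective_address(0x1000, 3, 0b10, 0x10)))  # 0x101c

# A disp8 of F0h is -16, so the address moves backwards
print(hex(effective_address(0x1000, 0, 0b00, 0xF0)))  # 0xff0
```

With r8 = 0x1000 and r9 = 3, the example instruction would read eight bytes from 0x1000 + 3*4 + 16 = 0x101C.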
FETCH
  ↓
PRE-DECODE
  ↓
PREFIX ANALYSIS
  ↓
REX EXPANSION UNIT
  ↓
OPCODE DECODER (ID stage)
  ↓
ModR/M PARSER
  ↓
SIB ENGINE
  ↓
MICRO-OP GENERATION
  ↓
REGISTER RENAMING
  ↓
EXECUTION UNITS
This structure exists for:
backward compatibility (x86 legacy preserved)
minimal encoding changes
efficient hardware decoding
modular extension (REX only adds one bit to each register field; the ModR/M and SIB layouts are unchanged)
Think of decoding like layers:
Layer 1 → Prefixes (REX, legacy)
Layer 2 → Opcode (what instruction)
Layer 3 → ModR/M (who is involved)
Layer 4 → SIB (how memory is computed)
Layer 5 → Displacement (offset)
Layer 6 → Immediate (constant value)
Layer 7 → Execution (CPU micro-ops)
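The layers can be walked in order by a decoder that consumes one byte group at a time. The sketch below is a deliberately narrow illustration in Python: `decode_mov_load` is a hypothetical function that understands only opcode 8B with REX.W set (MOV r64, r/m64), has no immediate layer because this opcode takes none, and rejects the RIP-relative and no-base forms rather than decoding them.

```python
LEGACY_PREFIXES = {0xF0, 0xF2, 0xF3, 0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65, 0x66, 0x67}
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"] + [
    f"r{n}" for n in range(8, 16)
]


def decode_mov_load(code: bytes) -> tuple[str, int]:
    """Decode one `REX.W 8B /r` instruction; return (text, length in bytes)."""
    pos = 0

    # Layer 1: skip legacy prefixes, then read an optional REX prefix.
    while code[pos] in LEGACY_PREFIXES:
        pos += 1
    rex = 0
    if code[pos] >> 4 == 0b0100:
        rex = code[pos]
        pos += 1
    w, r, x, b = (rex >> 3) & 1, (rex >> 2) & 1, (rex >> 1) & 1, rex & 1

    # Layer 2: opcode.
    if code[pos] != 0x8B or not w:
        raise NotImplementedError("this sketch handles only REX.W + 8B")
    pos += 1

    # Layer 3: ModR/M.
    modrm = code[pos]
    pos += 1
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    dst = REG64[(r << 3) | reg]
    if mod == 0b11:  # register-direct: no SIB, no displacement
        return f"mov {dst}, {REG64[(b << 3) | rm]}", pos

    # Layer 4: SIB, present only when R/M = 100.
    if rm == 0b100:
        sib = code[pos]
        pos += 1
        scale, idx, base = sib >> 6, (sib >> 3) & 7, sib & 7
        if mod == 0b00 and base == 0b101:
            raise NotImplementedError("no-base disp32 form not handled")
        parts = [REG64[(b << 3) | base]]
        index_num = (x << 3) | idx
        if index_num != 4:  # index number 4 means "no index"
            parts.append(f"{REG64[index_num]}*{1 << scale}")
    else:
        if mod == 0b00 and rm == 0b101:
            raise NotImplementedError("RIP-relative form not handled")
        parts = [REG64[(b << 3) | rm]]

    # Layer 5: displacement, sized by MOD.
    size = {0b00: 0, 0b01: 1, 0b10: 4}[mod]
    disp = int.from_bytes(code[pos:pos + size], "little", signed=True)
    pos += size

    text = " + ".join(parts)
    if disp:
        text += f" {'+' if disp > 0 else '-'} {abs(disp)}"
    return f"mov {dst}, [{text}]", pos


print(decode_mov_load(bytes.fromhex("4F8B548810")))
# ('mov r10, [r8 + r9*4 + 16]', 5)
```

Returning the consumed length alongside the text reflects the fact that an x86 instruction's size is only known once every earlier layer has been parsed.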