#7 Mastering GAS: A Complete Guide to the GNU Assembler

Article by Ayman Alheraki on January 11 2026 10:37 AM


Series for explaining and teaching the GNU Assembler (GAS) using AT&T syntax. All code is reviewed and tested daily on Fedora Linux 42 with GNU Assembler version 2.44-6.

Basics of Assembling Code with GAS

Overview of the Assembly Process

In the world of low-level programming, understanding the assembly process is foundational to mastering systems programming, embedded software development, and performance optimization. Assembly language is as close to the hardware as you can get while still writing in a human-readable form, offering fine control over the hardware and system resources.

However, since assembly code is not directly executable by the processor, it must undergo a series of transformations before becoming a running program.

The assembly process involves several distinct stages, each playing a crucial role in converting human-readable assembly code into machine code, which can then be executed by the CPU. The core of the assembly process is the translation of assembly instructions (mnemonics) into machine-readable code (binary). These instructions are executed directly by the hardware, making the assembly process one of the most important tasks in systems programming.

The assembly process typically involves the following stages:

Writing Assembly Code
Preprocessing (optional)
Assembly (or Assembling the Code)
Linking
Running the Executable

Each of these stages plays a crucial role in transforming assembly code into an executable program. In this section, we will dive deeper into these steps, their importance, and how each step fits into the overall process, along with a brief look at the tools involved in each step.

Writing Assembly Code

The first step in the assembly process is writing the source code in assembly language. This step is performed by the developer, who writes instructions using mnemonics, which are human-readable representations of machine instructions. Assembly language provides a low-level interface to the hardware, with each instruction corresponding directly to a machine language instruction specific to a given processor architecture.

Key Aspects of Writing Assembly Code:

Mnemonics: Mnemonics are short, easy-to-remember names for machine-level instructions. For example, mov copies data between registers or between a register and memory, and add performs addition. In AT&T syntax, mnemonics are conventionally written in lowercase and may carry a size suffix, as in movl.
Registers: Assembly code makes heavy use of registers, which are small, fast storage locations within the CPU. In AT&T syntax, register names are prefixed with %, immediate values with $, and the source operand comes first, so mov $5, %eax moves the value 5 into the register EAX.
Labels: Labels act as placeholders for specific addresses in the code. They allow for jumps, branches, or function calls. For instance, loop: might be a label that marks the start of a loop, allowing a jump instruction to return control to that point in the code.

Directives: Directives are special instructions for the assembler itself, not for the CPU. These provide the assembler with information about how to process the code. For example, the .text directive indicates that the following code is executable code, and .data indicates that the following section is for data storage.

Comments: Comments explain the code to human readers. They are ignored by the assembler and do not affect the machine code, but they make the code more readable and maintainable. In GAS for x86, # starts a comment that runs to the end of the line, and C-style /* ... */ comments are also accepted.

An example of a simple assembly code snippet might look like this:


    .data
message: .asciz "Hello, World!"

    .text
    .globl _start

_start:
    mov $4, %eax            # syscall number for sys_write
    mov $1, %ebx            # file descriptor (stdout)
    mov $message, %ecx      # message address
    mov $13, %edx           # message length in bytes
    int $0x80               # invoke syscall

    mov $1, %eax            # syscall number for sys_exit
    xor %ebx, %ebx          # exit status 0
    int $0x80               # invoke syscall

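In this example, the code consists of a data section and an executable section. It uses two Linux system calls through the 32-bit int $0x80 interface: sys_write (number 4) to output the message to the screen, and sys_exit (number 1) to end the program with status 0.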

Preprocessing (Optional)

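In some cases, especially when working with macros or conditional assembly, a preprocessing step is involved. Preprocessing transforms the source code before instructions are encoded. It typically involves expanding macros, including other files, and selecting code through conditional assembly based on the target system or build settings. GAS handles its own directives for these tasks (.macro, .include, .if) itself while it reads the source; a separate C preprocessor pass runs only when the file is built through a compiler driver such as gcc, conventionally with a .S (capital S) extension.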

While preprocessing is essential in higher-level languages (such as C or C++) for handling things like header files and macro expansion, it is sometimes used in assembly language to manage code reusability, reduce repetition, and improve portability.

Preprocessing Tasks:

Macro Expansion: A macro is a named sequence of assembly instructions that can be reused throughout the code. Each use of the macro is expanded into the assembly instructions it stands for before those instructions are encoded. For example, every occurrence of PRINT could be replaced with a sequence of assembly instructions for displaying text.
Conditional Assembly: In some cases, it’s necessary to include or exclude portions of code based on specific conditions. For example, different code might be required depending on whether the target machine is a 32-bit or 64-bit system. Preprocessing allows developers to write code that can be conditionally included based on macros or platform-specific definitions.

An example of a macro expansion might look like this:


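.macro PRINT msg, len
    mov $4, %eax            # sys_write
    mov $1, %ebx            # file descriptor (stdout)
    mov $\msg, %ecx         # address of the message (a label)
    mov $\len, %edx         # message length in bytes
    int $0x80
.endm

    PRINT message, 13       # message is the label defined in .data earlier

GAS expands the PRINT invocation in place, substituting the arguments for \msg and \len, and then encodes the resulting instructions that print "Hello, World!" to the screen. The macro takes the label of a string rather than a string literal, because mov needs an address as its operand.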

Preprocessing is an optional but useful step, particularly in larger projects where code reuse and modularity are important.
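Conditional assembly, described above, can be sketched with GAS's own .ifdef/.else/.endif directives and the --defsym command-line option. This is a minimal illustration, not the author's tested code: the file name cond.s and the symbol USE_64 are assumptions made for the example, and the commands assume an x86 build of GNU Binutils that accepts both --32 and --64.

```shell
# Write a source file whose body depends on whether USE_64 is defined.
cat > cond.s <<'EOF'
    .text
    .globl _start
_start:
.ifdef USE_64
    mov $60, %eax           # x86-64 ABI: exit is syscall number 60
    xor %edi, %edi          # exit status 0
    syscall
.else
    mov $1, %eax            # i386 ABI: exit is syscall number 1
    xor %ebx, %ebx          # exit status 0
    int $0x80
.endif
EOF

# Same source, two builds: --defsym defines USE_64 for the first one only.
as --64 --defsym USE_64=1 -o cond64.o cond.s
as --32 -o cond32.o cond.s

# Disassemble both objects to confirm which branch each one contains.
objdump -d cond64.o
objdump -d cond32.o
```

Only one branch is assembled into each object file, so the unused variant costs nothing in the final binary.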

Assembling the Code (Assembly Step)

Once the assembly source code has been written (and optionally preprocessed), the next step is the assembly phase, where the assembler translates the human-readable assembly code into machine-readable binary code. This is the core function of GAS (GNU Assembler).

Tasks Performed by the Assembler:

Syntax Parsing: The assembler reads the assembly code and interprets the mnemonics and directives, mapping them to corresponding machine instructions. For example, mov $1, %eax is encoded as the five bytes b8 01 00 00 00.
Object File Generation: The assembler generates an object file (.o or .obj), which contains machine code (also known as object code). This object file is not yet an executable and may contain references to external functions or symbols that are resolved during the linking phase. The object file also records how code and data are laid out within each section.
Symbol Resolution: The assembler also resolves symbols such as labels, variables, and functions by assigning each one an offset within its section. For example, the label _start: is recorded in the object file's symbol table with the offset at which its code begins. Final memory addresses are assigned later by the linker, and symbols defined in other files are left as undefined references.

Handling Sections: The assembly code is divided into sections such as .text (code), .data (data), and .bss (uninitialized data). Each section of the program is placed in the appropriate area of memory when the program is loaded.

For instance, the command to assemble the code using GAS might look like this:


as -o output.o input.s

Here, input.s is the assembly source file, and output.o is the object file generated by GAS.

Linking

After assembly, the next phase is linking, where the linker combines one or more object files into a complete executable. The linker resolves any external references to symbols (such as function calls and global variables) and ensures that addresses in the code and data are properly assigned.

Tasks Performed by the Linker:

Symbol Resolution: If the object files reference functions or variables defined elsewhere (in other object files or libraries), the linker resolves these references, ensuring that the final executable knows where to find those symbols.
Address Binding: The linker assigns memory addresses to all functions and variables. This includes both code and data sections. If the program uses libraries (like system libraries), the linker ensures that the library code is included and correctly referenced.
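Relocation: Where the assembler could not know a final address, it leaves a placeholder in the machine code and records a relocation entry in the object file. Once the linker has decided where each section will live, it patches these placeholders with the final addresses. This is what allows sections from several object files to be placed one after another without conflict.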

Combining Object Files: The linker can combine multiple object files into a single executable. For example, if your program consists of separate source files that are assembled into multiple object files, the linker will combine them into a single binary.
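The relocation task in the list above can be observed directly by comparing an object file with the linked result. The following is a small sketch under assumptions: reloc.s and the label value are hypothetical names chosen for the example, and the system is assumed to be x86-64 Linux with GNU Binutils.

```shell
# A tiny program that loads the address of a data label.
cat > reloc.s <<'EOF'
    .data
value: .long 42
    .text
    .globl _start
_start:
    mov $value, %ecx        # address of value: unknown until link time
    mov $1, %eax            # sys_exit
    xor %ebx, %ebx          # exit status 0
    int $0x80
EOF

as -o reloc.o reloc.s

# Before linking: the operand of the first mov is all zeros, and a
# relocation entry against .data tells the linker what to patch.
objdump -dr reloc.o

ld -o reloc reloc.o

# After linking: the same operand now holds the address chosen for value.
objdump -d reloc
```

The exact address shown after linking depends on the linker's default layout, so it may differ between systems.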

The linking process can produce an executable file that is ready for execution. A typical command to link the object files into an executable might look like this:


ld -o program output.o

Here, output.o is the object file generated by the assembler, and program is the final executable.

Running the Executable

Once the program has been linked, the final step is to execute it. The operating system loads the executable file into memory, resolves any dynamic library dependencies, and begins executing the program at its entry point. For a program linked directly with ld, this is the _start symbol by default; in higher-level languages, startup code at the entry point eventually calls the main function.

Tasks During Execution:

Loading into Memory: The operating system loads the program into RAM, placing the code and data into their respective locations.
System Call Invocation: During execution, the program might make system calls (e.g., to write to the screen, allocate memory, or interact with other programs). These calls are handled by the operating system.
Execution of Machine Code: The CPU fetches and executes the machine code one instruction at a time, following the control flow defined in the assembly code.

Once the program finishes execution, control returns to the operating system, and resources used by the program are freed.

Tools Involved in the Assembly Process

The assembly process involves several tools that work together to transform assembly code into an executable:

Text Editor: Used for writing the source code. Examples include editors like Vim, Emacs, or IDEs such as VS Code or Sublime Text.
Preprocessor (optional): Expands macros and handles conditional assembly.

Assembler (as for GAS): Converts assembly code into object files.

Linker (ld for GNU): Combines object files into a single executable.

Debugger (optional): Used to debug the program by stepping through machine code or assembly code (e.g., GDB).
Compiler (optional): In some cases, such as when dealing with higher-level languages, the compiler (like GCC) will invoke the assembler to translate assembly code into machine code.
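The optional preprocessor and compiler entries in this list can be tried together by letting gcc drive the build. In this sketch, hello.S, SYS_EXIT, and EXIT_CODE are hypothetical names for illustration. The capital .S extension tells gcc to run the C preprocessor before handing the result to as.

```shell
# A .S file may use C preprocessor directives and C-style comments.
cat > hello.S <<'EOF'
#define SYS_EXIT 1          /* expanded by the C preprocessor, not by GAS */
#define EXIT_CODE 0

    .text
    .globl _start
_start:
    mov $SYS_EXIT, %eax
    mov $EXIT_CODE, %ebx
    int $0x80
EOF

gcc -E hello.S              # preprocess only: print the assembly GAS will receive
gcc -c -o hello.o hello.S   # preprocess, then invoke the assembler
ld -o hello hello.o         # link with ld as before

nm hello.o                  # _start is present; the #define names are gone
```

The macro names never reach the assembler, so they do not appear in the object file's symbol table.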

Conclusion

The assembly process is a crucial part of low-level programming, particularly for systems programming, embedded systems, and performance-critical applications. It involves multiple steps—from writing assembly code, preprocessing, assembling the code, linking, to running the final executable. Each step plays a key role in transforming human-readable assembly into machine code that can be executed by the processor. Understanding this process allows developers to gain deep insights into how software interacts with hardware and how to optimize software for specific hardware platforms. Mastery of the assembly process is an essential skill for any programmer working with low-level languages.