Article by Ayman Alheraki on December 3, 2024, 12:57 PM
In this article, we will dive deeper into raw image processing in C++ while relying on external libraries as little as possible. We will focus on loading and processing the BMP (Bitmap) and PNG image formats. BMP is a relatively simple, usually uncompressed format, while PNG (Portable Network Graphics) is a more complex format with lossless compression. We will discuss how to load both formats, access the pixel data, manipulate it, and print it for inspection.
Raw Image Data: Images are typically represented as a matrix of pixels, each containing values like Red, Green, and Blue (RGB). Our goal is to manipulate these pixel values directly.
BMP Format: BMP files have a simple structure and usually store pixel data uncompressed; the format also allows simple run-length encoding for some bit depths.
PNG Format: PNG is a compressed image format. While it uses lossless compression, it is more complex to handle than BMP, as we need to decode the image to access the pixel data.
We will explain how to load, process, and manipulate images from both formats.
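To make the pixel-matrix idea concrete, here is a minimal sketch of an in-memory image. The names Pixel, Image, and at are illustrative assumptions and are not used elsewhere in this article; the pixels are stored row by row in one contiguous vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel and image types, for illustration only.
struct Pixel {
    std::uint8_t red = 0;
    std::uint8_t green = 0;
    std::uint8_t blue = 0;
};

struct Image {
    int width = 0;
    int height = 0;
    std::vector<Pixel> pixels;  // row-major: row 0 first, left to right

    Image(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h)) {}

    // Returns the pixel at column x, row y (no bounds checking).
    Pixel& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};
```

With this layout, the pixel at column x and row y sits at index y * width + x, which is the same indexing the loaders in this article use.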
As covered earlier, the BMP format has the following components:
Bitmap Header: Contains metadata about the image (e.g., dimensions, color depth, and file size).
DIB Header: Contains additional information about the image (e.g., width, height, and compression).
Pixel Data: The actual image data. In a 24-bit BMP, each pixel is represented by three bytes stored in blue, green, red order, and each row is padded to a multiple of 4 bytes.
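The row layout matters when indexing into the pixel array: BMP rows are padded to a multiple of 4 bytes and are normally stored bottom-up. The following is a small sketch of the arithmetic, assuming an uncompressed 24-bit image; the function names bmpRowStride and bmpPixelOffset are hypothetical helpers, not part of any library.

```cpp
#include <cstddef>

// Bytes occupied by one row of a BMP file, including the padding that
// rounds each row up to a multiple of 4 bytes.
std::size_t bmpRowStride(std::size_t width, std::size_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// Byte offset of pixel (x, y) inside the pixel array of a bottom-up
// 24-bit BMP, where y = 0 is the top row of the displayed image.
std::size_t bmpPixelOffset(std::size_t x, std::size_t y,
                           std::size_t width, std::size_t height) {
    const std::size_t stride = bmpRowStride(width, 24);
    const std::size_t fileRow = height - 1 - y;  // bottom row is stored first
    return fileRow * stride + x * 3;
}
```

For example, a 3-pixel-wide row needs 9 bytes of pixel data but occupies 12 bytes in the file, so a loader that ignores the padding will read shifted colors on every row after the first.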
We have already discussed how to load a BMP image in the previous example. Here is a recap of the code for loading a BMP image:
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Ensure that structure members are not padded
#pragma pack(push, 1)

// Define BMP Header Structure
struct BMPHeader {
    char header[2];       // 'BM' for Bitmap
    uint32_t fileSize;    // Size of the entire file
    uint32_t reserved;    // Reserved (usually 0)
    uint32_t dataOffset;  // Offset to the pixel data
};

struct DIBHeader {
    uint32_t headerSize;       // DIB Header size (40 bytes)
    int32_t width;             // Image width
    int32_t height;            // Image height (negative for top-down images)
    uint16_t colorPlanes;      // Color planes (usually 1)
    uint16_t bitsPerPixel;     // Bits per pixel (24 for RGB)
    uint32_t compression;      // Compression type (0 for none)
    uint32_t imageSize;        // Image size (may be 0 for uncompressed)
    int32_t xRes;              // Horizontal resolution (pixels per meter)
    int32_t yRes;              // Vertical resolution (pixels per meter)
    uint32_t colors;           // Number of colors
    uint32_t importantColors;  // Important colors
};

// Define a pixel structure for RGB (the file stores bytes as blue, green, red)
struct RGBPixel {
    uint8_t blue;
    uint8_t green;
    uint8_t red;
};

// Restore previous packing alignment
#pragma pack(pop)

void loadBMP(const std::string& filename) {
    std::ifstream file(filename, std::ios::binary);
    if (!file.is_open()) {
        std::cerr << "Error opening file: " << filename << std::endl;
        return;
    }

    BMPHeader bmpHeader;
    DIBHeader dibHeader;

    // Read BMP Header
    file.read(reinterpret_cast<char*>(&bmpHeader), sizeof(bmpHeader));
    if (!file || bmpHeader.header[0] != 'B' || bmpHeader.header[1] != 'M') {
        std::cerr << "Not a valid BMP file" << std::endl;
        return;
    }

    // Read DIB Header
    file.read(reinterpret_cast<char*>(&dibHeader), sizeof(dibHeader));

    // Check if the image is uncompressed 24-bit
    if (!file || dibHeader.bitsPerPixel != 24 || dibHeader.compression != 0) {
        std::cerr << "Only uncompressed 24-bit BMP files are supported." << std::endl;
        return;
    }

    // A negative height means the rows are stored top-down instead of bottom-up
    const bool topDown = dibHeader.height < 0;
    const int width = dibHeader.width;
    const int height = topDown ? -dibHeader.height : dibHeader.height;
    if (width <= 0 || height <= 0) {
        std::cerr << "Invalid image dimensions" << std::endl;
        return;
    }

    // Each row in the file is padded to a multiple of 4 bytes
    const int rowPadding = (4 - (width * 3) % 4) % 4;

    // Move file pointer to pixel data location
    file.seekg(bmpHeader.dataOffset, std::ios::beg);

    // Prepare to read pixel data
    std::vector<RGBPixel> pixels(static_cast<std::size_t>(width) * height);
    for (int row = 0; row < height; ++row) {
        // Bottom-up files store the last image row first
        const int y = topDown ? row : height - 1 - row;
        for (int x = 0; x < width; ++x) {
            file.read(reinterpret_cast<char*>(&pixels[static_cast<std::size_t>(y) * width + x]), sizeof(RGBPixel));
        }
        file.ignore(rowPadding);  // Skip the padding bytes at the end of the row
    }
    if (!file) {
        std::cerr << "Unexpected end of file while reading pixel data" << std::endl;
        return;
    }

    // Now we have the pixel data in the `pixels` array.
    // Let's print out some pixel values (RGB)
    for (int y = 0; y < 10 && y < height; ++y) {      // Limit to first 10 rows
        for (int x = 0; x < 10 && x < width; ++x) {   // Limit to first 10 columns
            RGBPixel& pixel = pixels[static_cast<std::size_t>(y) * width + x];
            std::cout << "Pixel at (" << x << "," << y << ") - R: " << (int)pixel.red
                      << " G: " << (int)pixel.green << " B: " << (int)pixel.blue << std::endl;
        }
    }

    file.close();
}

int main() {
    std::string filename = "sample.bmp";  // Replace with your BMP file path
    loadBMP(filename);
    return 0;
}
This code reads the BMP file, extracts the pixel data, and stores it in a vector of RGBPixel structures. The pixel data is then available for manipulation.
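Reading the headers by copying file bytes straight into structs depends on compiler-specific packing and on a little-endian host. As a hedged alternative, the sketch below decodes the same fields byte by byte from a buffer that already holds the file contents. BMPInfo, parseBMPInfo, readLE16, and readLE32 are hypothetical names introduced here for illustration; the byte offsets follow the 14-byte file header and 40-byte DIB header described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 16-bit value starting at data[pos].
std::uint16_t readLE16(const std::vector<std::uint8_t>& data, std::size_t pos) {
    return static_cast<std::uint16_t>(data[pos] | (data[pos + 1] << 8));
}

// Reads a little-endian 32-bit value starting at data[pos].
std::uint32_t readLE32(const std::vector<std::uint8_t>& data, std::size_t pos) {
    return static_cast<std::uint32_t>(data[pos]) |
           (static_cast<std::uint32_t>(data[pos + 1]) << 8) |
           (static_cast<std::uint32_t>(data[pos + 2]) << 16) |
           (static_cast<std::uint32_t>(data[pos + 3]) << 24);
}

// Hypothetical summary of the header fields a loader needs.
struct BMPInfo {
    bool valid = false;
    std::uint32_t dataOffset = 0;
    std::int32_t width = 0;
    std::int32_t height = 0;
    std::uint16_t bitsPerPixel = 0;
    std::uint32_t compression = 0;
};

// Parses the 14-byte file header and the 40-byte DIB header from a buffer.
BMPInfo parseBMPInfo(const std::vector<std::uint8_t>& data) {
    BMPInfo info;
    if (data.size() < 54 || data[0] != 'B' || data[1] != 'M') {
        return info;  // too short or wrong signature
    }
    info.dataOffset = readLE32(data, 10);
    info.width = static_cast<std::int32_t>(readLE32(data, 18));
    info.height = static_cast<std::int32_t>(readLE32(data, 22));
    info.bitsPerPixel = readLE16(data, 28);
    info.compression = readLE32(data, 30);
    info.valid = true;
    return info;
}
```

This version trades a few extra lines for independence from struct layout, which may be worth it if the code has to build with several compilers.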
The PNG format is more complicated due to its compression, but we can still explain its general structure:
Signature: The file begins with an 8-byte signature indicating it’s a PNG file.
Chunks: PNG files are made up of a series of chunks, each of which serves a specific purpose (e.g., the IHDR chunk for header information, the IDAT chunk for image data, and the IEND chunk indicating the end of the file).
Compression: PNG uses the DEFLATE compression algorithm to compress the image data. The pixel rows are first filtered and then compressed as a single stream, so the data must be decompressed and unfiltered before individual pixels can be accessed.
Because PNG files are compressed, we cannot directly access pixel data as we did with BMP files. Instead, we need to decode the PNG file first.
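The container structure itself can be inspected without any decompression. The sketch below checks the 8-byte signature and lists the chunks of a PNG already loaded into memory. It relies on the standard chunk layout (4-byte big-endian length, 4-byte type, data, 4-byte CRC); PNGChunk, readBE32, and listPNGChunks are hypothetical names, and the CRC values are skipped rather than verified.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record describing one chunk found in the file.
struct PNGChunk {
    std::string type;      // four-letter chunk type, e.g. "IHDR"
    std::uint32_t length;  // size of the chunk data in bytes
    std::size_t dataPos;   // position of the chunk data in the buffer
};

// Reads a big-endian 32-bit value starting at data[pos].
std::uint32_t readBE32(const std::vector<std::uint8_t>& data, std::size_t pos) {
    return (static_cast<std::uint32_t>(data[pos]) << 24) |
           (static_cast<std::uint32_t>(data[pos + 1]) << 16) |
           (static_cast<std::uint32_t>(data[pos + 2]) << 8) |
           static_cast<std::uint32_t>(data[pos + 3]);
}

// Lists the chunks of a PNG held in memory. Returns an empty vector if the
// signature is wrong or a chunk runs past the end of the buffer.
// CRC values are skipped, not verified.
std::vector<PNGChunk> listPNGChunks(const std::vector<std::uint8_t>& data) {
    static const std::uint8_t signature[8] = {137, 80, 78, 71, 13, 10, 26, 10};
    std::vector<PNGChunk> chunks;
    if (data.size() < 8) return chunks;
    for (std::size_t i = 0; i < 8; ++i) {
        if (data[i] != signature[i]) return chunks;
    }
    std::size_t pos = 8;
    while (data.size() - pos >= 12) {  // room for length + type + CRC
        const std::uint32_t length = readBE32(data, pos);
        if (length > data.size() - pos - 12) return {};
        PNGChunk chunk;
        chunk.type.assign(data.begin() + pos + 4, data.begin() + pos + 8);
        chunk.length = length;
        chunk.dataPos = pos + 8;
        chunks.push_back(chunk);
        pos += 12 + static_cast<std::size_t>(length);
        if (chunk.type == "IEND") break;
    }
    return chunks;
}
```

Listing the chunks is the easy part; the pixel data inside the IDAT chunks is still compressed at this point.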
Fully decoding PNG images is normally done with an external library such as libpng (which itself relies on zlib). It is possible to write a basic PNG decoder yourself or to find a minimalistic library that implements the format, but decoding PNG manually means dealing with zlib-compressed image data, which is not trivial.
For this reason, we will show how to handle PNG images using stb_image.h, a very lightweight header-only library that loads PNG, BMP, and other image formats in a simple and efficient way.
If you truly want to avoid external libraries, implementing PNG decoding from scratch requires you to handle:
Reading the file's chunks.
Decompressing the concatenated data of the IDAT chunks using the DEFLATE algorithm.
Converting the resulting data into raw pixel values, which includes reversing the per-scanline filters.
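To give a feel for the last step, here is a sketch of reversing the PNG scanline filters once the data has been decompressed. In the decompressed stream each scanline is preceded by one filter-type byte (0 None, 1 Sub, 2 Up, 3 Average, 4 Paeth). The sketch assumes 8-bit samples and a non-interlaced image; paethPredictor and unfilterScanline are hypothetical names, and the DEFLATE step itself is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Paeth predictor: picks whichever of left (a), above (b), or
// upper-left (c) is closest to a + b - c.
std::uint8_t paethPredictor(int a, int b, int c) {
    const int p = a + b - c;
    const int pa = std::abs(p - a);
    const int pb = std::abs(p - b);
    const int pc = std::abs(p - c);
    if (pa <= pb && pa <= pc) return static_cast<std::uint8_t>(a);
    if (pb <= pc) return static_cast<std::uint8_t>(b);
    return static_cast<std::uint8_t>(c);
}

// Reverses the filter of one scanline in place.
// filterType: 0 None, 1 Sub, 2 Up, 3 Average, 4 Paeth.
// line: filtered bytes of the current row (without the filter-type byte).
// prev: reconstructed bytes of the row above (all zeros for the first row).
// bpp: bytes per complete pixel, e.g. 3 for 8-bit RGB.
// Returns false for an unknown filter type or mismatched row sizes.
bool unfilterScanline(std::uint8_t filterType, std::vector<std::uint8_t>& line,
                      const std::vector<std::uint8_t>& prev, std::size_t bpp) {
    if (filterType > 4 || prev.size() != line.size()) return false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const int a = (i >= bpp) ? line[i - bpp] : 0;  // left (already restored)
        const int b = prev[i];                          // above
        const int c = (i >= bpp) ? prev[i - bpp] : 0;  // upper-left
        int predictor = 0;
        switch (filterType) {
            case 1: predictor = a; break;
            case 2: predictor = b; break;
            case 3: predictor = (a + b) / 2; break;
            case 4: predictor = paethPredictor(a, b, c); break;
            default: break;  // 0: no prediction
        }
        // Addition wraps modulo 256, as the format requires
        line[i] = static_cast<std::uint8_t>(line[i] + predictor);
    }
    return true;
}
```

A full decoder would call this once per row, passing the previously restored row as prev, after inflating the IDAT data.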
This approach requires a significant amount of complex code. Therefore, for practical purposes, let's proceed with a minimal external library approach: stb_image.
Here's how you can load a PNG file using stb_image.h, which is a header-only library that simplifies image loading without requiring complex external dependencies.
Download stb_image.h: First, download the header file from stb's official repository and place it in your project. Then define STB_IMAGE_IMPLEMENTATION in exactly one source file before including the header.
Example Code to Load and Display PNG:
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Define this in exactly one source file before including stb_image.h
#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

struct RGBPixel {
    uint8_t red;
    uint8_t green;
    uint8_t blue;
};

// Function to load a PNG image
void loadPNG(const std::string& filename) {
    int width, height, channels;

    // Load the image using stb_image, forcing 3 channels (RGB) per pixel.
    // `channels` receives the channel count stored in the file.
    uint8_t* imageData = stbi_load(filename.c_str(), &width, &height, &channels, 3);
    if (imageData == nullptr) {
        std::cerr << "Error loading image: " << filename << std::endl;
        return;
    }

    std::vector<RGBPixel> pixels(static_cast<std::size_t>(width) * height);

    // Convert imageData to RGBPixel format
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            std::size_t index = (static_cast<std::size_t>(y) * width + x) * 3;
            RGBPixel& pixel = pixels[static_cast<std::size_t>(y) * width + x];
            pixel.red = imageData[index];
            pixel.green = imageData[index + 1];
            pixel.blue = imageData[index + 2];
        }
    }

    // Now you can manipulate the pixel data as needed
    // For demonstration, let's print the first few pixels
    for (int y = 0; y < 10 && y < height; ++y) {
        for (int x = 0; x < 10 && x < width; ++x) {
            RGBPixel& pixel = pixels[static_cast<std::size_t>(y) * width + x];
            std::cout << "Pixel at (" << x << "," << y << ") - R: " << (int)pixel.red
                      << " G: " << (int)pixel.green << " B: " << (int)pixel.blue << std::endl;
        }
    }

    // Free the loaded image data
    stbi_image_free(imageData);
}

int main() {
    std::string filename = "sample.png";  // Replace with your PNG file path
    loadPNG(filename);
    return 0;
}
stb_image.h: This header file allows you to load PNG and other image formats easily. It supports automatic handling of compression and formats like PNG, BMP, JPEG, etc.
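stbi_load: This function loads the image, decompresses it, and returns a pointer to the raw pixel data, or a null pointer on failure. The last argument (3) requests three channels per pixel regardless of how the file stores them.
Pixel Data: After loading the image, we extract the raw pixel data and store it as a vector of RGBPixel structures.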
Display Pixels: We print the first few pixels to verify that we’ve loaded the image correctly.
Once we have the pixel data, we can perform various image processing tasks. For example, we can invert the image colors:
for (auto& pixel : pixels) {
    pixel.red = 255 - pixel.red;
    pixel.green = 255 - pixel.green;
    pixel.blue = 255 - pixel.blue;
}
This simple operation inverts the colors of the image.
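Other per-pixel operations follow the same pattern. As one more illustration, the sketch below converts pixels to grayscale using the commonly used 0.299 / 0.587 / 0.114 luminance weights in integer arithmetic. The function name toGrayscale is a hypothetical helper, and RGBPixel is repeated here only so the block is self-contained.

```cpp
#include <cstdint>
#include <vector>

// Same layout as the RGBPixel used in the PNG example.
struct RGBPixel {
    std::uint8_t red;
    std::uint8_t green;
    std::uint8_t blue;
};

// Replaces every pixel with its gray level, using the
// 0.299 / 0.587 / 0.114 luminance weights scaled by 1000
// (the +500 rounds to the nearest integer).
void toGrayscale(std::vector<RGBPixel>& pixels) {
    for (auto& pixel : pixels) {
        const int gray = (299 * pixel.red + 587 * pixel.green +
                          114 * pixel.blue + 500) / 1000;
        pixel.red = pixel.green = pixel.blue = static_cast<std::uint8_t>(gray);
    }
}
```

Because the weights sum to 1000, the result always stays within 0 to 255, so no clamping is needed.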
In this expanded article, we have explored both BMP and PNG image formats in C++ and how to handle them without using complex external libraries (except for the minimal stb_image.h for PNG decoding). We showed how to load both BMP and PNG images, access their pixel data, and manipulate it.
While raw image processing for BMP is straightforward, handling PNG is more complex due to compression. We simplified the PNG decoding using stb_image.h, which enables easy access to pixel data without needing to implement complex image decoding algorithms manually.
If you're working with more sophisticated image formats or need additional functionality, libraries like libpng or OpenCV would be better suited, but the principles shown here provide a foundation for handling raw image data in C++.