Tutorial 4: Optional Header

We have learned about the DOS header and some members of the PE header. Here's the last, the biggest and probably the most important member of the PE header, the optional header.

To refresh your memory, the optional header is a structure that is the last member of IMAGE_NT_HEADERS. It contains information about the logical layout in the PE file. There are 31 fields in this structure. Some of them are crucial and some are not useful. I'll explain only those fields that are really useful.

There is a word that's used frequently in relation to PE file format: RVA
RVA stands for relative virtual address. You know what virtual address is. RVA is a daunting term for such a simple concept. Simply put, an RVA is a distance from a reference point in the virtual address space. I bet you're familiar with file offset: an RVA is exactly the same thing as file offset. However, it's relative to a point in virtual address space, not a file. I'll show you an example. If a PE file loads at 400000h in the virtual address (VA) space and the program starts execution at the virtual address 401000h, we can say that the program starts execution at RVA 1000h. An RVA is relative to the starting VA of the module.
Why does the PE file format use RVA? It's to help reduce the load of the PE loader. Since a module can be relocated anywhere in the virtual address space, it would be a hell for the PE loader to fix every relocatable items in the module. In contrast, if all relocatable items in the file use RVA, there is no need for the PE loader to fix anything: it simply relocates the whole module to a new starting VA. It's like the concept of relative path and absolute path: RVA is akin to relative path, VA is like absolute path.
Field Meanings
AddressOfEntryPoint It's the RVA of the first instruction that will be executed when the PE loader is ready to run the PE file. If you want to divert the flow of execution right from the start, you need to change the value in this field to a new RVA and the instruction at the new RVA will be executed first.
ImageBase It's the preferred load address for the PE file. For example, if the value in this field is 400000h, the PE loader will try to load the file into the virtual address space starting at 400000h. The word "preferred" means that the PE loader may not load the file at that address if some other module already occupied that address range.
SectionAlignment The granularity of the alignment of the sections in memory. For example, if the value in this field is 4096 (1000h), each section must start at multiples of 4096 bytes. If the first section is at 401000h and its size is 10 bytes, the next section must be at 402000h even if the address space between 401000h and 402000h will be mostly unused.
FileAlignment The granularity of the alignment of the sections in the file. For example, if the value in this field is 512 (200h), each section must start at multiples of 512 bytes. If the first section is at file offset 200h and the size is 10 bytes, the next section must be located at file offset 400h: the space between file offsets 522 and 1024 is unused/undefined.
MajorSubsystemVersion
MinorSubsystemVersion
The win32 subsystem version. If the PE file is designed for Win32, the subsystem version must be 4.0 else the dialog won't have 3-D look.
SizeOfImage The overall size of the PE image in memory. It's the sum of all headers and sections aligned to SectionAlignment.
SizeOfHeaders The size of all headers+section table. In short, this value is equal to the file size minus the combined size of all sections in the file. You can also use this value as the file offset of the first section in the PE file.
Subsystem Tell in which of the NT subsystem the PE file is intended for. For most win32 progs, only two values are used: Windows GUI and Windows CUI (console).
DataDirectory An array of IMAGE_DATA_DIRECTORY structures. Each structure gives the RVA of an important data structure in the PE file such as the import address table.