Tutorial 3: File Header

In this tutorial, we will study the file header portion of the PE header.

Let's summarize what we have learned so far:

  • DOS MZ header is called IMAGE_DOS_HEADER. Only two of its members are important to us: e_magic which contains the string "MZ" and e_lfanew which contains the file offset of the PE header.
  • We use the value in e_magic to check if the file has a valid DOS header by comparing it to the value IMAGE_DOS_SIGNATURE. If both values match, we can assume that the file has a valid DOS header.
  • In order to go to the PE header, we must move the file pointer to the offset specified by the value in e_lfanew.
  • The first dword of the PE header should contain the string "PE" followed by two zeroes. We compare the value in this dword to the value IMAGE_NT_SIGNATURE. If they match, then we can assume that the PE header is valid.

We will learn more about the PE header in this tutorial. The official name of the PE header is IMAGE_NT_HEADERS. To refresh your memory, I show it below.

    Signature dd ?
    FileHeader IMAGE_FILE_HEADER <>
    OptionalHeader IMAGE_OPTIONAL_HEADER32 <>

Signature is the PE signature, "PE" followed by two zeroes. You already know and use this member.
FileHeader is a structure that contains the information about the physical layout/properies of the PE file in general.
OptionalHeader is also a structure that contains the information about the logical layout inside the PE file.

The most interesting information is in OptionalHeader. However, some fields in FileHeader are also important. We will learn about FileHeader in this tutorial so we can move to study OptionalHeader in the next tutorials.

    Machine WORD ?
    NumberOfSections WORD ?
     TimeDateStamp dd ?
    PointerToSymbolTable dd ?
    NumberOfSymbols dd ?
    SizeOfOptionalHeader WORD ?
    Characteristics WORD ?
Field name Meanings
Machine The CPU platform the file is intended for. For Intel platform, the value is IMAGE_FILE_MACHINE_I386 (14Ch). I tried to use 14Dh and 14Eh as stated in the pe.txt by LUEVELSMEYER but Windows refused to run it. This field is rarely of interest to us except as a quick way of preventing a program to be executed.
NumberOfSections The number of sections in the file. We will need to modify the value in this member if we add or delete a section from the file.
TimeDateStamp The date and time the file is created. Not useful to us.
PointerToSymbolTable used for debugging.
NumberOfSymbols used for debugging.
SizeOfOptionalHeader The size of the OptionalHeader member that immediately follows this structure. Must be set to a valid value.
Characteristics Contains flags for the file, such as whether this file is an exe or a dll.

In summary, only three members are somewhat useful to us: Machine, NumberOfSections and Characteristics. You would normally not change the values of Machine and Characteristics but you must use the value in NumberOfSections when you're walking the section table.
I'm jumping the gun here but in order to illustrate the use of NumberOfSections, I need to digress briefly to the section table.

The section table is an array of structures. Each structure contains the information of a section. Thus if there are 3 sections, there will be 3 members in this array. You need the value in NumberOfSections so you know how many members there are in the array. You would think that checking for the structure with all zeroes in its members would help. Windows does use this approach. You can verify this fact by setting the value in NumberOfSections to a value higher than the real value and Windows still runs the file without problem. From my observation, I think Windows reads the value in NumberOfSections and examines each structure in the section table. If it finds a structure that contains all zeroes, it terminates the search. Else it would process until the number of structures specified in NumberOfSections is met. Why can't we ignore the value in NumberOfSections? Several reasons. The PE specification doesn't specify that the section table array must end with an all-zero structure. Thus there may be a situation where the last array member is contiguous to the first section, without empty space at all. Another reason has to do with bound imports. The new-style binding puts the information immediately following the section table's last structure array member. Thus you still need NumberOfSections.