I had earlier jotted down a brief description of compilation, linking and loading on recent versions of Linux here. It is worth looking briefly at how the situation on Windows compares. This article presupposes that you have read either that description or the source materials listed there.
PE Format
Unlike Linux, which uses the ELF format, Windows uses the PE (Portable Executable) format for both programs and DLLs. The format has evolved since the Windows 95 days. PE and ELF have a large feature set in common.
DLL files have separate import and export sections (.idata and .edata respectively). These sections are used for symbol lookup. On Linux, by contrast, information on all non-hidden symbols is available in a single symbol table section.
Preferred Load Address
Windows compilers generate binaries assuming a certain load address. This is not a problem for an exe, as the entire virtual address space is available when the exe is loaded. For a DLL this is not the case, and the preferred load address may already be taken. If the load address changes, the loader adds a 'correction offset' (the difference between the actual and the preferred load address) to every location in the code that holds an absolute address of data or functions.
If the preferred load address is available in the virtual address space, there is nothing to relocate, so loading can be quite fast in the average case.
There is nothing similar on Linux.
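The 'correction offset' step can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the image is a plain byte buffer, all addresses are 32-bit little-endian, and the list of fix-up offsets (hypothetical here) is handed to the function directly rather than read from the binary.

```python
import struct

def rebase(image, fixup_offsets, preferred_base, actual_base):
    """Add the load-address delta to each 32-bit absolute address in the image."""
    delta = actual_base - preferred_base
    if delta == 0:
        return 0  # loaded at the preferred address: nothing to patch
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)
    return len(fixup_offsets)

# Toy image holding two absolute addresses, built for base 0x00400000.
image = bytearray(16)
struct.pack_into("<I", image, 4, 0x00405030)
struct.pack_into("<I", image, 12, 0x00401000)

patched_count = rebase(image, [4, 12], preferred_base=0x00400000, actual_base=0x10000000)
first = struct.unpack_from("<I", image, 4)[0]
second = struct.unpack_from("<I", image, 12)[0]
```

When the actual base equals the preferred base the function returns immediately, which is the fast case described above.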
Import Directory Table, Import Lookup Table & Import Address Table (IAT)
The import directory table has one entry per imported DLL, and each imported DLL has its own lookup table and address table. A directory entry holds information about the imported DLL and points to that DLL's lookup and address tables.
The lookup table and the address table are identical before the corresponding DLL is loaded. Both are arrays whose entries hold either an ordinal or an RVA (relative virtual address) pointing to the name of the referenced symbol, preceded by a hint that suggests where to find it in the export table. When the DLL is loaded, the loader overwrites the address table entries with the addresses of the imported symbols. (The tables normally live in the .idata section of the binary, and the loader finds them through the import entry of the data directory in the PE header.)
Note that this involves going over the export table entries of the imported DLL, matching names (the export names are sorted so that binary search can be used), finding the ordinal and fetching the address for each symbol. This process is costly and may be done off-line under the assumption of a preferred load address, in a process called 'Binding'.
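The lookup described above can be modelled roughly as follows. This is a simplified sketch, not the real on-disk layout: the class, the DLL and its symbol names are hypothetical, and the lookup table is represented as a plain list of names.

```python
from bisect import bisect_left

class ExportTable:
    """Toy export table: sorted names, a parallel index table, and an address table."""

    def __init__(self, base, exports):
        self.base = base
        self.addresses = list(exports.values())   # RVAs, in declaration order
        pairs = sorted((name, index) for index, name in enumerate(exports))
        self.names = [name for name, _ in pairs]           # sorted for binary search
        self.name_indices = [index for _, index in pairs]  # name position -> address slot

    def lookup(self, name):
        pos = bisect_left(self.names, name)   # binary search over the sorted names
        if pos == len(self.names) or self.names[pos] != name:
            raise KeyError(name)
        return self.base + self.addresses[self.name_indices[pos]]

def resolve_imports(lookup_table, export_table):
    """Build the address table the loader would write over the IAT."""
    return [export_table.lookup(name) for name in lookup_table]

# Hypothetical DLL loaded at 0x10000000 exporting three functions.
exports = ExportTable(0x10000000, {"WriteLog": 0x1200, "CloseLog": 0x1100, "OpenLog": 0x1000})
iat = resolve_imports(["OpenLog", "WriteLog"], exports)
```

Every imported name costs one binary search, which is why the work grows with the number of imports and why doing it ahead of time can pay off.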
Calling a function or accessing data thus translates to an indirection through the IAT.
There are two ways in which an instruction can reach the IAT, that is, two ways to generate code; both are described in the next section.
The IAT indirection is close to the GOT/PLT mechanism used on Linux.
Accessing Imported Symbols
An instruction that uses an imported symbol naturally needs the address of an IAT entry. This is arranged in one of two ways:
a. Import Library: The import library provides stubs that satisfy the otherwise unresolved references to imported symbols. Each stub is a JMP instruction whose operand is the address of an IAT entry; the instruction fetches the contents of that entry and jumps to it.
CALL 0x0040100C
. . .
0x0040100C:
JMP DWORD PTR [0x00405030]
Although less efficient, this is the method the compiler uses by default, because the compiler cannot tell a call to a normal function from a call to an imported one.
b. Direct addressing:
CALL DWORD PTR [0x00405030]
The address constant above points to an entry in the IAT. This is more efficient than going through a stub function, since it saves one jump. You can direct the compiler to generate this type of code by prefixing the function declaration with __declspec(dllimport).
There is nothing equivalent to the import library method on Linux.
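The difference between the two access methods can be illustrated with a small Python model. It is only an analogy: a one-element list stands in for an IAT slot, and the imported function and all names are hypothetical.

```python
# One-slot model of an Import Address Table. The loader patches the slot;
# callers never hold the target address themselves.
iat = [None]

def message_box(text):
    """Hypothetical imported function living in some DLL."""
    return "shown: " + text

def loader_patch(target):
    iat[0] = target              # what the loader does once the DLL is mapped

def stub(text):
    return iat[0](text)          # models: JMP DWORD PTR [IAT slot]

def call_via_stub(text):
    return stub(text)            # models: CALL stub (method a, two transfers)

def call_direct(text):
    return iat[0](text)          # models: CALL DWORD PTR [IAT slot] (method b)

loader_patch(message_box)
via_stub = call_via_stub("hi")
direct = call_direct("hi")
```

Both paths read the same slot, so the loader only ever patches the IAT; the stub version simply pays for one extra transfer of control.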
Binding
An executable is said to be bound when the IAT in the file has been overwritten with the actual addresses of the symbols in the imported DLLs (assuming, of course, that each DLL is loaded at its preferred address). This can give a performance boost whose size depends on how many symbols are imported. For large DLLs with thousands of APIs, binding makes sense.
If the load address turns out to be different, the loader fixes up the addresses. The bound executable also records the timestamp and checksum of each DLL, so that when either of them changes, the binding is invalidated and the imports are resolved again.
Binding is sometimes done during installation.
There is nothing equivalent to Binding on Linux.
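The validity check behind binding might look like the sketch below. It is a simplified model with hypothetical class and field names; when the binding is stale or the DLL has moved, it simply resolves every import again instead of patching the stored addresses.

```python
from dataclasses import dataclass

@dataclass
class Dll:
    preferred_base: int
    loaded_base: int
    timestamp: int
    checksum: int
    exports: dict          # symbol name -> RVA

@dataclass
class BoundImports:
    stamp: tuple           # (timestamp, checksum) of the DLL at bind time
    names: list
    iat: list              # addresses precomputed for the preferred base

def bind(dll, names):
    """Done off-line: precompute the IAT assuming the preferred load address."""
    addresses = [dll.preferred_base + dll.exports[name] for name in names]
    return BoundImports((dll.timestamp, dll.checksum), list(names), addresses)

def load_iat(bound, dll):
    """Return (iat, reused): reuse the bound IAT only if it is still valid."""
    same_dll = bound.stamp == (dll.timestamp, dll.checksum)
    if same_dll and dll.loaded_base == dll.preferred_base:
        return bound.iat, True
    fresh = [dll.loaded_base + dll.exports[name] for name in bound.names]
    return fresh, False

dll = Dll(0x10000000, 0x10000000, timestamp=1000, checksum=0xBEEF, exports={"OpenLog": 0x1000})
bound = bind(dll, ["OpenLog"])
fast_iat, reused = load_iat(bound, dll)

updated = Dll(0x10000000, 0x10000000, timestamp=2000, checksum=0xCAFE, exports={"OpenLog": 0x1800})
slow_iat, reused_after_update = load_iat(bound, updated)
```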
Explicit and Implicit linking
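Implicit linking is the default method: the loader resolves the imports listed in the executable's import tables when the program is loaded. Explicit linking means the program itself makes sure that the target DLL is loaded and then looks up the addresses of the APIs it needs. This is almost always done via the LoadLibrary and GetProcAddress APIs.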
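Explicit linking can be tried out from Python through ctypes, which can call LoadLibraryW and GetProcAddress in kernel32 directly. The helper name below is hypothetical, and the sketch returns None on anything other than Windows, since these APIs do not exist elsewhere.

```python
import ctypes
import sys

def get_proc_address(dll_name, symbol):
    """Load a DLL explicitly and return the address of one exported symbol."""
    if sys.platform != "win32":
        return None  # LoadLibrary/GetProcAddress are Windows-only
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.LoadLibraryW.argtypes = [ctypes.c_wchar_p]
    kernel32.LoadLibraryW.restype = ctypes.c_void_p      # HMODULE
    kernel32.GetProcAddress.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    kernel32.GetProcAddress.restype = ctypes.c_void_p    # function address
    module = kernel32.LoadLibraryW(dll_name)
    if not module:
        raise ctypes.WinError(ctypes.get_last_error())
    address = kernel32.GetProcAddress(module, symbol.encode("ascii"))
    if not address:
        raise ctypes.WinError(ctypes.get_last_error())
    return address

address = get_proc_address("kernel32.dll", "GetTickCount")
```

On Windows the returned value is the address of GetTickCount inside the already loaded kernel32.dll; the return types are set explicitly so that 64-bit handles and addresses are not truncated.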
Ordinals
Ordinals are numbers which uniquely identify imported/exported symbols. They also serve (directly or indirectly) as indices to the actual entries in the import/export tables, which contain more information on the symbols, such as name and RVA.
A program can look up an imported API directly by its ordinal via explicit linking. But this should only be done if it is guaranteed that the symbol always has that ordinal. Otherwise, in explicit linking, one should use GetProcAddress to look up symbols by name. Ordinals are sometimes used in plugin systems to locate a factory method that instantiates the core class; the plugin code is generated such that the factory method sits at a fixed ordinal like 1.
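Lookup by ordinal needs no name matching at all, which the following toy model tries to show. The function name, the ordinal base of 1 and the table contents are assumptions made for illustration.

```python
def address_by_ordinal(base, ordinal_base, address_table, ordinal):
    """Translate an ordinal into an address: the ordinal indexes the address table."""
    index = ordinal - ordinal_base
    if not 0 <= index < len(address_table):
        raise KeyError(ordinal)
    return base + address_table[index]

# Hypothetical plugin DLL whose factory method is always exported at ordinal 1.
plugin_base = 0x20000000
plugin_addresses = [0x1000, 0x1400, 0x1800]   # RVAs for ordinals 1, 2 and 3
factory = address_by_ordinal(plugin_base, 1, plugin_addresses, 1)
```

The lookup is a single array access, but it is only safe while every build of the plugin keeps the factory method at the same ordinal.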
Source:
An In-Depth Look into the Win32 Portable Executable File Format - Part 1, Part 2
Linkers and Loaders by John Levine - Link
References:
What Goes On Inside Windows 2000: Solving the Mysteries of the Loader - Link