Portable Executable

The Portable Executable (PE) format is a file format for executables, object code, DLLs, FON Font files, and others used in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data. On NT operating systems, the PE format is used for EXE, DLL, SYS (device driver), and other file types. The Extensible Firmware Interface (EFI) specification states that PE is the standard executable format in EFI environments.[1]

On Windows NT operating systems, PE currently supports the IA-32, IA-64, x86-64 (AMD64/Intel 64), ARM and ARM64 instruction set architectures (ISAs). Prior to Windows 2000, Windows NT (and thus PE) supported the MIPS, Alpha, and PowerPC ISAs. Because PE is used on Windows CE, it continues to support several variants of the MIPS, ARM (including Thumb), and SuperH ISAs.[2]

Analogous formats to PE are ELF (used in Linux and most other versions of Unix) and Mach-O (used in macOS and iOS).

Portable Executable
Filename extension: .acm, .ax, .cpl, .dll, .drv, .efi, .exe, .mui, .ocx, .scr, .sys, .tsp
Internet media type: application/vnd.microsoft.portable-executable
Developed by: Microsoft (currently)
Type of format: Binary, executable, object, shared libraries
Extended from: DOS MZ executable, COFF

History

Microsoft migrated to the PE format with the introduction of the Windows NT 3.1 operating system. All later versions of Windows, including Windows 95/98/ME and the Win32s addition to Windows 3.1x, support the file structure. The format has retained limited legacy support to bridge the gap between DOS-based and NT systems. For example, PE/COFF headers still include a DOS executable program, which is by default a DOS stub that displays a message like "This program cannot be run in DOS mode", though it can be a full-fledged DOS version of the program (a later notable case being the Windows 98 SE installer).[3] This constitutes a form of fat binary. PE also continues to serve the changing Windows platform. Some extensions include the .NET PE format (see below), a 64-bit version called PE32+ (sometimes PE+), and a specification for Windows CE.

Technical details

Layout

Figure: Structure of a 32-bit Portable Executable

A PE file consists of a number of headers and sections that tell the dynamic linker how to map the file into memory. An executable image consists of several different regions, each of which requires different memory protection, so the start of each section must be aligned to a page boundary.[4] For instance, the .text section (which holds program code) is typically mapped as execute/readonly, and the .data section (holding global variables) is mapped as no-execute/readwrite. However, to avoid wasting space, the different sections are not page aligned on disk. Part of the job of the dynamic linker is to map each section to memory individually and assign the correct permissions to the resulting regions, according to the instructions found in the headers.[5]
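The header walk described above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function names parse_pe_sections, protection and build_demo_image are hypothetical, and the demo image is a synthetic byte string with no optional header (real executable images always carry one). The field offsets and flag values follow the published PE/COFF layout.

```python
import struct

# Section characteristic flags from the PE/COFF specification.
MEM_EXECUTE = 0x20000000
MEM_READ = 0x40000000
MEM_WRITE = 0x80000000


def parse_pe_sections(data):
    """Return (machine, [(name, rva, file_offset, characteristics), ...])."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C of the DOS header gives the PE signature offset.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    machine, count, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    table = pe_offset + 4 + 20 + opt_size  # section table follows optional header
    sections = []
    for i in range(count):
        fields = struct.unpack_from("<8sIIIIIIHHI", data, table + 40 * i)
        name = fields[0].rstrip(b"\0").decode("ascii")
        # fields[2] = VirtualAddress, fields[4] = PointerToRawData,
        # fields[9] = Characteristics
        sections.append((name, fields[2], fields[4], fields[9]))
    return machine, sections


def protection(characteristics):
    """Render the memory protection flags as an 'rwx'-style string."""
    return (("r" if characteristics & MEM_READ else "-")
            + ("w" if characteristics & MEM_WRITE else "-")
            + ("x" if characteristics & MEM_EXECUTE else "-"))


def build_demo_image():
    """Hypothetical two-section image, for illustration only."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x14C, 2, 0, 0, 0, 0, 0x0102)
    text = struct.pack("<8sIIIIIIHHI", b".text", 0x1000, 0x1000,
                       0x200, 0x200, 0, 0, 0, 0, 0x60000020)
    data = struct.pack("<8sIIIIIIHHI", b".data", 0x1000, 0x2000,
                       0x200, 0x400, 0, 0, 0, 0, 0xC0000040)
    return bytes(dos) + b"PE\0\0" + coff + text + data


machine, sections = parse_pe_sections(build_demo_image())
for name, rva, raw, chars in sections:
    print(f"{name}: RVA {rva:#x}, file offset {raw:#x}, {protection(chars)}")
```

In the demo, the two sections sit 0x200 bytes apart in the file but are mapped at the page-aligned addresses 0x1000 and 0x2000, which mirrors the difference between on-disk and in-memory alignment.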

Import table

One section of note is the import address table (IAT), which is used as a lookup table when the application is calling a function in a different module. It can be in the form of both import by ordinal and import by name. Because a compiled program cannot know the memory location of the libraries it depends upon, an indirect jump is required whenever an API call is made. As the dynamic linker loads modules and joins them together, it writes actual addresses into the IAT slots, so that they point to the memory locations of the corresponding library functions. Though this adds an extra jump over the cost of an intra-module call resulting in a performance penalty, it provides a key benefit: The number of memory pages that need to be copy-on-write changed by the loader is minimized, saving memory and disk I/O time. If the compiler knows ahead of time that a call will be inter-module (via a dllimport attribute) it can produce more optimized code that simply results in an indirect call opcode.[5]
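As a rough conceptual model of this binding step (not the actual loader algorithm), the sketch below fills a list of IAT slots from a table of loaded modules. The module name demo.dll, its base address, the export offsets and the function name bind_imports are all made up for illustration.

```python
def bind_imports(imports, modules):
    """Return the IAT as a list of absolute addresses, one slot per import.

    imports: list of (dll_name, symbol), where symbol is a str (import by
             name) or an int (import by ordinal).
    modules: hypothetical description of already-loaded modules.
    """
    iat = []
    for dll, symbol in imports:
        module = modules[dll]
        if isinstance(symbol, int):
            rva = module["ordinals"][symbol]   # import by ordinal
        else:
            rva = module["names"][symbol]      # import by name
        # The loader writes the real address into the slot; call sites
        # jump through the slot instead of hard-coding the address.
        iat.append(module["base"] + rva)
    return iat


# Assumed example data: one DLL loaded at 0x10000000 with two exports.
modules = {
    "demo.dll": {
        "base": 0x10000000,
        "names": {"Greet": 0x1200},
        "ordinals": {7: 0x1340},
    }
}
imports = [("demo.dll", "Greet"), ("demo.dll", 7)]
iat = bind_imports(imports, modules)
print([hex(slot) for slot in iat])
```

Only the slots in the table change at load time, so the code pages that call through them can stay identical across processes.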

Relocations

PE files normally do not contain position-independent code. Instead they are compiled to a preferred base address, and all addresses emitted by the compiler/linker are fixed ahead of time. If a PE file cannot be loaded at its preferred address (because it's already taken by something else), the operating system will rebase it. This involves recalculating every absolute address and modifying the code to use the new values. The loader does this by comparing the preferred and actual load addresses, and calculating a delta value. This is then added to the preferred address to come up with the new address of the memory location. Base relocations are stored in a list and added, as needed, to an existing memory location. The resulting code is now private to the process and no longer shareable, so many of the memory saving benefits of DLLs are lost in this scenario. It also slows down loading of the module significantly. For this reason rebasing is to be avoided wherever possible, and the DLLs shipped by Microsoft have base addresses pre-computed so as not to overlap. In the no rebase case PE therefore has the advantage of very efficient code, but in the presence of rebasing the memory usage hit can be expensive. This contrasts with ELF which uses fully position-independent code and a global offset table, which trades off execution time in favor of lower memory usage.
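The delta calculation can be illustrated with a simplified sketch. It assumes a flat list of fixup offsets and 32-bit absolute addresses; real PE base relocations are grouped into per-page blocks with typed entries, which this example leaves out. The function name rebase and the addresses used are hypothetical.

```python
import struct


def rebase(image, fixup_offsets, preferred_base, actual_base):
    """Patch every 32-bit absolute address listed in fixup_offsets in place."""
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, offset)
        # Add the delta to the address that was fixed at link time.
        struct.pack_into("<I", image, offset, (value + delta) & 0xFFFFFFFF)
    return delta


# Assumed example: an 8-byte image whose last 4 bytes hold a pointer
# computed for the preferred base 0x00400000.
image = bytearray(8)
struct.pack_into("<I", image, 4, 0x00401010)

delta = rebase(image, [4], preferred_base=0x00400000, actual_base=0x00550000)
(patched,) = struct.unpack_from("<I", image, 4)
print(hex(delta), hex(patched))
```

Because the patch writes into the mapped image, the touched pages become private to the process, which is the memory cost the paragraph above describes.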

.NET, metadata, and the PE format

In a .NET executable, the PE code section contains a stub that invokes the CLR virtual machine startup entry, _CorExeMain or _CorDllMain in mscoree.dll, much as in Visual Basic executables. The virtual machine then makes use of the .NET metadata present, the root of which, IMAGE_COR20_HEADER (also called the "CLR header"), is pointed to by the IMAGE_DIRECTORY_ENTRY_COMHEADER[6] entry in the PE header's data directory. IMAGE_COR20_HEADER strongly resembles PE's optional header, essentially playing its role for the CLR loader.[2]

The CLR-related data, including the root structure itself, is typically contained in the common code section, .text. It is composed of a few directories: metadata, embedded resources, strong names and a few for native-code interoperability. The metadata directory is a set of tables that list all the distinct .NET entities in the assembly, including types, methods, fields, constants, events, as well as references between them and to other assemblies.
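One way to check whether an image carries a CLR header is to read the data directory entry mentioned above, which is index 14 in the optional header's directory array. The sketch below assumes the standard PE32/PE32+ optional header layout; clr_header_directory and build_demo are hypothetical helper names, and the demo image is synthetic.

```python
import struct

CLR_DIRECTORY_INDEX = 14  # the entry the text calls IMAGE_DIRECTORY_ENTRY_COMHEADER


def clr_header_directory(data):
    """Return (rva, size) of the CLR header, or None for a native image."""
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[:2] != b"MZ" or data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    opt = pe_offset + 4 + 20  # optional header follows the COFF header
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:        # PE32
        directories = opt + 96
    elif magic == 0x20B:      # PE32+
        directories = opt + 112
    else:
        raise ValueError("unknown optional header magic")
    # NumberOfRvaAndSizes is the 4-byte field just before the directory array.
    (count,) = struct.unpack_from("<I", data, directories - 4)
    if count <= CLR_DIRECTORY_INDEX:
        return None
    rva, size = struct.unpack_from(
        "<II", data, directories + 8 * CLR_DIRECTORY_INDEX)
    return (rva, size) if rva and size else None


def build_demo(clr_rva, clr_size):
    """Hypothetical PE32 image containing only headers, for illustration."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    opt = bytearray(96 + 16 * 8)
    struct.pack_into("<H", opt, 0, 0x10B)
    struct.pack_into("<I", opt, 92, 16)
    struct.pack_into("<II", opt, 96 + 8 * CLR_DIRECTORY_INDEX, clr_rva, clr_size)
    coff = struct.pack("<HHIIIHH", 0x14C, 0, 0, 0, 0, len(opt), 0x0102)
    return bytes(dos) + b"PE\0\0" + coff + bytes(opt)


print(clr_header_directory(build_demo(0x2008, 0x48)))
print(clr_header_directory(build_demo(0, 0)))
```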

Use on other operating systems

The PE format is also used by ReactOS, as ReactOS is intended to be binary-compatible with Windows. It has also historically been used by a number of other operating systems, including SkyOS and BeOS R3. However, both SkyOS and BeOS eventually moved to ELF.

As the Mono development platform intends to be binary compatible with Microsoft .NET, it uses the same PE format as the Microsoft implementation.

On x86 Unix-like operating systems, Windows binaries (in PE format) can be executed with Wine. The HX DOS Extender also uses the PE format for native DOS 32-bit binaries, and it can, to some degree, execute existing Windows binaries in DOS, thus acting like an equivalent of Wine for DOS.

On IA-32 and x86-64 Linux, Windows DLLs can be run under loadlibrary.[7]

Mac OS X 10.5 has the ability to load and parse PE files, but is not binary compatible with Windows.[8]

UEFI and EFI firmware use Portable Executable files for applications, as well as the Windows ABI.

References

  1. ^ "UEFI Specification, version 2.4" (PDF). A note on p. 18 states that "this image type is chosen to enable UEFI images to contain Thumb and Thumb2 instructions while defining the EFI interfaces themselves to be in ARM mode."
  2. ^ a b "PE Format (Windows)". Retrieved 2017-10-21.
  3. ^ E.g. Microsoft's linker has /STUB switch to attach one
  4. ^ "The Portable Executable File From Top to Bottom". Retrieved 2017-10-21.
  5. ^ a b "Peering Inside the PE: A Tour of the Win32 Portable Executable File". Retrieved 2017-10-21.
  6. ^ The entry was previously used for COM+ metadata in COM+ applications, hence the name
  7. ^ https://github.com/taviso/loadlibrary
  8. ^ Chartier, David (2007-11-30). "Uncovered: Evidence that Mac OS X could run Windows apps soon". Ars Technica. Retrieved 2007-12-03. ... Steven Edwards describes the discovery that Leopard apparently contains an undocumented loader for Portable Executables, a type of file used in 32-bit and 64-bit versions of Windows. More poking around revealed that Leopard's own loader tries to find Windows DLL files when attempting to load a Windows binary.

.exe

.exe is a common filename extension denoting an executable file (the main execution point of a computer program) for DOS, OpenVMS, Microsoft Windows, Symbian or OS/2. Besides the executable program, many .exe files contain other components called resources, such as bitmap graphics and icons which the executable program may use for its graphical user interface.

COFF

The Common Object File Format (COFF) is a format for executable, object code, and shared library computer files used on Unix systems. It was introduced in Unix System V, replaced the previously used a.out format, and formed the basis for extended specifications such as XCOFF and ECOFF, before being largely replaced by ELF, introduced with SVR4.

COFF and its variants continue to be used on some Unix-like systems, on Microsoft Windows (PE Format), in EFI environments and in some embedded development systems.

DOS MZ executable

The DOS MZ executable format is the executable file format used for .EXE files in DOS.

The file can be identified by the ASCII string "MZ" (hexadecimal: 4D 5A) at the beginning of the file (the "magic number"). "MZ" are the initials of Mark Zbikowski, one of the leading developers of MS-DOS.

The MZ DOS executable file is newer than the COM executable format and differs from it. The DOS executable header contains relocation information, which allows multiple segments to be loaded at arbitrary memory addresses, and it supports executables larger than 64 KB; however, the format is still subject to relatively low memory limits. These limits were later bypassed using DOS extenders.
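A minimal sketch of the signature check follows. It assumes the commonly documented MZ header layout, in which the relocation count is a 16-bit little-endian field at offset 6; the function names and the sample header bytes are made up for the example.

```python
import struct


def is_mz_executable(data):
    """True if the data starts with the 'MZ' magic number (4D 5A)."""
    return data[:2] == b"MZ"


def mz_relocation_count(data):
    """Number of relocation entries recorded in the DOS header."""
    if not is_mz_executable(data):
        raise ValueError("missing MZ signature")
    (count,) = struct.unpack_from("<H", data, 6)
    return count


# Assumed sample: a 28-byte header with 3 relocation entries.
header = bytearray(28)
header[0:2] = bytes([0x4D, 0x5A])
struct.pack_into("<H", header, 6, 3)

print(is_mz_executable(header), mz_relocation_count(header))
```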

The environment of an EXE program run by DOS is found in its Program Segment Prefix.

EXE files normally have separate segments for the code, data, and stack. Program execution begins at address 0 of the code segment, and the stack pointer register is set to whatever value is contained in the header information (thus if the header specifies a 512 byte stack, the stack pointer is set to 200h). It is possible to not use a separate stack segment and simply use the code segment for the stack if desired.

The DS (data segment) register normally contains the same value as the CS (code segment) register and is not loaded with the actual segment address of the data segment when an EXE file is initialized; it is necessary for the programmer to set it themselves, generally done via the following instructions:

In the original DOS 1.x API, it was also necessary to have the DS register pointing to the segment with the PSP at program termination; this was done via the following instructions:

Program termination would then be performed by a RETF instruction, which would retrieve the original segment address with the PSP from the stack and then jump to address 0, which contained an INT 20h instruction.

The DOS 2.x API introduced a new program termination function, INT 21h Function 4Ch which does not require saving the PSP segment address at the start of the program, and Microsoft advised against the use of the older DOS 1.x method.

Dependency Walker

Dependency Walker or depends.exe is a free program for Microsoft Windows used to list the imported and exported functions of a portable executable file. It also displays a recursive tree of all the dependencies of the executable file (all the files it requires to run). Dependency Walker was included in Microsoft Visual Studio until Visual Studio 2005 (Version 8.0) and in the Windows XP SP2 support tools. The latest version, v2.2.10011, is not available on the dependencywalker.com website but is included in the Windows Driver Kit v10.

As of Windows 7, Microsoft introduced the concept of Windows API-sets, a form of DLL redirection. Dependency Walker has not been updated to handle this layer of indirection gracefully, and when used on Windows 7 and later it will likely show multiple errors. Dependency Walker can still be used for some application level debugging despite this.

As of October 2017 an Open Source C# rewrite of Dependency Walker called Dependencies.exe has been released on github.com. It does not yet offer the full range of Dependency Walker features, but has been updated to handle Windows API-sets and WinSxS (side-by-side assemblies).

Dynamic-link library

Dynamic-link library (or DLL) is Microsoft's implementation of the shared library concept in the Microsoft Windows and OS/2 operating systems. These libraries usually have the file extension DLL, OCX (for libraries containing ActiveX controls), or DRV (for legacy system drivers).

The file formats for DLLs are the same as for Windows EXE files – that is, Portable Executable (PE) for 32-bit and 64-bit Windows, and New Executable (NE) for 16-bit Windows. As with EXEs, DLLs can contain code, data, and resources, in any combination.

Data files with the same file format as a DLL, but with different file extensions and possibly containing only resource sections, can be called resource DLLs. Examples of such DLLs include icon libraries, sometimes having the extension ICL, and font files, having the extensions FON and FOT.

Executable compression

Executable compression is any means of compressing an executable file and combining the compressed data with decompression code into a single executable. When this compressed executable is executed, the decompression code recreates the original code from the compressed code before executing it. In most cases this happens transparently so the compressed executable can be used in exactly the same way as the original. Executable compressors are often referred to as "runtime packers", "software packers", "software protectors" (or even "polymorphic packers" and "obfuscating tools").

A compressed executable can be considered a self-extracting archive, where a compressed executable is packaged along with the relevant decompression code in an executable file. Some compressed executables can be decompressed to reconstruct the original program file without being directly executed. Two programs that can be used to do this are CUP386 and UNP.

Most compressed executables decompress the original code in memory and most require slightly more memory to run (because they need to store the decompressor code, the compressed data and the decompressed code). Moreover, some compressed executables have additional requirements, such as those that write the decompressed executable to the file system before executing it.

Executable compression is not limited to binary executables, but can also be applied to scripts, such as JavaScript. Because most scripting languages are designed to work on human-readable code, which has a high redundancy, compression can be very effective and as simple as replacing long names used to identify variables and functions with shorter versions and/or removing white-space.
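The idea of bundling compressed data with its own decompression code can be shown for a script with a short sketch. It uses Python's zlib and base64 modules rather than a real packer, and the function name pack_script is hypothetical.

```python
import base64
import zlib


def pack_script(source):
    """Return a new Python script that decompresses and runs `source`."""
    payload = base64.b64encode(zlib.compress(source.encode("utf-8")))
    # The stub is the "decompression code"; the payload is the compressed
    # original. Running the stub recreates the source and executes it.
    return (
        "import base64, zlib\n"
        "exec(zlib.decompress(base64.b64decode("
        + repr(payload.decode("ascii"))
        + ")).decode('utf-8'))\n"
    )


original = "result = 6 * 7\n"
packed = pack_script(original)

namespace = {}
exec(packed, namespace)  # behaves like running the original script
print(namespace["result"])
```

For a script this small the packed form is larger than the original; the approach only pays off on longer, more redundant input.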

FASM

FASM (flat assembler) is an assembler for x86 processors. It supports Intel-style assembly language on the IA-32 and x86-64 computer architectures. It claims high speed, size optimizations, operating system (OS) portability, and macro abilities. It is a low-level assembler and intentionally uses very few command-line options. It is free and open-source software.

All versions of FASM can directly output any of the following: flat "raw" binary (usable also as DOS COM executable or SYS driver), objects: Executable and Linkable Format (ELF) or Common Object File Format (COFF) (classic or MS-specific), or executables in either MZ, ELF, or Portable Executable (PE) format (including WDM drivers, allows custom MZ DOS stub). An unofficial port targeting the ARM architecture (FASMARM) also exists.

GNOME Archive Manager

Archive Manager (previously File Roller) is the archive manager of the GNOME desktop environment.

Google Native Client

Google Native Client (NaCl) is a sandboxing technology for running either a subset of Intel x86, ARM, or MIPS native code, or a portable executable, in a sandbox. It allows safely running native code from a web browser, independent of the user operating system, allowing web apps to run at near-native speeds, which aligns with Google's plans for Chrome OS. It may also be used for securing browser plugins, and parts of other applications or full applications such as ZeroVM.

To demonstrate the readiness of the technology, on 9 December 2011, Google announced the availability of several new Chrome-only versions of games known for their rich and processor-intensive graphics, including Bastion (no longer supported on the Chrome Web Store). NaCl runs hardware-accelerated 3D graphics (via OpenGL ES 2.0), sandboxed local file storage, dynamic loading, full screen mode, and mouse capture. There are also plans to make NaCl available on handheld devices.

Portable Native Client (PNaCl) is an architecture-independent version. PNaCl apps are compiled ahead-of-time. PNaCl is recommended over NaCl for most use cases. The general concept of NaCl (running native code in web browser) has been implemented before in ActiveX, which, while still in use, has full access to the system (disk, memory, user-interface, registry, etc.). Native Client avoids this issue by using sandboxing.

An alternative of sorts to NaCl is asm.js, which also allows applications written in C or C++ to be compiled to run in the browser (at more than half the native speed), and also supports ahead-of-time compilation, but is a subset of JavaScript and hence backwards-compatible with browsers that do not support it directly. Another alternative (while it may initially be powered by PNaCl) is WebAssembly.

On October 12, 2016, a comment on the Chromium issue tracker indicated that Google's Pepper and Native Client teams had been destaffed. On May 30, 2017, Google announced deprecation of PNaCl in favor of WebAssembly. Although Google initially planned to remove PNaCl in the first quarter of 2018, the removal is currently planned for the second quarter of 2019 (except for Chrome Apps).

ILAsm

ILAsm (IL Assembler) generates a portable executable (PE) file from a text representation of Common Intermediate Language (CIL) code. It is not to be confused with NGEN (Native Image Generator), which compiles Common Intermediate Language code into native code as a .NET assembly is deployed.

JSmooth

JSmooth is a tool for wrapping Java JAR files into Windows Portable Executable EXE files. It allows specifying various details on how the program should be invoked, such as:

Executable icon

Program arguments

Type of wrapper application (console or Windows GUI)

Whether to launch the Java VM in the same process as the EXE or a separate process

Maximum and initial memory allocation

System properties available to the application via the System.getProperty function

JSmooth is distributed under the GNU General Public License, and is written in Java using Swing. Generated executables are built on MinGW, and as such there is no dependency on proprietary software.

Unlike other exe wrappers, JSmooth is 100%-Java, and can be used to create the Windows executable from a Linux compilation-chain (an ANT task is provided).

Matt Pietrek

Matt Pietrek (born January 27, 1966) is an American computer specialist and author specializing in Microsoft Windows.

Pietrek has written several books on the subject and, for eight years, wrote the column "Under the Hood" in MSJ (and subsequently MSDN Magazine). As of April 2004 he has been working at Microsoft, initially on Visual Studio.

Pietrek also has a keen interest in cocktails and spirits, and he writes a blog on the subject.

Norton Power Eraser

Norton Power Eraser (NPE) is a small portable executable which uses Norton Insight in-the-cloud application ratings to scan a computer system. The program matches an application found on the user's computer with a list of trusted and malicious applications. If it's in the list of trusted applications, Power Eraser leaves it on the system. If it is in the list of bad applications, it is marked for deletion. If it is unknown and not in any list, it is reported as suspicious but not marked for removal. Instead, the program recommends a "remote scan", which will upload the file to Symantec's servers to check it with virus definitions.

Pe

Pe may refer to:

Language:

Pe language

Pe (Cyrillic), a letter (П) in the Cyrillic alphabet

Pe (letter), a letter (פ ,ف, etc.) in several Semitic alphabets

Pe (Persian), this letter (پ) in the Arabic alphabet

In mathematics, science, and technology:

Weierstrass p (also called "pe"), a mathematical letter (℘) used in Weierstrass's elliptic functions and power sets

Péclet number (abbreviated "Pe."), a dimensionless number used in physics

Pe (text editor), a text editor for BeOS

Petlyakov, Russian aircraft design bureau

Pulmonary emphysema, a lung disease

Pulmonary embolism, a medical condition

Portable Executable, a Microsoft Windows executable file format

Provider Edge, an Edge network router

Places:

Pe, Tibet, a town on the Yarlung Tsangpo River

Pe, one of two cities which merged into Buto

.pe, the Internet country code top-level domain (ccTLD) for Peru

Prince Edward Island, Canada (postal abbreviation PE)

Resource (Windows)

In Microsoft Windows, resources are read-only data embedded in portable executable files like EXE, DLL, CPL, SCR, SYS or (beginning with Windows Vista) MUI files. The Windows API provides easy access to all of an application's resources.

Zmist

Zmist (also known as Z0mbie.Mistfall) is a metamorphic computer virus created by the Russian virus writer known as Z0mbie. It was the first virus to use a technique known as "code integration". In the words of Ferrie and Ször:

This virus supports a unique new technique: code integration. The Mistfall engine contained in it is capable of decompiling Portable Executable files to [their] smallest elements, requiring 32 MB of memory. Zmist will insert itself into the code: it moves code blocks out of the way, inserts itself, regenerates code and data references, including relocation information, and rebuilds the executable.

This page is based on a Wikipedia article written by authors (here).
Text is available under the CC BY-SA 3.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.