IA-32 supports three operating modes and one quasi-operating mode:
- Protected mode is the native operating mode of the processor. It provides a rich set of architectural features, flexibility, high performance, and backward compatibility.
- Real-address mode or "real mode" provides the programming environment of the Intel 8086 processor, with a few extensions, such as the ability to switch to protected or system management mode. Whenever a reset or a power-on happens, the system transitions back to real-address mode.
- System management mode (SMM) is a standard architectural feature in all IA-32 processors, beginning with the 386 SL. This mode provides an operating system or executive with a transparent mechanism for implementing power management and OEM differentiation features. SMM is entered through activation of an external system interrupt pin, which generates a System Management Interrupt (SMI). In SMM, the processor switches to a separate address space while saving the context of the currently running program or task. SMM-specific code may then be executed transparently. Upon returning from SMM, the processor is placed back into its state prior to the SMI. The system firmware is usually responsible for creating an SMI handler, which may periodically take over the system from the host OS. Legitimate workarounds are executed in the SMI handler, and the handling and logging of errors may happen at the system level. As this presents a potential security issue, there is also a lock bit that resists tampering with this mechanism. Vendors of real-time operating systems often recommend disabling this feature because it could subvert the OS environment. If SMM is disabled, the additional work of the SMI handler must be incorporated into the RTOS for that platform; otherwise the system may miss something important in the way of error response or workarounds.
- Virtual-8086 mode is a quasi-operating mode supported by the processor in protected mode. This mode allows the processor to execute 8086 software in a protected, multitasking environment.
The Intel 64 architecture supports all operating modes of IA-32 architecture plus IA-32e mode. In IA-32e mode, the processor supports two sub-modes: compatibility mode and 64-bit mode. Compatibility mode allows most legacy protected-mode applications to run unchanged, while 64-bit mode provides 64-bit linear addressing and support for physical address space larger than 64 GB.
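The relationship between these modes can be sketched as a small classifier over the architectural state bits that select them. This is an illustrative sketch, not firmware code: the function name, the enum, and the idea of passing register values in as plain integers are assumptions made for the example. SMM is left out because it is entered through an SMI rather than selected by these bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions. */
#define CR0_PE     (1u << 0)    /* CR0.PE: protected mode enable      */
#define EFLAGS_VM  (1u << 17)   /* EFLAGS.VM: virtual-8086 mode       */
#define EFER_LMA   (1ull << 10) /* IA32_EFER.LMA: IA-32e mode active  */

/* Hypothetical enum for this example only. */
typedef enum {
    MODE_REAL,
    MODE_PROTECTED,
    MODE_VIRTUAL_8086,
    MODE_COMPATIBILITY,
    MODE_64BIT
} cpu_mode_t;

/*
 * Classify the operating mode from register snapshots.
 * cs_long is the L bit of the current code-segment descriptor,
 * which separates 64-bit mode from compatibility mode.
 */
cpu_mode_t classify_cpu_mode(uint32_t cr0, uint32_t eflags,
                             uint64_t efer, bool cs_long)
{
    if (!(cr0 & CR0_PE))
        return MODE_REAL;
    if (efer & EFER_LMA)
        return cs_long ? MODE_64BIT : MODE_COMPATIBILITY;
    if (eflags & EFLAGS_VM)
        return MODE_VIRTUAL_8086;
    return MODE_PROTECTED;
}
```

The ordering of the checks mirrors the hierarchy in the text: virtual-8086 mode only exists inside protected mode, and the two IA-32e sub-modes are distinguished per code segment.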
Figure 2 shows how the processor moves between operating modes.
When the processor is first powered on, it will be in a special mode similar to real mode, but with the top 12 address lines asserted high. This aliasing allows the boot code to be accessed directly from nonvolatile RAM (physical address
Upon execution of the first long jump, these 12 address lines will be driven according to instructions by firmware. If one of the protected modes is not entered before the first long jump, the processor will enter real mode, with only 1 MB of addressability. In order for real mode to work without memory, the chipset needs to be able to alias a range of memory below 1 MB to an equivalent range just below 4 GB. Certain chipsets do not have this aliasing and may require a switch to another operating mode before performing the first long jump. The processor also invalidates the internal caches and translation look-aside buffers.
The processor continues to boot in real mode. There is no particular technical reason for the boot sequence to occur in real mode. Some speculate that this feature is maintained in order to ensure that the platform can boot legacy code such as MS-DOS. While this is a valid issue, there are other factors that complicate a move to protected-mode booting. The change would need to be introduced and validated among a broad ecosystem of manufacturers and developers, for example. Compatibility issues would arise in test and manufacturing environments. These and other natural hurdles keep boot mode "real."
The first power-on mode is actually a special subset of real mode. The top 12 address lines are held high, thus allowing aliasing, in which the processor can execute code from nonvolatile storage (such as flash memory) located within the lowest one megabyte as if it were located at the top of memory.
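The effect of holding the top 12 address lines high can be modeled with simple arithmetic. The sketch below assumes a 32-bit physical address bus, so the top 12 lines correspond to the mask 0xFFF00000; the function names are hypothetical, and the segment:offset pair in the usage note is only an illustrative input.

```c
#include <stdint.h>

/* Top 12 of 32 address lines (assumption: 32-bit physical addressing). */
#define RESET_ALIAS_MASK 0xFFF00000u

/* Classic real-mode translation: segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/*
 * Address actually driven on the bus while the top 12 lines are
 * forced high: a location in the low megabyte appears at the same
 * offset just below 4 GB.
 */
uint32_t reset_alias_address(uint32_t linear_below_1mb)
{
    return linear_below_1mb | RESET_ALIAS_MASK;
}
```

For example, `reset_alias_address(real_mode_linear(0xF000, 0xFFF0))` maps the low-megabyte address 0xFFFF0 to 0xFFFFFFF0, 16 bytes below the 4 GB boundary.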
Normal operation of firmware (including the BIOS) is to switch to flat protected mode as early in the boot sequence as possible. It is usually not necessary to switch back to real mode unless executing an option ROM that makes legacy software interrupt calls. Flat protected mode runs 32-bit code and physical addresses are mapped one-to-one with logical addresses (that is, paging is off). The interrupt descriptor table is used for interrupt handling. This is the recommended mode for all BIOS/boot loaders.
The early phase of the BIOS/bootloader initializes the memory and processor cores. In a BIOS constructed in accord with the Unified EFI Forum's UEFI 2.0 framework, the security and Pre-EFI Initialization (PEI) phases are normally synonymous with "early initialization." This holds whether a legacy or a UEFI BIOS is used: from a hardware point of view, the early initialization sequence is the same.
In a multicore system, the bootstrap processor is the CPU core (or thread) that is chosen to boot the system firmware, which is normally single-threaded. At RESET, all of the processors race for a semaphore flag bit in the chipset. The first processor finds the flag clear and, in the process of reading it, sets it; the other processors find the flag set and enter a wait-for-SIPI (Start-up Inter-Processor Interrupt) or halt state. The first processor initializes main memory and the Application Processors (APs), then continues with the rest of the boot process.
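The race described above behaves like an atomic test-and-set. The following sketch models it in portable C11; the flag variable is a hypothetical stand-in for the chipset's semaphore bit, and real hardware performs the read-and-set in the chipset rather than through `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the chipset's semaphore flag bit. */
static atomic_flag bsp_semaphore = ATOMIC_FLAG_INIT;

/*
 * Called once by every processor at reset (in this model).
 * Test-and-set returns the previous value of the flag, so exactly
 * one caller sees "clear" and becomes the bootstrap processor.
 * All later callers see "set" and would go on to wait for a SIPI.
 */
bool try_become_bsp(void)
{
    return !atomic_flag_test_and_set(&bsp_semaphore);
}
```

Because the read and the set happen as one indivisible step, two processors can never both observe the flag as clear, which is what guarantees a single bootstrap processor.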
A multiprocessor system does not truly enter multiprocessing operation until the OS takes over. While it is possible to do a limited amount of parallel processing during the UEFI boot phase, such as memory initialization on multi-socket designs, any true multithreading activity would require changes to the Driver Execution Environment (DXE) phase of UEFI. Without obvious benefits, such changes are unlikely to be broadly adopted.
The early initialization phase next readies the bootstrap processor and I/O peripherals' base address registers, which are needed to configure the memory controller. The device-specific portion of an Intel architecture memory map is highly configurable. Most devices are seen and accessed via a logical Peripheral Component Interconnect (PCI) bus hierarchy. Device control registers are mapped to a predefined I/O or memory-mapped I/O space, and they can be set up before the memory map is configured. This allows the early initialization firmware to configure the memory map of the device as needed to set up DRAM. Before DRAM can be configured, the firmware must establish the exact configuration of DRAM that is on the board. The Intel Architecture reference platform memory map is described in detail in Chapter 6 of my book, Quick Boot: A Guide for Embedded Firmware Developers, from Intel Press.
System-on-a-chip (SOC) devices based on other processor architectures typically provide a static address map for all internal peripherals, with external devices connected via a bus interface. Bus-based devices are mapped to a memory range within the SOC address space. These SOC devices usually provide a configurable chip-select register set to specify the base address and size of the memory range enabled by the chip select. SOCs based on Intel Architecture primarily use the logical PCI infrastructure for internal and external devices.
The location of the device in the host's memory address space is defined by the PCI Base Address Register (BAR) for each of the devices. The device initialization typically enables all the BAR registers for the devices required for system boot. The BIOS will assign all devices in the system a PCI base address by writing the appropriate BAR registers during PCI enumeration. Long before full PCI enumeration, the BIOS must enable the PCI Express (PCIe) BAR as well as the Platform Controller Hub (PCH) Root Complex Base Address Register (RCBA) BAR for memory, I/O, and memory-mapped I/O (MMIO) interactions during the early phase of boot. Depending on the chipset, there are prefetchers that can be enabled at this point to speed up data transfer from the flash device. There may also be Direct Media Interface (DMI) link settings that must be tuned for optimal performance.
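As a rough illustration of what writing and sizing a BAR involves, the sketch below shows two pure helper functions. It assumes the legacy PCI configuration mechanism (an address written to I/O port 0xCF8, data accessed through 0xCFC), which the text above does not name, and a 32-bit memory BAR; both function names are hypothetical, and the actual port I/O is omitted so the code can run anywhere.

```c
#include <stdint.h>

/*
 * Build the 32-bit value written to the configuration address port
 * to select a register of a given bus/device/function.
 * Bit 31 is the enable bit; the register offset is dword aligned.
 */
uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func,
                            uint8_t offset)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1Fu) << 11)
         | ((uint32_t)(func & 0x07u) << 8)
         | (uint32_t)(offset & 0xFCu);
}

/*
 * Decode the size of a 32-bit memory BAR from the value read back
 * after writing all ones to it. The low four bits are type flags,
 * not address bits, so they are masked off first.
 * Returns 0 for an unimplemented BAR.
 */
uint32_t pci_mem_bar_size(uint32_t readback)
{
    uint32_t mask = readback & 0xFFFFFFF0u;
    return mask ? (~mask + 1u) : 0u;
}
```

A device that reads back 0xFFF00000 after the all-ones write is requesting a 1 MB window, so the firmware must place its base address on a 1 MB boundary.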
The next step, initialization of the CPU, requires simple configuration of processor and machine registers, loading a microcode update, and enabling the Local APIC (LAPIC).
Microcode is a hardware layer of instructions involved in the implementation of the machine-defined architecture. It is most prevalent in CISC-based processors. Microcode is developed by the CPU vendor and incorporated into an internal CPU ROM during manufacture. Since the infamous "Pentium flaw," Intel processor architecture allows that microcode to be updated in the field either through a BIOS update or via an OS update.
Today, an Intel processor must have the latest microcode update to be considered a warranted CPU. Intel provides microcode updates that are written to the microcode store in the CPU. The updates are encrypted and signed by Intel such that only the processor that the microcode update was designed for can authenticate and load the update. On socketed systems, the BIOS may have to carry many flavors of microcode update depending on the number of processor steppings supported. It is important to load microcode updates early in the boot sequence to limit the exposure of the system to known errata in the silicon. Note that the microcode update may need to be reapplied to the CPU after certain reset events in the boot sequence.
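Choosing among several carried updates can be sketched as a table lookup keyed on the processor signature. The structure below is a deliberately simplified, hypothetical view of an update (a real update header carries more fields, and the processor itself authenticates the payload); the function name and the rule of picking the highest newer revision are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical descriptor for one carried update. */
typedef struct {
    uint32_t processor_signature; /* signature the update targets */
    uint32_t update_revision;     /* revision of this update      */
} ucode_entry_t;

/*
 * Return the newest update that targets cpu_signature and is newer
 * than the revision already loaded, or NULL if none applies.
 */
const ucode_entry_t *select_microcode(const ucode_entry_t *table,
                                      size_t count,
                                      uint32_t cpu_signature,
                                      uint32_t current_revision)
{
    const ucode_entry_t *best = NULL;

    for (size_t i = 0; i < count; i++) {
        if (table[i].processor_signature != cpu_signature)
            continue; /* built for a different stepping */
        if (table[i].update_revision <= current_revision)
            continue; /* not newer than what is loaded */
        if (best == NULL ||
            table[i].update_revision > best->update_revision)
            best = &table[i];
    }
    return best;
}
```

Running the same selection again after a reset event is one simple way to handle the case where the update has to be reapplied.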
Next, the LAPICs must be enabled to handle interrupts that occur before enabling protected mode.
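Enabling a LAPIC comes down to setting two enable bits: the global enable in the IA32_APIC_BASE MSR and the software enable in the spurious-interrupt vector register. The helpers below only compute the register values; the actual MSR and MMIO accesses are omitted, the function names are hypothetical, and the address mask assumes the APIC base sits below 4 GB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit fields of the IA32_APIC_BASE MSR. */
#define APIC_BASE_BSP       (1ull << 8)   /* set on the bootstrap processor */
#define APIC_BASE_ENABLE    (1ull << 11)  /* APIC global enable             */
#define APIC_BASE_ADDR_MASK 0xFFFFF000ull /* assumption: base below 4 GB    */

/* Software-enable bit in the spurious-interrupt vector register. */
#define APIC_SVR_ENABLE     (1u << 8)

/* MSR value with the global enable bit set. */
uint64_t apic_base_with_enable(uint64_t msr)
{
    return msr | APIC_BASE_ENABLE;
}

/* True if this MSR value belongs to the bootstrap processor. */
bool apic_base_is_bsp(uint64_t msr)
{
    return (msr & APIC_BASE_BSP) != 0;
}

/* Physical base address of the LAPIC register window. */
uint64_t apic_base_address(uint64_t msr)
{
    return msr & APIC_BASE_ADDR_MASK;
}

/* SVR value with the software enable set and the given spurious vector. */
uint32_t apic_svr_with_enable(uint32_t svr, uint8_t vector)
{
    return (svr & ~0xFFu) | vector | APIC_SVR_ENABLE;
}
```

The BSP flag in the same MSR also gives early code a cheap way to tell whether it is running on the bootstrap processor or on an application processor.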
Software initialization code must load a minimum number of protected-mode data structures and code modules into memory to support reliable operation of the processor in protected mode. These data structures include an Interrupt Descriptor Table (IDT), a Global Descriptor Table (GDT), a Task-State Segment (TSS), and, optionally, a Local Descriptor Table (LDT). If paging is to be used, at least one page directory and one page table must be loaded. A code segment containing the code to be executed when the processor switches to protected mode must also be loaded, as well as one or more code modules that contain necessary interrupt and exception handlers.
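To make the GDT requirement concrete, the sketch below packs a base, limit, access byte, and flags nibble into the 8-byte segment descriptor layout. The function name is hypothetical; the example values in the usage note describe a flat 4 GB code and data segment pair of the kind used in flat protected mode.

```c
#include <stdint.h>

/*
 * Pack one 8-byte segment descriptor.
 *   base   : 32-bit segment base
 *   limit  : 20-bit segment limit
 *   access : access byte (present, DPL, type)
 *   flags  : upper nibble (G, D/B, L, AVL)
 */
uint64_t gdt_encode(uint32_t base, uint32_t limit,
                    uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    d |= (uint64_t)(limit & 0xFFFFu);             /* limit 15:0  -> bits 15:0  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base 23:0   -> bits 39:16 */
    d |= (uint64_t)access << 40;                  /* access byte -> bits 47:40 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit 19:16 -> bits 51:48 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* flags       -> bits 55:52 */
    d |= (uint64_t)(base >> 24) << 56;            /* base 31:24  -> bits 63:56 */
    return d;
}
```

A minimal flat GDT then holds three entries: the mandatory null descriptor `gdt_encode(0, 0, 0, 0)`, a code segment `gdt_encode(0, 0xFFFFF, 0x9A, 0xC)`, and a data segment `gdt_encode(0, 0xFFFFF, 0x92, 0xC)`. With the granularity flag set, the 20-bit limit is counted in 4 KB units, so both segments span the full 4 GB.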
Initialization code must also initialize certain system registers. The global descriptor table register must be initialized, along with control registers CR1 through CR4. The IDT register may be initialized immediately after switching to protected mode, prior to enabling interrupts. Memory Type Range Registers (MTRRs) are also initialized.
With these data structures, code modules, and system registers initialized, the processor can be switched to protected mode. This is accomplished by loading control register CR0 with a value that sets the PE (protected mode enable) flag. From this point onward, it is likely that the system will not enter real mode again, legacy option ROMs and legacy OS/BIOS interface notwithstanding, until the next hardware reset is experienced.
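The switch itself is a read-modify-write of CR0. The helper below computes only the new register value, since the privileged move to CR0 cannot run outside ring 0; the function name and the `enable_paging` parameter are hypothetical additions for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE (1u << 0)   /* protected mode enable */
#define CR0_PG (1u << 31)  /* paging enable         */

/*
 * Compute the CR0 value for entering protected mode.
 * All other bits are preserved so that cache and coprocessor
 * settings chosen earlier are not disturbed.
 */
uint32_t cr0_for_protected_mode(uint32_t cr0, bool enable_paging)
{
    cr0 |= CR0_PE;
    if (enable_paging)
        cr0 |= CR0_PG;
    return cr0;
}
```

In flat protected mode paging stays off, so early firmware would typically pass `false` here and follow the CR0 write with a far jump that reloads the code segment from the GDT.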
Since no DRAM is available at this point, code initially operates in a stackless environment. Most modern processors have an internal cache that can be configured as RAM to provide a software stack. Developers must write extremely tight code when using this cache-as-RAM feature because an eviction would be unacceptable to the system at this point in the boot sequence; there is no memory to maintain coherency. That's why processors operate in "No Evict Mode" (NEM) at this point in the boot process, when they are operating on a cache-as-RAM basis. In NEM, a cache-line miss in the processor will not cause an eviction. Developing code with an available software stack is much easier, and initialization code often performs the minimal setup to use a stack even prior to DRAM initialization.
The processor may boot into a slower than optimal mode for various reasons. It may be considered less risky to run in a slower mode, or it may be done to save power. The BIOS may force the speed to something appropriate for a faster boot. This additional optimization is optional; the OS will likely have the drivers to deal with this parameter when it loads.