AP Initialization
Even in SoCs, multiple CPU cores are likely. The system may be viewed as one bootstrap processor (BSP) plus one or more application processors (APs). The BSP starts and initializes the system. The APs must be initialized with identical features. Before memory is activated, the APs are uninitialized. After memory is started, the remaining processors are initialized and left in a wait-for-SIPI state. To accomplish this, the system firmware must:
- Find microcode and copy it to memory.
- Find the CPU code in the Serial Peripheral Interface (SPI) flash and copy it to memory, an important step to avoid execution-in-place for the remainder of the sequence.
- Send start-up interprocessor interrupts to all processors.
- Disable all no-eviction mode (NEM) settings, if this has not already been done.
- Load microcode updates on all processors.
- Enable cache-on mode for all processors.
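The microcode steps in the list above can be illustrated with a minimal sketch of how firmware might decide whether a microcode image copied to memory applies to the running processor. The header layout follows the format Intel documents for microcode updates; the helper name `ucode_match` is hypothetical, a little-endian host is assumed, and the sketch deliberately ignores the extended signature table and the platform-ID flags that a real loader would also check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 48-byte header that precedes each Intel microcode update. */
typedef struct {
    uint32_t header_version;
    uint32_t update_revision;
    uint32_t date;
    uint32_t processor_signature;  /* compared against CPUID.1:EAX */
    uint32_t checksum;
    uint32_t loader_revision;
    uint32_t processor_flags;
    uint32_t data_size;            /* 0 means 2000 bytes of data */
    uint32_t total_size;           /* 0 means 2048 bytes in total */
    uint32_t reserved[3];
} ucode_header_t;

/*
 * Hypothetical helper: returns the size of the update in bytes if it
 * targets cpu_signature and all of its dwords sum to zero, else 0.
 */
static uint32_t ucode_match(const void *image, size_t avail,
                            uint32_t cpu_signature)
{
    ucode_header_t hdr;
    const uint8_t *bytes = image;
    uint32_t total, sum = 0;

    assert(image != NULL);
    if (avail < sizeof hdr)
        return 0;
    memcpy(&hdr, image, sizeof hdr);

    total = hdr.total_size ? hdr.total_size : 2048u;
    if (total > avail || (total % 4u) != 0)
        return 0;
    if (hdr.processor_signature != cpu_signature)
        return 0;

    /* The update is valid only if the dword sum wraps to zero. */
    for (uint32_t off = 0; off < total; off += 4u) {
        uint32_t word;
        memcpy(&word, bytes + off, sizeof word);
        sum += word;
    }
    return sum == 0 ? total : 0;
}
```

Because every processor must end up with the same update, the BSP would typically run a check like this once and then point each AP at the same in-memory copy.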
From a UEFI perspective, AP initialization may be part of either the PEI or the DXE phase of the boot flow; in other words, it may belong to either early or advanced initialization. Its final location is still a matter of debate.
Since Intel processors are packaged in various configurations, there are different terms that must be understood when considering processor initialization. In this context, a thread is a logical processor that shares resources with another logical processor in the same physical package. A core is a processor that coexists with another processor in the same physical package and does not share any resources with other processors. A package is a chip that contains any number of cores and threads.
Threads and cores on the same package are detectable by executing the CPUID instruction. Detection of additional packages must be done blindly. If a design must accommodate more than one physical package, the BSP needs to wait a certain amount of time for all potential APs in the system to "log in." Once a timeout occurs or the maximum expected number of processors "log in," it can be assumed that there are no more processors in the system.
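The "log in" wait described above might be sketched as a bounded polling loop. Everything named here is an assumption for illustration: the shared counter `logged_in` (which each AP would increment from its start-up code), the `delay_fn` callback standing in for a platform-specific delay, and the poll budget that plays the role of the timeout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical platform hook that stalls for one polling interval. */
typedef void (*delay_fn)(void *ctx);

/*
 * Poll the shared log-in counter until either max_expected processors
 * have checked in or max_polls intervals have elapsed. Returns the
 * number of APs seen, which the BSP then treats as final.
 */
static uint32_t wait_for_ap_login(const volatile uint32_t *logged_in,
                                  uint32_t max_expected,
                                  uint32_t max_polls,
                                  delay_fn delay, void *ctx)
{
    assert(logged_in != NULL);

    for (uint32_t i = 0; i < max_polls; i++) {
        if (*logged_in >= max_expected)
            break;              /* everyone expected has logged in */
        if (delay != NULL)
            delay(ctx);         /* otherwise wait one more interval */
    }
    return *logged_in;
}
```

The counter is declared `volatile` because the APs update it behind the BSP's back; a real implementation would also need the APs to increment it atomically.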
To wake secondary threads or cores, the BSP sends a start-up IPI (SIPI) to each thread and core. The SIPI is sent through the BSP's local APIC (LAPIC) and indicates the physical address at which the AP should start executing. This address must be below 1 MB and aligned on a 4-KB boundary. On receipt of the SIPI, the AP starts executing the code at that address. Unlike the BSP, the AP starts code execution in real mode, which is why the code it starts executing must be located below 1 MB.
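The address constraints follow from how the SIPI carries its target: the message holds an 8-bit vector, and the AP begins at physical address vector × 4 KB. The sketch below, with illustrative function and macro names that are not from any particular firmware code base, checks a start address and builds the xAPIC interrupt command register (ICR) values for a SIPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fields of the low dword of the LAPIC interrupt command register. */
#define ICR_DELIVERY_STARTUP   (6u << 8)   /* delivery mode 110b */
#define ICR_LEVEL_ASSERT       (1u << 14)
#define ICR_DEST_ALL_EXCL_SELF (3u << 18)  /* shorthand: all but self */

/* A SIPI target must lie below 1 MB on a 4-KB boundary. */
static bool sipi_address_ok(uint32_t phys)
{
    return phys < 0x100000u && (phys & 0xFFFu) == 0;
}

/* The vector is the 4-KB page number of the AP start-up code. */
static uint8_t sipi_vector(uint32_t phys)
{
    assert(sipi_address_ok(phys));
    return (uint8_t)(phys >> 12);
}

/* ICR low dword for a SIPI broadcast to every processor but the sender. */
static uint32_t sipi_icr_low_broadcast(uint32_t phys)
{
    return ICR_DEST_ALL_EXCL_SELF | ICR_LEVEL_ASSERT |
           ICR_DELIVERY_STARTUP | sipi_vector(phys);
}

/* ICR low dword for a SIPI aimed at one AP (no shorthand). */
static uint32_t sipi_icr_low_targeted(uint32_t phys)
{
    return ICR_LEVEL_ASSERT | ICR_DELIVERY_STARTUP | sipi_vector(phys);
}

/* In xAPIC mode the destination APIC ID sits in bits 31:24 of ICR high. */
static uint32_t icr_high_for(uint8_t apic_id)
{
    return (uint32_t)apic_id << 24;
}
```

For start-up code placed at 0x9F000, for example, the vector is 0x9F and the broadcast ICR value is 0x000C469F. In xAPIC mode the firmware would write the high dword first, since writing the low dword is what sends the IPI.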
Because of the different processor combinations and the various attributes of shared processing registers between threads, care must be taken to ensure that there are no caching conflicts in the memory used throughout the system.
AP behavior during firmware initialization depends on the firmware implementation, but it is most commonly restricted to short periods of initialization followed by a HLT instruction, during which the AP awaits direction from the BSP before undertaking another operation.
Once the firmware is ready to attempt to boot an OS, all APs must be placed back in their power-on state. The BSP accomplishes this by sending an INIT assert IPI followed by an INIT de-assert IPI to all APs in the system (that is, to every processor except itself).
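As a rough illustration, the two IPIs can be expressed as a pair of ICR values that the BSP writes in order. The names below are hypothetical, and the choice of the "all excluding self" shorthand for both messages is an assumption taken from the description above; processor documentation should be consulted for the exact de-assert requirements, which vary by processor generation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the low dword of the LAPIC interrupt command register. */
#define ICR_DELIVERY_INIT      (5u << 8)   /* delivery mode 101b */
#define ICR_LEVEL_ASSERT       (1u << 14)  /* level = 1; 0 de-asserts */
#define ICR_TRIGGER_LEVEL      (1u << 15)  /* trigger mode = level */
#define ICR_DEST_ALL_EXCL_SELF (3u << 18)  /* shorthand: all but self */

/*
 * Fill out[] with the ICR low-dword values for the INIT assert and
 * INIT de-assert IPIs, in the order they should be sent. Returns the
 * number of entries written.
 */
static size_t build_init_sequence(uint32_t out[2])
{
    assert(out != NULL);

    /* INIT assert: level flag set, edge triggered. */
    out[0] = ICR_DEST_ALL_EXCL_SELF | ICR_LEVEL_ASSERT | ICR_DELIVERY_INIT;

    /* INIT de-assert: level flag clear, level triggered. */
    out[1] = ICR_DEST_ALL_EXCL_SELF | ICR_TRIGGER_LEVEL | ICR_DELIVERY_INIT;

    return 2;
}
```

Firmware would write each value to the ICR in turn, waiting for the delivery-status bit to clear between the two writes.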
The final part of this article, which will appear in January, covers advanced device installation, memory-map configuration, and all the other steps required to prepare the hardware for loading the operating system.
This article is adapted from material in Intel Technology Journal (March 2011) "UEFI Today: Bootstrapping the Continuum," and portions of it are copyright Intel Corp.
Pete Dice is a software architect in Intel's chipset architecture group. He holds a bachelor of science degree in electrical engineering. Dice has over 19 years of experience in the computing industry, including 15 years at Intel.