Detecting Intel Processors
Knowing the generation of a system CPU
Robert is a design verification manager at Texas Instruments' Microprocessor Design Center. Robert can be reached via e-mail at [email protected].
The debate about the correct way to detect different generations of Intel microprocessors has raged for years. In one corner are programmers who traditionally used a series of PUSHF/POPF instructions to detect the FLAGs differences between processors. In the other corner, it always seemed I stood alone, arguing that this technique is flawed. The debate subsided somewhat in 1989, when Intel published an algorithm that relied upon PUSHF/POPF for microprocessor identification. But even while the naysayers said, "See, even Intel does it our way," I stood in my little corner saying "Sure, but it's wrong."
The truth is, neither algorithm is fail-safe. Intel's PUSHF/POPF method can misdiagnose which processor family is running and is not guaranteed to work outside of real mode. My technique should always run in v86 mode, but sometimes doesn't because of shortcomings in the design of many v86 memory managers, such as EMM386 from Microsoft.
Intel's Algorithm
All current-generation Intel x86 processors have an instruction called CPUID that reads CPU identification information. Software can use this information to take advantage of processor-specific programming techniques at run time. Before CPUID, you needed to write an algorithm to detect the differences between generations of processors. This algorithm served much the same purpose as executing the CPUID instruction. Intel didn't invent the algorithm; the company borrowed one that was in wide distribution on the Internet, and published it in the i486 Microprocessor Programmer's Reference Manual (Intel Corp. 1990), claiming "Copyright Intel Corporation." Oddly, the original algorithm was published in two halves, at opposite ends of the manual. Section 22.10 contained the algorithm to detect the differences between the 8086 through 80386. Figure 3-23 contained the algorithm to detect the difference between the 80386 and 80486. The latest edition of this manual removes the code fragments, referring you to "AP-485, Intel Processor Identification With the CPUID Instruction," Order Number 241618 (ftp://ftp.intel.com/pub/IAL/software_specs/ap48504f.pdf).
AP-485 includes the following comment:
Please understand that the code sequences have been validated by Intel to detect CPU_ID, math coprocessor function, and initialize accordingly. Any other approach may produce unpredictable results in future processors.
It's ironic that Intel claims that "any other approach may produce unpredictable results," since its algorithm is prone to failures that yield unpredictable results (as I'll demonstrate in this article). For more information on CPUID, see the text box "Pentium Detection," by Robert Moote (which accompanied the article "Processor-Detection Schemes," by Richard C. Leinecker, DDJ, June 1993).
The Intel algorithm relies on a series of PUSHF/POPF instructions to set and clear various FLAGs bits. Each generation of processor has a slightly different behavior which may be detected by this approach. This algorithm makes no attempt to detect the 80186/88 series of processors. In this regard, the algorithm is incomplete.
The 8086/88 is distinguished from the 80286 by attempting to clear bits 12-15 of the FLAGs register. The 8086/88 will always set these bits, regardless of what values are popped into them (see Listing One). The 286 treats these bits differently. In real mode, these bits are always cleared by the 286; in protected mode, they are used for IOPL (I/O Privilege Level) and NT (Nested Task). To continue the detection code, you need to set bits 12-15 in the FLAGs register, and see if they are cleared by the processor. If they are, then a 286 has been detected (see Listing Two).
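The decision made by Listings One and Two can be modeled in portable C as a pure function of the two FLAGS images read back from the stack. This is only a sketch of the logic, not a replacement for the assembly: the names cpu_class and classify_by_flags are hypothetical, and the caller is assumed to supply the values that PUSHF returned after each attempt.

```c
#include <stdint.h>

/* Hypothetical result codes; 0 and 2 match the _cpu_type values in the listings. */
enum cpu_class { CPU_8086 = 0, CPU_80286 = 2, CPU_386_OR_LATER = 3 };

#define FLAGS_BITS_12_15 0xF000u

/*
 * after_clear: FLAGS image read back after trying to clear bits 12-15
 * after_set:   FLAGS image read back after trying to set bits 12-15
 */
static enum cpu_class classify_by_flags(uint16_t after_clear, uint16_t after_set)
{
    /* 8086/88: the bits stay set no matter what was popped. */
    if ((after_clear & FLAGS_BITS_12_15) == FLAGS_BITS_12_15)
        return CPU_8086;

    /* 80286 in real mode: the bits stay clear no matter what was popped. */
    if ((after_set & FLAGS_BITS_12_15) == 0)
        return CPU_80286;

    /* Some of the bits could be set, so this is at least a 386. */
    return CPU_386_OR_LATER;
}
```

For example, illustrative images of F002h/F002h classify as an 8086, 0002h/0002h as an 80286, and 0002h/7002h as a 386 or later.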
If you get beyond this point in the algorithm, you know you have at least a 386. Therefore, it is safe to use 32-bit instructions, like PUSHFD/POPFD. This will be necessary in detecting the difference between a 386 and 486. These processors are distinguished from each other by attempting to set the AC flag in the EFLAGs register. This flag was introduced in the 486. The 386 never sets this bit, and always clears it when it is set by POPFD. Therefore, to detect the difference between these processor generations, the algorithm attempts to set this bit, to see if it is latched or cleared by the processor (see Listing Three).
At this point in the algorithm, you're almost home. To detect the difference between the 486 and the Pentium, you attempt to set another new EFLAGs bit (bit 21) called the "ID flag." This flag has only one purpose: to indicate the presence of the CPUID instruction. This bit was first introduced on the Pentium, but later retrofitted into the 486. If the CPUID instruction exists on either processor, it may be executed to return the processor-identification information. 486s without the CPUID instruction will not be able to toggle this bit. Therefore, it is safe to execute a sequence of instructions on either processor that detects the processor's ability to toggle this bit (see Listing Four).
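Both toggle tests reduce to the same question: did the bit that was flipped before POPFD come back flipped after PUSHFD? The following C sketch models that decision under stated assumptions. The names bit_toggled, classify_386_plus, and the enum are hypothetical, and the caller is assumed to pass in the EFLAGS images that the assembly would have read. Unlike Listings Three and Four, which XOR the whole register, this sketch masks the single bit under test.

```c
#include <stdint.h>

#define EFLAGS_AC (1u << 18)   /* 40000h, introduced on the 486 */
#define EFLAGS_ID (1u << 21)   /* 200000h, indicates CPUID support */

/* Hypothetical result codes for this sketch. */
enum cpu_step { CPU_80386, CPU_80486_NO_CPUID, CPU_HAS_CPUID };

/* Nonzero if 'bit' differs between the original image and the readback. */
static int bit_toggled(uint32_t original, uint32_t readback, uint32_t bit)
{
    return ((original ^ readback) & bit) != 0;
}

/*
 * original:    EFLAGS before either test
 * ac_readback: EFLAGS read back after POPFD of (original ^ EFLAGS_AC)
 * id_readback: EFLAGS read back after POPFD of (original ^ EFLAGS_ID)
 */
static enum cpu_step classify_386_plus(uint32_t original,
                                       uint32_t ac_readback,
                                       uint32_t id_readback)
{
    if (!bit_toggled(original, ac_readback, EFLAGS_AC))
        return CPU_80386;            /* AC cannot be latched */
    if (!bit_toggled(original, id_readback, EFLAGS_ID))
        return CPU_80486_NO_CPUID;   /* AC works, ID does not */
    return CPU_HAS_CPUID;            /* safe to execute CPUID */
}
```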
Once the algorithm gets to this point, you can execute the CPUID instruction to obtain the processor identification. This instruction can be run in any processor mode, at any privilege level. On the Pentium and 486, the CPUID instruction has two levels:
- Level 0 returns the highest supported level in EAX and a vendor ID string in EBX:EDX:ECX, which reads "GenuineIntel" when printed as ASCII text.
- Level 1 returns the processor identification signature, the same signature that appears in the EDX register after a processor RESET (see Listing Five).
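To make the register layout concrete, here is a small C sketch that rebuilds the vendor string from the three level-0 registers and splits a level-1 signature into its fields. It does not execute CPUID itself; the register values are assumed to have been captured elsewhere, and the names cpuid_vendor_string, cpu_signature, and decode_signature are hypothetical. The family field matches the shift and mask at the end of Listing Five; the model and stepping fields occupy the two nibbles below it.

```c
#include <stdint.h>

/* Copy the 12 vendor bytes out of EBX, EDX, ECX (in that order),
   lowest byte of each register first, and terminate the string. */
static void cpuid_vendor_string(uint32_t ebx, uint32_t edx, uint32_t ecx,
                                char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int b = 0; b < 4; b++)
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFFu);
    out[12] = '\0';
}

struct cpu_signature {
    unsigned family;    /* bits 8-11 */
    unsigned model;     /* bits 4-7  */
    unsigned stepping;  /* bits 0-3  */
};

/* Split the level-1 EAX value into family, model, and stepping. */
static struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    s.stepping = eax & 0xFu;
    s.model    = (eax >> 4) & 0xFu;
    s.family   = (eax >> 8) & 0xFu;
    return s;
}
```

With EBX=756E6547h, EDX=49656E69h, and ECX=6C65746Eh the string is "GenuineIntel", and an illustrative signature of 0521h decodes to family 5, model 2, stepping 1.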
The Caveats
In spite of Intel's claim, this algorithm is far from perfect. For one thing, it fails to detect the 80186/88 series of processors. Even though this processor wasn't adopted by many PC manufacturers, it was used in some computers, primarily notebook computers. The 80186/88 processor contains most of the new instructions and CPU-generated exceptions contained in the 80286. These include the PUSHA/POPA, PUSH immed, and SHL reg, immed instructions and the invalid opcode exception. The only 80286 instructions and exceptions not implemented in the 80186/88 are those specifically used for protected mode. Failure to detect this processor could prohibit the use of some software that can take advantage of these new instructions and exceptions.
This algorithm is designed to run only in real mode, not in a virtual-8086 DOS box running under Windows. This limitation is even mentioned in the 486 manual. It arises because PUSHF and POPF are sensitive to the I/O Privilege Level (IOPL) in virtual-8086 mode. (DOS boxes running under Windows run in virtual-8086 mode, a special form of protected mode.) If IOPL is not equal to three, a general-protection fault occurs when these instructions are executed. The operating system then intervenes to emulate the instruction as it sees fit, so there is no guarantee that the operating system will mimic the real-mode behavior of the specific processor under test. In reality, this may not be as big a problem as it sounds. Windows sets IOPL equal to three for DOS boxes. This renders these instructions transparent to the operating system, and they execute without generating a fault.
Not all operating systems with a DOS-compatibility box follow the example set by Windows. OS/2 Warp uses a special form of virtual-8086 mode, called Virtual Mode Extensions (VME). Running in VME affords the protection advantages of running at IOPL=2 without incurring the faults generated by PUSHF/POPF used in this algorithm. (See http://www.x86.org/vme1 for a discussion on VME.) To accommodate this behavior, Intel modified the algorithms of PUSHF/POPF to allow them to run in VME without faulting to the host operating system. When IOPL<3, PUSHF always pushes an IOPL value of three onto the stack. This doesn't cause any problems for the Intel algorithm, as none of the detection code depends upon setting or clearing these two bits alone.
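A short C model shows why the forced IOPL value does not disturb the two 16-bit tests. It is a sketch that models only the IOPL behavior described above; the function names are hypothetical, and the sample FLAGS values are illustrative.

```c
#include <stdint.h>

#define FLAGS_IOPL_MASK  0x3000u   /* bits 12-13 */
#define FLAGS_BITS_12_15 0xF000u

/* Model of the FLAGS image that PUSHF stores under VME when IOPL < 3:
   the IOPL field always reads as three. */
static uint16_t vme_pushf_image(uint16_t flags)
{
    return (uint16_t)(flags | FLAGS_IOPL_MASK);
}

/* Listing One's test: an 8086/88 shows all four bits set. */
static int looks_like_8086(uint16_t image)
{
    return (image & FLAGS_BITS_12_15) == FLAGS_BITS_12_15;
}

/* Listing Two's test: a real-mode 80286 shows all four bits clear. */
static int looks_like_80286(uint16_t image)
{
    return (image & FLAGS_BITS_12_15) == 0;
}
```

Because bit 15 reads as zero on the 32-bit processors, the image can never show all four bits set, and because IOPL reads as three it can never show all four clear. Neither test is fooled.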
Should the CPUID instruction ever return a signed number (for example, 80000001h), the Intel algorithm will fail. In Listing Five, the instruction above the designated <- symbol is a conditional jump based on a signed comparison. This is a common programming error which can easily be fixed in the Intel algorithm.
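The difference between the two comparisons is easy to reproduce in C. The sketch below uses hypothetical function names and assumes the obvious repair, an unsigned comparison (the equivalent of replacing JL with JB). It contrasts the signed test used by the listing with the unsigned one.

```c
#include <stdint.h>

/* Mirrors the signed test (JL): EAX is interpreted as a signed value.
   The cast is implementation-defined before C23; mainstream compilers
   use two's complement, which is what the processor does. */
static int level1_supported_signed(uint32_t max_level)
{
    return (int32_t)max_level >= 1;
}

/* Unsigned test, the equivalent of using JB instead of JL. */
static int level1_supported_unsigned(uint32_t max_level)
{
    return max_level >= 1u;
}
```

For a maximum level of 80000001h the signed version reports that level 1 is unsupported, while the unsigned version reports that it is supported. For small values such as 0 and 1 the two agree.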
This algorithm relies on undocumented processor behavior to detect the differences between early generations of Intel processors. The use of such programming tricks violates Intel's own recommendations. Consider the following guidelines set forth in various Intel manuals:
Reserved Bits and Software Compatibility
Software should not try to identify features by exploiting programming tricks, undocumented features, or otherwise deviating from the guidelines presented in this application note.
When bits are marked as reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknown, effect. The behavior of reserved bits should be regarded as not only undefined, but unpredictable. Software should follow these guidelines in dealing with reserved bits:
- Do not use undocumented features of a processor to identify steppings or features.
- Do not depend on the states of any reserved bits when testing the values of registers which contain such bits. Mask out the reserved bits before testing.
- Do not depend on the states of any reserved bits when storing to memory or to a register.
- Do not depend on the ability to retain information written into any reserved bits.
- When loading a register, always load the reserved bits with the values indicated in the documentation, if any, or reload them with values previously read from the same register.
These are strong guidelines set forth in Intel's documentation, and the irony of Intel's algorithm is that it violates each and every one of them. Detecting the difference between 8086/88 and 80286, and between 80286 and 80386, completely depends upon setting and clearing reserved bits in the FLAGs register, and then depends on the state of those bits when they are stored to a resultant register. Detecting the difference between 386 and 486, and between 486 and Pentium, depends upon setting an EFLAGs bit that is undefined on the previous-generation processor, then depends on that processor to clear the undefined bit. To abide by Intel's guidelines, the behavior of these undocumented FLAGs bits must be documented in their respective manuals, but it isn't. None of these differences are documented in any of the processors' respective data sheets. Processor behavior often isn't documented until many years after release. The 8086 FLAGs behavior was first described in the 386 programmer's reference manual in 1988 (nearly ten years after the 8086's introduction). The 80286 FLAGs behavior wasn't described until the Pentium manuals were introduced in 1993 (ten years after the 80286 introduction, and four years after Intel introduced this algorithm in the 486 manuals).
Even though Intel's algorithm violates all of its own guidelines, the company is partially exonerated by the Pentium programmer's reference manual, where Intel says that it's acceptable to use this algorithm to detect the differences in these processors. However, the Pentium manual doesn't change the prohibitions set forth in the 386 or 486 manuals; those prohibitions still exist. The following excerpt was taken from the Pentium Programmer's Reference Manual, chapter 5:
The setting of the flags stored by the PUSHF instruction, by interrupts, and by exceptions is different on the 32-bit processors than that stored by the 8086, and Intel 286 processors in bits 12 and 13 (IOPL), 14 (NT), and 15 (reserved). These differences can be used to distinguish what type of processor is present in a system while an application is running.
My biggest objection to this algorithm is that it's prone to failure on all processors newer than a 386. When it fails, the algorithm incorrectly determines that a 386 processor is installed in the system. The failure occurs when an interrupt arrives precisely where the <- appears in Listing Three. When this happens, the AC flag is cleared (in real mode), and the algorithm fails to detect the correct processor type. The AC flag has always behaved in this manner, but the behavior wasn't documented until the 1994 edition of the Pentium Programmer's Reference Manual (chapter 25, description of INT instruction). There are a few ways to demonstrate this failure (assuming you're running on a 486 or later processor). You can put an HLT instruction or an INT instruction at the point designated by the <-, or run the algorithm in a loop. Eventually, a timer-tick interrupt will occur at this point. Inserting an HLT instruction will force the processor to wait for an interrupt before continuing. When the interrupt occurs, the AC flag will be cleared during its invocation. Listing Six presents source code to demonstrate this behavior.
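The failure can also be traced on paper with a simplified C model of a 486-class processor in real mode. This is a sketch, not the Listing Six demonstration: the structure and function names are hypothetical, and it assumes a handler that returns with a 16-bit IRET, which restores only the low 16 bits of EFLAGS and therefore cannot put the AC flag back.

```c
#include <stdint.h>

#define EFLAGS_TF (1u << 8)
#define EFLAGS_IF (1u << 9)
#define EFLAGS_AC (1u << 18)

/* Hypothetical model state: EFLAGS plus the 16-bit image on the stack. */
struct cpu_model {
    uint32_t eflags;
    uint16_t saved_flags;
};

/* Real-mode interrupt entry: save the low 16 bits, then clear IF, TF, AC. */
static void real_mode_interrupt(struct cpu_model *c)
{
    c->saved_flags = (uint16_t)c->eflags;
    c->eflags &= ~(EFLAGS_IF | EFLAGS_TF | EFLAGS_AC);
}

/* Assumed 16-bit IRET: only the low 16 bits of EFLAGS are restored. */
static void real_mode_iret16(struct cpu_model *c)
{
    c->eflags = (c->eflags & 0xFFFF0000u) | c->saved_flags;
}

/* Runs the Listing Three test on the model.
   Returns 1 if the AC toggle is observed (486 or later), 0 if not (386). */
static int ac_toggle_observed(int interrupt_at_marker)
{
    struct cpu_model c = { 0x00000202u, 0 };   /* IF set, AC clear */
    uint32_t original = c.eflags;

    c.eflags = original ^ EFLAGS_AC;           /* POPFD with AC flipped */
    if (interrupt_at_marker) {                 /* interrupt at the <- point */
        real_mode_interrupt(&c);
        real_mode_iret16(&c);
    }
    return ((c.eflags ^ original) & EFLAGS_AC) != 0;   /* PUSHFD and compare */
}
```

Without the interrupt the toggle is observed; with an interrupt at the marked point the flag comes back clear, and the same processor is reported as a 386.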
Conclusion
The Intel algorithm isn't nearly as bad as it sounds. It has a few bugs that can easily be fixed. Intel's intentions were noble, but its implementation was flawed (see http://www.x86.org for an updated version of this algorithm). In spite of its drawbacks, the reasons this algorithm is in such widespread use are simple:
- It's conveniently available and published by Intel.
- It works most of the time, even in v86 mode.
Listing One
pushf ; push original FLAGS
pop ax ; get original FLAGS
mov cx, ax ; save original FLAGS
and ax, 0fffh ; clear bits 12-15 in FLAGS
push ax ; save new FLAGS value on stack
popf ; replace current FLAGS value
pushf ; get new FLAGS
pop ax ; store new FLAGS in AX
and ax, 0f000h ; if bits 12-15 are set, then
cmp ax, 0f000h ; processor is an 8086/8088
mov _cpu_type, 0 ; turn on 8086/8088 flag
je end_cpu_type ; jump if processor is 8086/8088
Listing Two
or cx, 0f000h ; try to set bits 12-15
push cx ; save new FLAGS value on stack
popf ; replace current FLAGS value
pushf ; get new FLAGS
pop ax ; store new FLAGS in AX
and ax, 0f000h ; if bits 12-15 are clear
mov _cpu_type, 2 ; processor=80286, turn on 80286 flag
jz end_cpu_type ; if no bits set, processor is 80286
Listing Three
pushfd ; push original EFLAGS
pop eax ; get original EFLAGS
mov ecx, eax ; save original EFLAGS
xor eax, 40000h ; flip AC bit in EFLAGS
push eax ; save new EFLAGS value on stack
popfd ; replace current EFLAGS value
; <-
pushfd ; get new EFLAGS
pop eax ; store new EFLAGS in EAX
xor eax, ecx ; can't toggle AC bit, processor=80386
mov _cpu_type, 3 ; turn on 80386 processor flag
jz end_cpu_type ; jump if 80386 processor
push ecx
popfd ; restore AC bit in EFLAGS first
Listing Four
mov _cpu_type, 4 ; turn on 80486 processor flag
mov eax, ecx ; get original EFLAGS
xor eax, 200000h ; flip ID bit in EFLAGS
push eax ; save new EFLAGS value on stack
popfd ; replace current EFLAGS value
pushfd ; get new EFLAGS
pop eax ; store new EFLAGS in EAX
xor eax, ecx ; can't toggle ID bit,
je end_cpu_type ; processor=80486
Listing Five
mov _cpuid_flag, 1 ; flag indicating use of CPUID inst.
push ebx ; save registers
push esi
push edi
mov eax, 0 ; set up for CPUID instruction
CPU_ID ; get and save vendor ID
mov dword ptr _vendor_id, ebx
mov dword ptr _vendor_id[+4], edx
mov dword ptr _vendor_id[+8], ecx
mov si, ds
mov es, si
mov si, offset _vendor_id
mov di, offset intel_id
mov cx, 12 ; should be length intel_id
cld ; set direction flag
repe cmpsb ; compare vendor ID to "GenuineIntel"
jne end_cpuid_type ; if not equal, not an Intel processor
mov _intel_CPU, 1 ; indicate an Intel processor
cmp eax, 1 ; make sure 1 is valid input for CPUID
jl end_cpuid_type ; if not, jump to end
; <-
mov eax, 1
CPU_ID ; get family/model/stepping/features
mov _cpu_signature, eax
mov _features_ebx, ebx
mov _features_edx, edx
mov _features_ecx, ecx
shr eax, 8 ; isolate family
and eax, 0fh
mov _cpu_type, al ; set _cpu_type with family