Undocumented Corner

The Pentium Pro's undocumented Physical Address Extensions (PAE) let the processor address up to 64 GB of physical memory (36-bit address bus), and access page sizes of 2 MB.


July 01, 1996
URL:http://www.drdobbs.com/embedded-systems/undocumented-corner/184409926

July 1996: Undocumented Corner

Paging Extensions for the Pentium Pro Processor

Robert R. Collins

Robert is a design verification manager at Texas Instruments' Microprocessor Design Center. He can be reached at [email protected].


As early as June 1991, Intel circulated confidential documents describing the new features of their P5 processor, eventually known as the "Pentium Processor." Most of those new features would be shrouded in controversy, their details kept secret by Intel. But one of the most advanced features-Physical Address Extensions (PAE)-was entirely removed. PAE gave the processor the ability to address up to 64 GB of physical memory (36-bit address bus), and access page sizes of 2 MB. The larger physical-address space and the new 2-MB paging features were interrelated, as both were enabled by the same control bit. In all other operating modes, the normal 32-bit address space was in operation.

PAE would have been enabled with CR4, bit 5. When CR4.PAE=1 (CR4[5]=1), PAE (36-bit addressing) and large 2-MB pages would be accessible. When CR4-.PAE=0, A[35..32] would be forced to 0, regardless of what addresses could be generated in protected mode (when a descriptor pointing near 4 GB is combined with an offset that results in an address above 4 GB). Even when CR4.PAE=1, addresses above 4 GB would not be generated unless they were the result of a paging translation. The only means to access memory above 4 GB was through these extensions to page mode.

Before the Pentium went into production, PAE was removed. Even so, documented references to this feature still appear in the various Pentium manuals, and from other sources. In the Pentium Processor User's Manual Volume 1 (1993 edition), "Chapter 2: Overview," paragraph 8 mentions "extensions to the architecture which allow 2 Mbyte and 4 Mbyte page sizes." This reference was removed in the next edition of this manual. In the Pentium Processor User's Manual Volume 1 (all editions), Figures 3 and 4 show that the linear address is composed of four fields, one of which is called the "directory pointers." The directory pointers are unique to PAE. The Pentium Processor at iCOMP Index 735\90 MHz (Intel part number 241997 all revisions), Section 1.1, paragraph 5, also mentions 2-MB paging extensions. This same reference is present in other versions of this data sheet, using different part numbers and speed ratings (242323 all revisions). Ironically, this reference is absent in at least one analogous data sheet (241595). It is also worth mentioning that the Intel-designed in-circuit test probe also included software support for PAE.

The most significant and interesting details on PAE come from the Pentium Processor Family Developer's Manual, volume 3, and from an Intel Architecture Labs (IAL) CD-ROM (Special Edition: P6 Processor Software Developer CD April 1995). In my previous column, I discussed page-size extensions (PSE) on the Pentium and mentioned that setting some reserved bits in various page tables would cause a page fault when an access was made to that structure (see "Understanding Pentium's 4-MB Page Size Extensions," DDJ, May 1996). This reference was quoted from the Pentium Pro Processor Reference Manual, section 23.2.14.1:

A Page Fault exception occurs when a 1 is detected in any of the reserved bit positions of a page table entry, page directory entry, or page directory pointer during address translation by the Pentium processor.

I wrote that it might be tempting to associate the page-directory pointer as CR3 and that any such association would be incorrect. In reality, this is another reference to PAE that hasn't been removed from the Pentium documentation. As amusing as these references might be, the most expository ones are in a place that you might not have thought to look-the glossary. These glossary references virtually give away PAE detail:

Finding the glossary, however, may prove to be more difficult. The 1993 edition of Volume 3 (241430-001) is the only hard-copy edition that contains the glossary. The early-1994 edition (241430-002) mentions the glossary in the "Table of Contents," but it is missing from the manual. Ironically, the late-1994 electronic edition (241430-003, available on CD-ROM from Intel) contains the glossary, but its hard-copy counterpart does not. Finally, neither the latest electronic version nor hard-copy of Volume 3 (241430-004) contains the glossary.

Searching the IAL P6 CD-ROM turned up a near-perfect description of the PAE paging-translation mechanism, the only error being the association with 32-bit addressing and 2-MB page sizes. Embedded in a small, obscure file on this CD is the following information:

What is 36-Bit Addressing?

This is achieved by enabling CR4.PAE=1 (Page Addr Ext.). Page Translation occurs by Bits 30&31 pointing to an entry in a page dir ptr table (a new structure pointed to by CR3). Page directory pointer table points to page directory and bits 21-29 point to entry in page directory. Page directory entry points to beginning of page table with bits 12-20 pointing to entry in that table. That entry points to page frame with bits 0-11 pointing to offset in page. Page sizes are 2M when using 32-bit addresses. (Request OS Writer's Guide if more information is required.)

We can deduce from this reference that PAE, like PSE, supports two page sizes-4-KB pages and 2-MB pages. Bits 0-11 of the linear address point to an offset in the page frame. This indicates 12 bits of addressability in this mode (212), or a 4-KB page size. When the PS bit in the page directory equals 1, bits 0-11 of the linear address are combined with bits 12-20 to allow 21 bits of addressability (221), or a 2-MB page size.

Other evidence of PAE may be found by examining the Pentium architecture itself. CR4[5] is marked reserved, and was to enable PAE; CPUID.flags[6] is marked reserved and was to indicate the existence of PAE; the model-specific register (MSR) TR8 is marked reserved and was to contain the upper-four address bits used for TLB testability.

PAE and 36-bit addressing are now implemented in the Pentium Pro Processor (P6). I managed to get a copy of the official Intel documentation just as the article was being finished. As a result, the description of PAE was deduced from these previously documented sources and from reverse engineering on a P6-based computer, not the P6 manuals.

How PAE Works

To support 36-bit addressing, it's necessary to make substantial changes to the paging mechanism. Thirty-two-bit linear addresses are still used, but they are translated to 36-bit physical addresses. Intel chose to use a three-tier paging mechanism to support PAE for 4-KB pages, and a two-tier mechanism for 2-MB pages. With PAE enabled, CR3 points to a small Page Directory Pointers Table (PDPT). Each PDPT entry references a separate page directory. Each page directory points to a page table (for 4-KB pages), or directly to the page frame (for 2-MB pages). Table 1 provides a detailed description of all of the CPU structures associated with page translations while PAE is enabled. For comparative purposes, Table 2 gives a detailed description of all of the CPU structures associated with page translations while PSE is enabled (4-MB pages). The description of these fields may be found in the appropriate Pentium documentation. Table 3 describes the fields associated with Table 1 and Table 2, which may need further clarification.

For the most part, the paging-translation process works the same as it always has. Linear addresses are converted to physical addresses through a series of table lookups. The most significant changes in the PAE implementation are the extra table lookup (extra level of indirection) and the changes in the paging structures themselves. When PAE is enabled, an extra level of indirection is added (the PDPT lookup). CR3 points to a 4-entry PDPT, with entries that point to the base of separate page directories. This behavior is different from all previous implementations, where CR3 pointed to the base of a single page directory. The size of the paging structures have doubled to accommodate the extra 4 bits in the base address field, but otherwise, they are virtually identical to their predecessors. The significance of these changes is twofold:

The paging translation process is virtually identical to the prior implementation. The translation process is identical for 2-MB and 4-KB pages, except the page table reference is omitted: When a 32-bit linear address is presented to the paging unit, it is broken down into various fields:

In all cases, the base addresses contained in these tables (PDPT, PDE, and PTE) are physical addresses and are not subject to any paging translation. Figure 1 shows the page-address translation for 2-MB and 4-KB pages when PAE is enabled.

Caveats of PAE

New features are subject to caveats and bugs, and PAE is no exception. In this case, however, the caveats are understandable, and in some cases desirable:

Conclusions

PAE may have been a feature that had come before its time. In 1991, when it was secretly introduced to developers, very few computers needed anything even approaching 4 GB of physical memory. For whatever reason, Intel decided to remove this feature before the Pentium went into production. As with the Appendix H features from the Pentium Processor, not all documentation references to this feature were removed. Tracking down all of these references allowed me to write source code to test this feature months before Intel released Pentium Pro documentation; see the file PAE.ASM, available electronically, and at http:// www.x86.org.

In many ways, this new paging mode is more desirable than 4-MB pages. By allowing page sizes of 4 KB and 2 MB, the operating system has better control over paging large data structures. Page sizes of 4 KB are obviously too small for these large structures, and 4-MB pages are often times too large. Therefore 2-MB pages are a good compromise for efficiently paging large data structures. Unfortunately, since PAE wasn't introduced five years ago, its data structures aren't backward compatible. It will be much harder for operating-system developers to justify using this new feature, since it is one generation removed from other new paging features on the Pentium.

Table 3: Description of paging structure fields.

      RSV      Reserved. If set (RSV=1), may cause a page fault. Setting this bit only 
               causes a page fault during page translation. If the referenced page entry 
               is in the TLB, then setting this bit and referencing the page will not cause a
               page fault.  If the entry is not in the TLB, or gets flushed from the TLB, 
               then the next reference to this structure will cause a page fault. The page 
               fault error code on the stack will have the RSV bit set (bit-3). If a reserved 
               bit in the PDPT is set, a GPF is generated when this struct is accessed. 
               This only occurs when CR3 is loaded, thereby implicitly loading the PDPT.

      0        Reserved, does not cause a page fault. When this bit is set, a page fault 
               does not occur, even though this bit is reserved.

      PS*      This bit is always set=1. When set=1, this page directory entry points to a 
               2-MB or 4-MB page.

      PS**     This bit is always clear=0. When clear, this page directory entry points to 
               a page table.

Table 4: PAE/PSE page-size precedence.

      CR4.PAE      CR4.PSE      PDE.PS      Page Size      # Address Bits

      0            0            0            4 KB            32

      0            0            1            4 KB            32

      0            1            0            4 KB            32
     
      0            1            1            4 MB            32

      1            0            0            4 KB            36

      1            0            1            2 MB            36

      1            1            0            4 KB            36

      1            1            1            2 MB            36



Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.