Tovey is a senior engineer for Intel's Development Tools Operation and can be contacted at 5200 NE Elam Young Parkway, Hillsboro, OR 97124.
Since the introduction of the 80386 and 80486, software development based on them has been done in the PC platform arena. While PC manufacturers have made good use of many of the hardware enhancements incorporated in these CPUs, the constraint of backward compatibility to the original PC and its real-mode operating system has limited use of numerous architectural features that can provide a powerful, protected-mode programming environment.
Although application software writers have only recently begun to develop code that is making the PC world aware of the advantages of protected-mode programming, in embedded systems development these features have long been recognized and utilized to their fullest extent. In either case, writing and debugging protected-mode code to run on the 386/486 requires an understanding of the built-in architectural features that make protected mode possible. Both chips share features -- hidden registers, segment access rights and privilege levels, cached segment descriptor values and descriptor tables, multitasking, and paging -- that can make debugging even simple programs a daunting task. Fortunately, tools such as in-circuit emulators can make access to the CPU -- and these seemingly complicated features -- as easy as typing in a single command.
A typical in-circuit emulator has two main sections: The emulator chassis, containing circuit boards on which the trace buffer, breakpoint logic, and so on, are found; and the probe, containing a microprocessor (much like the one being emulated) and some support logic. The emulator is installed by replacing the CPU in the target system with the emulation probe. Software controlling the functions of the emulator runs on a host computer, communicating with the emulator chassis via a parallel or serial link. The emulator probe dimensions must fit comfortably within the target system. Beyond that, the only significant special provision needed to support an emulator is a host computer (typically a PC). Commands entered at the host computer are transferred to the emulator chassis, and from there to the probe. The CPU on the probe matches or surpasses the capability of the CPU on the target, so performance during emulation is the same as performance during normal operation; timings, interrupt latencies, software functions, and so on, remain the same. The difference is that, via the host computer, you control operation of the emulation probe and by extension, the target system. That is, the target can be made to either run at full speed or execute one assembly language instruction at a time.
The emulator also has access to registers and data structures hidden inside the CPU. Manipulating these hidden features without an emulator would require elaborate procedures, and even they wouldn't provide full access. The examples in this article are based on debug sessions conducted using an Intel ICE-386 DX. Similar features are available on ICEs from Intel and other in-circuit emulator vendors.
Generating valid protected-mode data structures such as descriptors, descriptor tables, and tasks is normally done by the system integrator -- the person who ties together the disparate parts of the programs that make up the overall protected-mode environment. It's up to the system integrator to keep a list of all the various segments and their access rights, and the tasks and their associated task data structures: the task state segment and local descriptor table, the interrupt handlers and interrupt descriptor table, and the global descriptor table (where just about everything else is defined). Once these different protected-mode data structures have been built, it's the system integrator's job to load the program to the target system memory and try executing code.
Stepping into Protected Mode
Often, the first difficulty encountered by the system integrator occurs when stepping through the code that transfers the CPU from real to protected mode. The code itself is not terribly complex, just a number of register loads. However, the values placed in the registers set up the initial conditions for protected-mode operation, and can easily be totally wrong, generating immediate protection exceptions. Perhaps even worse, the register values can be subtly incorrect, so that similar-type segments are accessed in error, generating an exception only after damage has been done.
Example 1 shows 386 code that places the CPU into protected mode. Once the Global Descriptor Table (GDT) register has been loaded to point to the area of memory where the GDT will reside, and the Protection Enabled (PE) bit in the CR0 register has been set, the CPU is in protected mode. The next instructions load up the various segment registers with protected-mode values. To answer the questions that arise at this point, you single-step through the code, checking appropriate memory and register values as you go. Unfortunately, without an emulator it is almost impossible to single-step this code without running into problems.
Example 1: 386 code that places the CPU into protected mode
        lgdt  pword ptr gdt_reg_values  ; Load global descriptor table register
        mov   eax,cr0                   ; Set Protection Enabled bit to go into
        or    eax,1                     ; protected mode
        mov   cr0,eax
        jmp   next                      ; Flush prefetch queue to get rid of
                                        ; instructions decoded in real mode
next:   mov   bx,d_seg_selector         ; Initialize data selectors with appropriate
        mov   ds,bx                     ; values - here, we see FLAT model
        mov   es,bx                     ; initialization
        mov   fs,bx
        mov   gs,bx
        mov   ss,bx
pejump: jmp   full_prot_code            ; FAR jump, loads CS register with protected
                                        ; mode value and branches to full protected
                                        ; mode code
Debuggers create the effect of single-stepping by setting the Trap Flag (TF) bit in the EFLAGS register of the CPU. With the TF bit set, starting execution causes the chip to execute one instruction, then automatically generate an interrupt. The interrupt clears the TF bit while the interrupt code places the debugger back in command mode, ready for another single-step. Debuggers typically start executing application code by setting up a return address on the stack which points to the instruction where execution is to begin, then performing a RET instruction. This works quite well, except when there is a transition between real and protected mode. Then the debugger starts out in real mode, the application switches to protected mode, and suddenly the debugger gets confused.
An emulator lets you single-step through this type of code, and gives you access to the important hidden values, allowing you to answer the following questions:
1. Is there a problem because the GDT register hasn't been correctly loaded and is pointing somewhere other than at the actual GDT space? It's simple enough to use an emulator to display the contents of the GDT register; see Example 2(a). The values returned by the emulator show the base address of the GDT set at physical address 11000, and a GDT limit of 77H. (The default numeric base of the emulator is hex.) The limit is the offset of the last valid byte, so the table is 78H (120) bytes long. At 8 bytes per descriptor, starting at offset 0 from the beginning of the GDT, there must be 15 descriptors in the table.
Example 2: (a) Using an emulator to display the contents of the GDT register; (b) displaying the address of the variable holding the initial GDT register values; (c) displaying the contents of the variable.
(a) hlt> gdtbas            /* Display the base field of the GDT register */
    11000
    hlt> gdtlim            /* Display the limit field of the GDT register */
    77
(b) hlt> &gdt_reg_values   /* Display address of variable. */
    some-address-here      /* The address would be displayed in */
                           /* virtual format, i.e., seg:offset or */
                           /* ldt:seg:offset, depending on whether */
                           /* code is in real or protected mode */
(c) hlt> byte &gdt_reg_values L length 6
    0ffff0168L 77 00 00 10 01 00
If the values in the GDT register don't make sense, it may be that the program is loading them from the wrong memory location, or that the correct location is being accessed but the values at the location are incorrect. In either case, the emulator can help prove the accuracy of the actions being taken.
Displaying the address of the variable holding the initial GDT register values is simple; see Example 2(b). Displaying the contents of the variable is almost as simple; see Example 2(c). Here, the BYTE command is used to display 6 bytes of memory starting at the linear (L) address of the variable. The response displays the address as well. (Bytes are displayed in Little Endian format.)
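The check behind Example 2 can be sketched in a few lines of Python. This is only an illustration and not part of the emulator: the names decode_gdtr_image and descriptor_count are hypothetical, and the input bytes are the six shown in Example 2(c). It assumes the usual LGDT operand layout, a 16-bit limit followed by a 32-bit base, both stored low byte first.

```python
import struct


def decode_gdtr_image(raw: bytes):
    """Split the 6-byte LGDT operand into (limit, base).

    The image holds a 16-bit limit followed by a 32-bit base address,
    both stored least significant byte first.
    """
    if len(raw) != 6:
        raise ValueError("LGDT operand must be exactly 6 bytes")
    limit, base = struct.unpack("<HI", raw)
    return limit, base


def descriptor_count(limit: int) -> int:
    """Number of whole 8-byte descriptors covered by a table limit.

    The limit is the offset of the last valid byte, so the table
    is limit + 1 bytes long.
    """
    return (limit + 1) // 8


# Bytes as displayed by the BYTE command in Example 2(c).
image = bytes([0x77, 0x00, 0x00, 0x10, 0x01, 0x00])
limit, base = decode_gdtr_image(image)
print(f"base={base:08X} limit={limit:04X} descriptors={descriptor_count(limit)}")
# prints: base=00011000 limit=0077 descriptors=15
```

The decoded values match the GDTBAS and GDTLIM responses in Example 2(a), which is what you would expect if the program is loading the GDT register from the right variable.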
2. Is the memory location d_seg_selector holding the correct value? As we've seen, it is simple to display the contents of memory. If the program has been loaded with full symbolic debug information, the emulator can display the contents of a variable directly; see Example 3.
Example 3: If the program has been loaded with full symbolic debug information, the emulator can display the contents of a variable directly.
hlt> d_seg_selector
003B
3. Have the correct segment descriptor values been read from the GDT and cached into the segment registers? If program execution gets beyond protected-mode initialization and begins generating exceptions on data or stack accesses, the segment registers may have been loaded with invalid values. The code in Example 1 shows that all the segment registers (except CS) have been loaded with the same selector value. This is typical of a program that runs in FLAT mode and means we only need to check one descriptor to find out whether or not the selector is valid.
We already know the value of d_seg_selector is 3BH. We can decode this by hand: 003BH = 0000 0000 0011 1011, where: bits 0 and 1 (the RPL, or Requested Privilege Level) are three, the least privileged; bit 2 (the TI, or Table Indicator bit) is zero, showing that the descriptor is in the GDT; and bits 3 through 15 (the table index) show that the descriptor is in slot #7.
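The same hand decoding can be written as a short Python sketch. The function name decode_selector is hypothetical and chosen only for illustration; the bit positions are the ones just described.

```python
def decode_selector(selector: int):
    """Split a 16-bit selector into (index, table, rpl)."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    rpl = selector & 0x3                        # bits 0-1: Requested Privilege Level
    table = "LDT" if selector & 0x4 else "GDT"  # bit 2: Table Indicator
    index = selector >> 3                       # bits 3-15: slot in the table
    return index, table, rpl


print(decode_selector(0x3B))  # prints: (7, 'GDT', 3)
print(decode_selector(0x38))  # prints: (7, 'GDT', 0)
```

Note that 3BH and 38H differ only in the RPL field, so both selectors name slot #7 of the GDT.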
Now that we know selector 3BH refers to descriptor #7 in the GDT, and we know the base address of the GDT, we can display the appropriate memory locations and decode the descriptor contents. While an example of this work would be instructive, it is much easier to use emulator descriptor table commands to do the work for us.
In Example 4(a), the DT command decodes the selector, reads the appropriate descriptor, and decodes its value. The first line of the response shows us that the selector 3BH refers to descriptor #7 in the GDT, and also shows the 8 bytes of data contained in the descriptor. The second line decodes the data and gives information about the segment: access rights, privilege level, and so on. See Table 1 for details.
Example 4: (a) The DT command decodes the selector, reads the appropriate descriptor, and decodes its value; (b) and (c) if the segment limit appears too small, the DT command can be used to change it, either by assigning a new value or by adding to the current one.
(a) hlt> dt (3BH)
    GDT(7T) 0040F30116C0017B
    DSEG BASE=000116C0 LIMIT=0017B DPL=3 P=1 G=0 V=0 B=1 E=0 W=1 A=1
(b) hlt> dt(38).limit = 27B
(c) hlt> dt(38).limit = dt(38).limit + 100
Table 1: Meaning of the response to the DT command shown in Example 4
Response  Meaning
-------------------------------------------------------------------------
DSEG      The segment is a data segment.
BASE      The base address of the segment is 116C0H.
LIMIT     The segment limit is 17BH. This is the last valid offset, so the
          segment is 17CH bytes long.
DPL       The descriptor privilege level is 3, least privileged.
P         The present bit shows that the descriptor is marked present in
          memory.
G         The granularity bit is 0, so the 20-bit limit field is counted in
          bytes (maximum length 1 Mbyte). When G=1, the limit is counted in
          4-Kbyte units.
V         The V bit is user-definable and may be used for any purpose. A
          typical use is marking a segment "locked," so it's never swapped
          out of memory.
B         The B bit determines the default stack pointer register used for
          stack accesses (SP or ESP); for stack segments only.
E         The E bit determines whether a segment is a data segment (0) or
          code segment (1).
W         The W bit determines if a segment is read only (0) or read/write
          (1).
A         The A (accessed) bit is set by the CPU when the segment's selector
          is loaded into a segment register.
With the information displayed by the DT command, it is simple to determine whether appropriate descriptor values are present. If on examination, a descriptor appears to hold incorrect values, the DT command can also be used to change descriptor field values. For instance, if the segment limit appears too small, the DT command can be used to change it. See Example 4(b) or 4(c).
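For readers who want to see the decoding that the DT command performs, here is a minimal Python sketch. It is an illustration under stated assumptions, not the emulator's implementation: the function name decode_segment_descriptor is hypothetical, the descriptor is passed as one 64-bit integer written the way the emulator prints it (highest-addressed byte first), and only the data-segment field names of Table 1 are used.

```python
def decode_segment_descriptor(raw: int) -> dict:
    """Decode a 64-bit code/data segment descriptor into Table 1 fields."""
    access = (raw >> 40) & 0xFF   # access rights byte
    flags = (raw >> 52) & 0xF     # G, B, reserved, V
    return {
        # Base: bits 16-39 hold the low 24 bits, bits 56-63 the high 8 bits.
        "BASE": ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24),
        # Limit: bits 0-15 hold the low 16 bits, bits 48-51 the high 4 bits.
        "LIMIT": (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16),
        "DPL": (access >> 5) & 0x3,
        "P": (access >> 7) & 0x1,
        "G": (flags >> 3) & 0x1,
        "V": flags & 0x1,
        "B": (flags >> 2) & 0x1,
        "E": (access >> 3) & 0x1,
        "W": (access >> 1) & 0x1,
        "A": access & 0x1,
    }


# The descriptor bytes displayed in Example 4(a).
fields = decode_segment_descriptor(0x0040F30116C0017B)
print(" ".join(f"{name}={value:X}" for name, value in fields.items()))
```

Apart from leading zeros, the printed fields agree with the second line of the DT response in Example 4(a).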
4. Is the FAR jump really branching to the right place? This question can be answered by simply displaying the code at the branch address, as shown in Example 5.
Example 5: Using the ASM command to display memory at the specified address
hlt> asm full_prot_code length 5
; :TASK_1.PROC_A.full_prot_code
0098:0014:00000000H  1E        PUSH DS
0098:0014:00000001H  66B9F900  MOV  CX,0F9H
0098:0014:00000005H  8ED9      MOV  DS,CX
0098:0014:00000007H  8EC1      MOV  ES,CX
0098:0014:00000009H  55        PUSH EBP
Here we see the ASM command, which displays memory at the specified address as assembly language instructions. The emulator responds by displaying the fully qualified address of the label full_prot_code, including the module and procedure in which it is defined. The instructions follow, each with its full logical address (LDT_selector:segment_selector:offset) and opcode/operands.
Descriptor Table Access/Display
Just as emulators make it easy to access and change the contents of single descriptors, they also make it easy to display descriptor tables. This is handy when working with protected-mode programs that include numerous descriptors and multiple local descriptor tables.
The GDT command displays the contents of the global descriptor table as pointed to by the GDT register. For brevity, Example 6 shows only excerpts from the table.
Example 6: The GDT command displays the contents of the global descriptor table as pointed to by the GDT register.
hlt> gdt                   /* Display the contents of the GDT */
GDT(1T)  00009201100000FF
    DSEG BASE=00011000 LIMIT=000FF DPL=0 P=1 G=0 V=0 B=0 E=0 W=1 A=0
GDT(17T) 00409A01146C0055
    ESEG BASE=0001146C LIMIT=00055 DPL=0 P=1 G=0 V=0 D=1 C=0 R=1 A=0
GDT(19T) 0000820112000027
    DTABL BASE=00011200 LIMIT=00027 DPL=0 P=1 G=0 V=0
GDT(26T) 0000EC0000150000
    CALLG3 SSEL=0015 SOFF=00000000 DPL=3 P=1 WCO=00
The example includes descriptors for a data segment (DSEG) and an executable code segment (ESEG), a descriptor for a segment containing a local descriptor table (DTABL), and a descriptor containing a call gate to 32-bit code (CALLG3). Descriptor types not shown in the example include available and busy task state segments (ATSSD3, BTSSD3), task gates (TASKG), and so on. The bits in the ESEG display that differ from those of the data segment in Example 6 (described in Table 1) are listed in Table 2.
Table 2: The bits seen in the ESEG display that are different from those described in the data segment in Example 6

Response  Meaning
-------------------------------------------------------------------------
D         The D bit determines the default code type, either 16-bit code or
          32-bit code. The code type determines the default register size
          and addressing capabilities (that is, 16- or 32-bit index
          registers).
C         The C bit determines whether or not a code segment is conforming.
R         The R bit determines whether the code segment is execute only or
          execute/read.
In a similar fashion, it's possible to display the Interrupt Descriptor Table (IDT). This is useful when debugging protected-mode interrupts, where two levels of indirection can make it difficult to access interrupt handler code. (Interrupt descriptors contain selectors for code segments which in turn contain interrupt handler code.) Again, for brevity, Example 7 shows only a portion of the full display. Field names within the interrupt descriptors are listed in Table 3.
Example 7: Displaying the Interrupt Descriptor Table (IDT). This can be useful when debugging protected-mode interrupts.
hlt> idt
IDT(0T) FFFF8E00001803A4
    INTG3 SSEL=0018 SOFF=FFFF03A4 DPL=0 P=1
IDT(1T) FFFF8E00001803A8
    INTG3 SSEL=0018 SOFF=FFFF03A8 DPL=0 P=1
IDT(2T) FFFF8E00001803AC
    INTG3 SSEL=0018 SOFF=FFFF03AC DPL=0 P=1
Table 3: Field names within the interrupt descriptors; see Example 7.
Response  Meaning
-------------------------------------------------------------------------
INTG3     Defines the descriptor as an Intel386 interrupt gate.
SSEL      This field contains the code segment selector for the interrupt
          handler.
SOFF      This field contains the offset to the interrupt handler.
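Gate descriptors pack their fields differently from segment descriptors, which is part of what makes the two levels of indirection awkward to follow by hand. The following Python sketch shows the unpacking, using the first IDT entry from Example 7. The function name decode_gate and the GATE_NAMES table are hypothetical; the table maps only the gate types named in this article and falls back to a generic label for anything else.

```python
# Gate types named in this article, keyed by the 4-bit descriptor type.
GATE_NAMES = {0x5: "TASKG", 0xC: "CALLG3", 0xE: "INTG3"}


def decode_gate(raw: int) -> dict:
    """Decode a 64-bit gate descriptor printed highest-addressed byte first."""
    access = (raw >> 40) & 0xFF
    gate_type = access & 0xF
    return {
        "KIND": GATE_NAMES.get(gate_type, f"TYPE{gate_type:X}"),
        "SSEL": (raw >> 16) & 0xFFFF,   # selector of the handler's code segment
        # Offset: bits 0-15 hold the low half, bits 48-63 the high half.
        "SOFF": (raw & 0xFFFF) | (((raw >> 48) & 0xFFFF) << 16),
        "DPL": (access >> 5) & 0x3,
        "P": (access >> 7) & 0x1,
    }


gate = decode_gate(0xFFFF8E00001803A4)   # IDT(0T) from Example 7
print(f"{gate['KIND']} SSEL={gate['SSEL']:04X} SOFF={gate['SOFF']:08X} "
      f"DPL={gate['DPL']} P={gate['P']}")
# prints: INTG3 SSEL=0018 SOFF=FFFF03A4 DPL=0 P=1
```

The selector in SSEL can then be decoded like any other selector to find the code segment that holds the handler, and SOFF gives the handler's offset within that segment.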
It is also possible to display the current local descriptor table, as described by the LDT register, using the LDT command. All three descriptor table commands can display not only entire descriptor tables but also single descriptors. Thus, if you already know the descriptor number, you can display its contents. See Example 8(a).
Example 8: (a) Displaying a single descriptor by its slot number, here slot #19 of the GDT; (b) writing the limit field of the descriptor in slot #3 of the LDT, whose descriptor is in slot #7 of the GDT.
(a) hlt> gdt(19t)
    GDT(19T) 0000820112000027
    DTABL BASE=00011200 LIMIT=00027 DPL=0 P=1 G=0 V=0
(b) hlt> gdt(7).ldt(3).limit = 12345H
More importantly, if you are working with multitasking programs, there are multiple LDTs. At times, you might wish to display or change the contents of a descriptor in an LDT not currently in use. An LDT is really nothing more than a data segment containing special operating system-related data: the descriptors for the segments in a given task. The LDT is a special segment, so its descriptor resides in the GDT. Example 8(b) shows how to write the limit field of the descriptor in slot #3 of the LDT whose descriptor is in slot #7 of the GDT.
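The address arithmetic behind a nested reference of this kind is straightforward: once the LDT's own descriptor has supplied a base and limit, a slot number is just an 8-byte stride into that segment. The Python sketch below illustrates the calculation. The function name descriptor_address is hypothetical, and because the article does not give the base of the LDT described by GDT slot #7, the sketch borrows the LDT shown in Example 8(a) (base 11200H, limit 27H) purely as sample data.

```python
DESCRIPTOR_SIZE = 8  # every descriptor occupies 8 bytes


def descriptor_address(table_base: int, table_limit: int, slot: int) -> int:
    """Return the linear address of a descriptor slot within a table.

    Raises IndexError if any byte of the slot lies beyond the table
    limit (the offset of the last valid byte in the table).
    """
    first = slot * DESCRIPTOR_SIZE
    last = first + DESCRIPTOR_SIZE - 1
    if slot < 0 or last > table_limit:
        raise IndexError(f"slot {slot} lies outside the table")
    return table_base + first


# Sample data: the LDT described by GDT(19T) in Example 8(a).
ldt_base, ldt_limit = 0x11200, 0x27
print(f"{descriptor_address(ldt_base, ldt_limit, 3):08X}")  # prints: 00011218
```

With a limit of 27H this LDT holds five descriptors (slots 0 through 4), so a reference to slot 5 would fall outside the table; the CPU applies the same kind of limit check when a selector is loaded.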
Multitasking
Protected-mode programs meant to run under multitasking operating systems are not difficult to write. Each program is written separately, as if it were to run on its own. Even operating system code, which must handle task switching, is not that difficult to write because of on-chip hardware that assists in the task switching process. However, debugging multitasking protected-mode systems can be a difficult job because all the on-chip hardware features that aid in task switching can obscure the different operations and functions that make up the task switching process.
As shown in many of the previous examples, an emulator can be used to display and access many of the more difficult hardware and software aspects of protected-mode programming. Task switching not only adds multiple LDTs, but also adds Task State Segments (TSSs) for each task. Each TSS is a special segment used to save/restore the processor registers when entering or leaving a task. In this way, the processor can restore the TSS values, reentering a task with the registers set up just as they were before the task was last exited.
An emulator can be used to display/change the contents of a TSS; see Example 9(a). The TSS command shows us the current TSS referred to by the task register. We see that the current task is a 386 task, rather than a 286 task. We also see the stack segment register and associated stack pointer register for each of the four privilege levels. This information makes it much easier to debug stack-related problems when multiple privilege levels are in use. We also see the selector for the current LDT and the "back link" field, which contains a return TSS selector value if the current task has been called by a previous task.
Example 9: (a) Using an in-circuit emulator to display/change the contents of a TSS; (b) displaying the contents of any task using the task's TSS selector; (c) displaying the contents of the privilege level #2 stack pointer for the TSS; (d) locating the contents of a segment in a noncurrent task by knowing the selector to the task's TSS; (e) using the DT command to display the task's LDT descriptor; (f) displaying the LDT.
(a) hlt> tss
    386 TSS
    SS0= 00f0  ESP0= 00000101
    SS1= 001D  ESP1= 00000101
    SS2= 0000  ESP2= 00000000
    EAX= 00000000  EBX= 00000000  ECX= 00000000  EDX= 00000000
    DS= 00FB  ES= 00fb  FS= 00fb  GS= 00fb
    ESI= 00000000  EDI= 00000000
    SS= 001d  CS= 0025  ESP= 00000101  EIP= 00000000  EBP= 00000101
    LDTR=00b0  LINK= 0068  EFLAGS= 00000000  CR3= 00000000
(b) hlt> tss(50)
(c) hlt> tss(50).esp2
(d) hlt> tss(50).ldtr
    0068
(e) hlt> dt(68)
    GDT(13T) 0000820112000027
    DTABL BASE=00011200 LIMIT=00027 DPL=0 P=1 G=0 V=0
(f) hlt> gdt(13t).ldt      /* For brevity, the LDT will not be shown */
It is also possible to display the contents of any task state segment, as long as you know the task's TSS selector. The emulator simply looks up the TSS's descriptor in the GDT and displays the contents of the corresponding TSS. Example 9(b) shows the command that displays the contents of a TSS whose descriptor has a selector of 50H.
Each of the fields within the TSS is accessible via a field name. For example, it's possible to display the contents of the privilege level #2 stack pointer for the TSS in Example 9(b) by way of Example 9(c). This type of command can be extremely useful if you wish to locate the contents of a segment in a noncurrent task, but you know neither the selector for the segment nor its base address. As long as you know the selector to the task's TSS, you can first display the LDT register field of the task; see Example 9(d). (If you don't know the task's TSS selector, you can always display the GDT and experiment with the different TSSs found there. The GDT display also shows the different LDT descriptors, from which you can choose and experiment.)
Then you can pass that selector to the DT command to display the task's LDT descriptor; see Example 9(e). We now know the slot number of the task's LDT descriptor; with this information we can display the LDT; see Example 9(f).
The display of the LDT should allow you to choose the segment in which you are interested. You now know the base address of the segment, its limit, and whether or not it is present in memory. At this point, it is possible to use emulator commands to display memory within the segment of interest.
Hidden Register Access
I've already shown that an emulator allows display and change of the hidden fields within the GDT register. An emulator also allows the same access to other hidden registers and register fields. Example 10 shows emulator commands to display the contents of the LDT register, the task register, the IDT register, and various fields within the segment registers.
Example 10: Emulator commands to display the contents of the LDT register, task register, IDT register, and various other fields
hlt> ldtbas                /* Display base field of the current LDT */
00011200H
hlt> idtlim                /* Display limit field of current IDT */
00ffH
hlt> tr                    /* Display selector field of current TR */
0080H
hlt> dslim = dslim + 35H   /* Change limit of current data segment */
hlt> cs                    /* Display selector in CS register */
025H
hlt> csar                  /* Display the access rights bits as they */
0bbH                       /* appear in the current CS register */
Summary
The commands shown in this article highlight just a few emulator debug features. Emulators also provide many other features, such as breaking emulation after a specific or nonspecific task switch, displaying the contents of page tables and the page directory, single-stepping high-level code, mixed assembly and high-level code, and others too numerous to mention.
Having an emulator present is not a substitute for understanding the protected-mode nature of the 386/486. However, an emulator does make it possible to easily access just about every possible architectural feature, making debugging protected-mode, multitasking programs almost as easy as working with real-mode code.