Examining Instant-C


JUN90: EXAMINING INSTANT-C

Andrew Schulman is a software engineer who works on networking software for CD-ROM. He is a contributing editor of DDJ, and a coauthor of the book Extending DOS (edited by Ray Duncan, Addison-Wesley, May 1990), from which this article is adapted. Andrew can be reached at 32 Andrew St., Cambridge, MA 02139.


Instant-C is an interactive C compiler and integrated development environment from Rational Systems, based on Rational's DOS/16M, a protected-mode DOS extender for Intel 80286- and 80386-based PC compatibles. DOS/16M is also used in such products as Lotus 1-2-3, Release 3, AutoCAD, Release 10.0, and the DOS version of the Glockenspiel C++ compiler.

Instant-C (IC) provides interactive execution, linking, editing, and debugging of C code. In addition to loading .C files, C expressions can be typed in at IC's # prompt for immediate evaluation.

Figure 1 shows a sample session with IC. First, a buffer is allocated with malloc( ). This buffer happens to reside in extended memory, as shown by a call to the DOS/16M function D16AbsAddress( ). As with any product based on a DOS extender such as DOS/16M, however, the distinction between extended and conventional memory is largely unimportant: In protected mode, it's all just memory.

Figure 1: A sample session with Instant-C

  # char *p;
  # #include <malloc.h>
  MALLOC.H included
  # p = malloc(10240)
       address 03B8:01A6:" "
  # #include "dos16.h"
  DOS16.H included
  # D16AbsAddress(p)
      2940246 (0x2CDD56)
  # #include <dos.h>
  DOS.H included
  # int handle;
  # _dos_open("\\msc\\inc\\dos.h", 0, &handle)
     0
  # handle
     7
  # unsigned bytes;
  # _dos_read(handle, p, 10240, &bytes)
     0
  # bytes
     5917 (0x171D)
  # _dos_close(handle)
     0
  # printf("%s\n", p)    //display file

Next, the low-level Microsoft C _dos_open( ) function is used to open a file, and _dos_read( ) is used to read the file into our buffer. We could just as easily use the C standard library functions fopen( ) and fread( ), but using these DOS-specific routines in conjunction with the extended-memory buffer shows how a DOS extender transparently manages the interface between MS-DOS and protected mode.
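The fopen( )/fread( ) alternative mentioned above might look like the following sketch. The helper name read_into_buffer( ) is made up for illustration; only the standard C library is assumed, so nothing here depends on DOS/16M or on the Microsoft-specific _dos_ routines.

```c
#include <stdio.h>

/* Hypothetical helper (not part of IC or DOS/16M): read up to cap bytes
   of a file into buf. Returns the number of bytes read, or -1 if the
   file cannot be opened. */
long read_into_buffer(const char *path, char *buf, size_t cap)
{
    FILE *f = fopen(path, "rb");    /* binary mode: no newline translation */
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(buf, 1, cap, f);      /* may return fewer than cap bytes */
    fclose(f);
    return (long) n;
}
```

With the 10,240-byte buffer from Figure 1, the call would be read_into_buffer("\\msc\\inc\\dos.h", p, 10240), and the return value plays the role of the variable bytes.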

Note how C statements, declarations, and preprocessor statements can be freely mixed at the # prompt, somewhat like mixing statements and declarations in C++. In this "immediate mode," leaving the semicolon off a statement tells IC to print its value. This is one way that interactive C differs from "normal" C. In Figure 1, we displayed the value of the variable bytes simply by typing its name.

IC provides a command language in the form of preprocessor statements. To compile a file FOO.C, for example, you could type #load foo.c at the IC prompt. The command language can also be used under program control:

  
if (x > 1)  
	_interpret("#load foo.c");

Working with a "quick" environment raises the issue of compatibility with production compilers. IC has its own standard library, but you can use the Microsoft C library instead, for example: IC comes with scripts that load MSC 5.1's real-mode large-model LLIBCE.LIB, load IC-supplied .LIB modules to replace the few Microsoft functions that won't work in protected mode, #include the Microsoft header files into IC, and then write out a new, large-model, MSC-compatible IC. The new executable not only runs your C programs in protected mode under MS-DOS: It executes the Microsoft C library in protected mode as well. It seems like quite an accomplishment to load real-mode object code and execute it in protected mode, but this is standard procedure for DOS extenders.

IC gives C the interactive style of languages such as Forth and Lisp. Contrary to the stereotype of an interpreter, IC uses native object code. In fact, IC can dynamically load and link .OBJ and .LIB files, and can write out stand-alone .EXE files. Such stand-alone executables include a built-in protected-mode DOS extender.

The two major benefits of protected mode -- memory protection and a large address space -- mesh with the needs of a C development environment. The large address space (up to 16 Mbytes of memory) means that even very large C programs can be developed interactively. Hardware-based memory protection also helps insulate IC from bugs in user code and assists IC in finding bugs. An interpreter running in protected mode can off-load some of its type-checking onto the CPU.

IC is not only a product built using a DOS extender, it is an example of why "EXTDOS" is necessary in the first place. IC has been in existence since 1984. As more and more features were added to the product, it began to strain against the artificial 640K "Berlin Wall" of real-mode MS-DOS. Rational Systems developed DOS/16M for IC to cope with its expanding features and resulting expanding memory consumption. Thus, DOS/16M is based on IC, as much as IC is based on DOS/16M. For a short time, Rational Systems marketed a separate protected-mode IC/16M alongside real-mode IC. In October 1989, with IC Version 4.0, Rational discontinued the real-mode version.

Using Protection

A protected-mode interpreter must allow the user to violate the CPU's protection model without causing the interpreter itself to be shut down. I have discussed this issue at length in my two-part article, "Stalking GP Faults" (DDJ, January 1990 and February 1990). In IC we can freely type protection violations at the # prompt, as shown in Figure 2, because IC installs its own general-protection violation (GP fault) handler.

Figure 2: Detecting a protection violation

  # char *s;    //oops: forgot to initialize
  # atoi(s)
    ## 492: Invalid address 0AA8:4582 at __CATOX+000E
  # #reset

In addition to helping find bugs during development, the hardware-based memory protection of the Intel processors also can be put to work in the deliverable version of a product.

For example, functions often perform their own range checking. Each time the function is called, its parameters are checked against the size of the target object. But because the hardware does range and type checking in protected mode anyway, and because we pay a performance penalty for this checking, we should get the hardware to do our checking as well.

This requires devoting a separate selector to each object for which you want hardware-assisted checking. To see why, let's overstep the bounds of an array and see whether the Intel processor detects an off-by-one fencepost error. In Figure 3, the pointer p points to a block of 2013 bytes, numbered 0 through 2012, so p[2013] clearly oversteps its bounds. If protected mode is all it's cracked up to be, the CPU should have complained, right? Why didn't it?

Figure 3: This off-by-one error doesn't violate protection

  # char *p;
  # p = malloc(2013)
     address 03B8:01A6: " "
  # p[2013] = 'x'
      'x' (0x78)

The reason is that malloc( ) and other high-level-language memory allocators suballocate out of pools of storage. They do not ask the operating system for memory each time you ask them for memory, nor would you want them to. From one segment, malloc( ) may allocate several different objects. While we think of the object p as containing 2013 bytes, the processor sees a considerably larger object: The block of memory malloc( ) received the last time it asked DOS for memory. What size is the object the CPU sees?

  
# D16SegLimit(p)         
   24575 (0x5FFF)
   

If this explanation is correct, trying to poke p[0x5FFF] ought to cause a GP fault:

  
# p[0x5fff] = 'x'           
   ## 492: Invalid address 0BF8:002A

Now, we still need a way to make the CPU see things our way. Because 80286 memory protection is based on segmentation, we must devote a separate selector to each object for which we want hardware-assisted checking. Notice I said "selector" and not "segment." We can continue to use malloc( ) to allocate objects, but when we want the CPU to know how big we think the object is, we provide an "alias" in the form of another selector that points at the same physical memory but whose limit is smaller. The segment limit indicates the highest legal offset within a block of memory and is checked by the CPU for each memory access.
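The limit check and the alias idea can be modeled in ordinary C. The sketch below is a software simulation only: the Segment and FarPtr types and the function names are invented for illustration, and nothing in it touches real descriptor tables. It mimics the rule the CPU applies (an offset is legal only if it does not exceed the segment limit) and shows why an alias with a smaller limit catches the off-by-one error that the larger pool segment lets through.

```c
/* Hypothetical model of one selector: a physical base plus a limit,
   where the limit is the highest legal offset (size minus 1). */
typedef struct {
    unsigned long base;
    unsigned long limit;
} Segment;

/* Hypothetical model of a far pointer: a segment plus an offset. */
typedef struct {
    const Segment *seg;
    unsigned long  off;
} FarPtr;

/* Physical address a pointer refers to (the role of D16AbsAddress). */
unsigned long abs_address(FarPtr p)
{
    return p.seg->base + p.off;
}

/* 1 if p[index] is within the segment limit, 0 where the CPU
   would raise a GP fault. */
int access_ok(FarPtr p, unsigned long index)
{
    return p.off + index <= p.seg->limit;
}

/* Describe the same memory with a tighter limit: the alias starts at
   abs_addr and covers exactly size bytes (the role of D16SegAbsolute). */
Segment alias_segment(unsigned long abs_addr, unsigned long size)
{
    Segment s;
    s.base  = abs_addr;
    s.limit = size - 1;
    return s;
}
```

Modeling p as offset 0x01A6 in a segment whose limit is 0x5FFF, access_ok( ) accepts index 2013; for an alias built with alias_segment(abs_address(p), 2013), the same index is rejected while index 2012 is still accepted.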

To create an alias q for the pointer p, where q's limit is equivalent to the array bounds, we can use two other DOS/16M functions, as shown in Figure 4. The code takes the physical address returned by D16AbsAddress( ), together with the limit we're imposing for access to this memory, and passes them to D16SegAbsolute( ), which constructs a protected-mode selector for the same absolute physical address but with a different limit.

Figure 4: Creating an alias with a shorter limit

  # char *q;
  # q = D16SegAbsolute(D16AbsAddress(p),
                                  2013);
      0C08:0000

In Figure 5, we verify that this worked. Attempting even to read from this invalid array index now causes a GP fault.

Figure 5: Now the off-by-one error does violate protection

  # q[2013]

    ## 492: Invalid address 0BF8:002A in command line
  # D16AbsAddress(p) == D16AbsAddress(q)
      1
  # D16SegLimit(p) == D16SegLimit(q)
      0
  # D16SegLimit(p)
      24575 (0x5FFF)
  # D16SegLimit(q)
      2012 (0x7DC)

So we can dispense with explicit bounds checking: The CPU will check for us. To control the error message displayed when a GP fault occurs, we could write our own INT 0D handler and install it using the DOS set-vector function (INT 21 AH=25). Thus, instead of littering error checking throughout our code, protected mode allows us to centralize it inside an interrupt handler. Errors can be handled after the fact, rather than up front. In a way, this resembles ON ERROR, one of the more powerful concepts in Basic (which got it from PL/I).

This meshes with the advice given by advocates of object-oriented programming: "If you are expecting a sermon telling you to improve your software's reliability by adding a lot of consistency checks, you are in for a few surprises. I suggest that one should usually check less.... 'Defensive programming' is a dangerous practice that defeats the very purpose it tries to achieve" (Bertrand Meyer, "Writing Correct Software," Dr. Dobb's Journal, December 1989).

By using protection, you may be able to make an application run faster in protected mode than under real mode, because a lot of error-checking and "paranoia" code can now be made unnecessary.

When you are finished with the pointer p, it is important not only to free(p) but also to release the alias in q. Don't use free( ) to release this selector, though: The C malloc( ) manager doesn't know anything about q, which is just an alias, a slot in a protected-mode descriptor table. We need to free this slot because the number of selectors available in protected mode is quite limited:

  
# free(p)   
# D16SegCancel(q)

In moving from real to protected mode, programmers may regret that segment arithmetic is so restricted. But the ability to create aliases, different views of the same block of physical memory, means that protected-mode selector manipulation is actually far more versatile than real-mode segment arithmetic.

The Intel 286 Protected-Mode Instructions

Transparency is a major goal of DOS extenders. But sometimes it is useful not to be so transparent. For example, DOS extender diagnostic programs and DOS extender utilities will generally be nonportable, hyper-aware that they are running in protected mode.

The 80286, and also the 286-compatible 386 and 486 chips, have a number of instructions that Intel provides primarily for use by protected-mode operating systems but which are also useful for utilities and diagnostic programs. Some of these are listed in Figure 6.

Figure 6: Some Intel protected-mode instructions

  LSL   (load segment limit) --size of a segment
  LAR   (load access rights) --access rights of segment
  VERR  (verify read) --can segment be peeked?
  VERW  (verify write) --can segment be poked?
  SGDT  (store GDT) --base address and size of GDT
  SIDT  (store IDT) --base addr and size of IDT
  SLDT  (store LDT) --selector to LDT

For example, in the last section we called D16SegLimit( ) to find the size of the segments pointed to by p and q. In operation (though not in implementation), D16SegLimit( ) corresponds to the LSL instruction, which takes a selector in the source operand and, if the selector is valid, returns its limit (size-1) in the destination operand. For example:

  lsl ax, [bp+6]

Similarly, the LAR instruction will load the destination operand with the "access rights" of the selector in the source operand if it contains a valid selector:

  lar ax, [bp+6]

The instructions LSL, LAR, VERR, and VERW are special because, even if the selector in the source operand is not valid, the instructions don't GP fault; instead, the zero flag is cleared. Therefore, if these instructions were available in a high-level language, we could construct protected-mode memory browsers and other utilities simply by looping over all possible selectors. This is an odd form of segment arithmetic:

  for (i=0; i<0xFFFF; i++)
      if lar(i) is valid
         print_selector(i)

It is easy to make the Intel protected-mode instructions available to C and other high-level languages, and they can be used interactively in IC. PROTMODE.ASM (Listing One, page 120) is a small library of functions, including lsl( ) and lar( ), that can be assembled into PROTMODE.OBJ by using either the Microsoft Assembler (Version 5.0 and later) or Turbo Assembler.

PROTMODE.ASM uses the DOSSEG directive, which simplifies writing assembly-language subroutines. It also uses the ENTER and LEAVE instructions, provided on the 80286 and higher, for working with high-level-language stack frames. These execute a little slower than the standard BP-SP prolog/epilog but create compact source code.

PROTMODE.ASM provides nothing more than a functional interface to the Intel protected-mode instructions. While completely nonportable with real mode, this module is highly portable among 16-bit protected-mode systems (it would require some modification for use with a 32-bit DOS extender). Once assembled into PROTMODE.OBJ, it can be linked into any 16-bit protected-mode program, including an OS/2 program. It can be loaded into IC:

  #loadobj "protmode.obj"

You need to supply stub definitions for the individual routines in an object module loaded into IC. These look almost like declarations or function prototypes, except that they are followed by the construct {extern;}. PROTMODE.H (Listing Two, page 120) is a C #include file that contains function prototypes for use with IC (#ifdef InstantC) or with any other 16-bit protected-mode environment.

Now it's time to test the functions. Let's allocate a 10K segment, and see what limit lsl( ) returns. Figure 7 uses the FP_SEG( ) macro from Microsoft's dos.h to extract the selector from the pointer p, and passes this to lsl( ); lsl( ) returns 10,239, which is clearly the last legal offset within a 10K segment, so lsl( ) seems to work. (Actually, there is an extremely obscure bug in lsl( ). You should be able to spot it by looking back over Listing One.)

Figure 7: Verify that lsl( ) works

# char *p;
# p = D16MemAlloc(10240)
     address 0C08:0000
# lsl(FP_SEG(p))
     10239 (0x27FF)

The verw( ) function, like the VERW instruction, returns TRUE if a selector can be written to, or FALSE if the selector is read-only:

  
# verw(FP_SEG(p)) 
   1

We can use a DOS/16M function to mark this segment as read-only and then see if verw( ) has picked up on the change in the selector attributes:

  
# D16SegProtect(p, 1) 
   0    
# verw(FP_SEG(p)) 
   0

The read-only attribute, like other aspects of the protected-mode "access rights," applies to a selector, not to the underlying block of memory. One selector can be read-only and another read/write, while both correspond to the same physical memory.

Having tested the PROTMODE.OBJ routines we can, as promised, write a simple loop to display all valid selectors within our program. In IC, of course, we can just type this in at the # prompt, as shown in Figure 8.

Figure 8: Display all valid selectors

  unsigned i;
  for (i=0; i<0xFFFF; i++)      //for all possible selectors
     if (lar(i))                //if a valid selector
         printf("%04X\n", i);   //print selector

This will display all valid selectors within a protected-mode program (not just a DOS/16M program). But to be genuinely useful we need to print out some additional information about the selectors. In addition to using several of the functions in PROTMODE.ASM, the code in BROWSE.C (Listing Three, page 120) also performs some manipulations on the selector number itself: The bottom two bits are extracted with the expression i & 3, and the third bit is extracted with the expression i & 4.

What?! A protected-mode selector, unlike a real-mode segment number, has no necessary relation to the segment's physical location in memory. A protected-mode selector closely resembles a file handle. It is almost a "magic cookie," but not exactly, in that the number itself actually has semantic meaning: A selector is a record comprised of three fields. The bottom two bits contain a protection level, zero (most privileged) through three (least privileged). The third bit from the right contains a "table indicator" -- zero means the selector belongs to the Global Descriptor Table (GDT), and one means it belongs to the Local Descriptor Table (LDT) -- and the remaining 13 bits form an index into this table. Thus, when applied to a protected-mode selector i, i & 3 extracts the selector's protection level, and i & 4 tells whether the selector is located in the GDT or LDT.
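The three selector fields just described can be pulled apart with a few shifts and masks. This is a minimal sketch in portable C; the function names are invented for illustration and do not appear in the listings.

```c
/* Decode the three fields of a 16-bit protected-mode selector.
   All names here are illustrative, not part of PROTMODE.ASM. */

/* Bits 0-1: protection level, 0 (most privileged) through 3 */
unsigned sel_level(unsigned sel)
{
    return sel & 3u;
}

/* Bit 2: table indicator, 0 = GDT, 1 = LDT */
int sel_in_ldt(unsigned sel)
{
    return (sel & 4u) != 0;
}

/* Bits 3-15: index of the descriptor within its table */
unsigned sel_index(unsigned sel)
{
    return (sel >> 3) & 0x1FFFu;
}

/* Reassemble a selector from its three fields */
unsigned make_selector(unsigned index, int in_ldt, unsigned level)
{
    return ((index & 0x1FFFu) << 3) | (in_ldt ? 4u : 0u) | (level & 3u);
}
```

For example, selector 0x0C08 decodes to index 0x181 in the GDT at level 0, and selector 0x003C decodes to index 7 in the LDT at level 0.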

Running under IC, a small part of the output from BROWSE.C is shown in Figure 9. The list runs on and on for quite a while. What is the value of this?

Figure 9: Output from BROWSE.C

0038  000000  LAR=93  LSL=FFFF  PL=00  VERR  VERW  GDT
003C  000000  LAR=93  LSL=FFFF  PL=00  VERR  VERW  LDT
0040  000400  LAR=93  LSL=0FFF  PL=00  VERR  VERW  GDT  TRANS
0044  000400  LAR=93  LSL=0FFF  PL=00  VERR  VERW  LDT
0048  034FE0  LAR=93  LSL=FFFF  PL=00  VERR  VERW  GDT
004C  034FE0  LAR=93  LSL=FFFF  PL=00  VERR  VERW  LDT
0050  110010  LAR=93  LSL=200F  PL=00  VERR  VERW  GDT
0054  110010  LAR=93  LSL=200F  PL=00  VERR  VERW  LDT
0058  032050  LAR=81  LSL=0067  PL=00              GDT
005C  032050  LAR=81  LSL=0067  PL=00              LDT
0060  FA0000  LAR=93  LSL=FFFF  PL=00  VERR  VERW  GDT
0064  FA0000  LAR=93  LSL=FFFF  PL=00  VERR  VERW  LDT
0068  100010  LAR=82  LSL=FFF8  PL=00              GDT

In contrast to real mode where every address you can form points somewhere, protected-mode memory is a sparse matrix. At any given time, most segment:offset combinations are not valid addresses: Dereferencing them causes a protection violation. Producing a list like this gives us an idea of the memory organization of a DOS extender program.

From this list, we can see that while protected-mode memory is a sparse matrix, it's not so sparse under DOS/16M as under OS/2. We can also see that all the entries are marked PL=00, indicating that everything is running at Ring 0. To double-check that this is so, the loop in Figure 10 represents the query, "Are any segments not at Ring 0?". Under IC, this produces no output: Everything is running at the most privileged protection level. But in OS/2, most of an application program's selectors would be displayed by this loop. This is one of the differences between DOS/16M and a full-blown protected-mode operating system such as OS/2. Because DOS/16M is just a shell to support one program at a time in protected mode, Rational Systems chose not to establish different protection levels.

Figure 10: Are any selectors not at Ring 0?

  for (i=0; i<0xFFFF; i++)
      if (lar(i) && (i & 3))
         printf("%04X PL=%02X\n", i, i & 3);

Along the same lines, the selectors you'll use in IC or in DOS/16M actually refer not to your program's LDT, but to the GDT. Because there is only one program running, the distinction between GDT and LDT, while crucial in a multitasking operating system such as OS/2, is fairly artificial in the "one program at a time" world of DOS/16M.

On the other hand, another DOS extender, Eclipse Computer Solutions' OS/286, while sharing many of the same goals as DOS/16M, makes a sharper distinction between the kernel (the OS/286 DOS extender itself) and the program supported by the DOS extender. OS/286 programs run at Ring 3, while OS/286 itself runs at Ring 0. This just shows that there are few fixed rules about how a DOS extender must be organized. Protected mode allows for a wide variety of styles in operating environments.

IC requires a large GDT partially to support many "transparent" selectors. For example, selector 0x40 has a physical base address of 0x400, corresponding to the BIOS data area. Using the same code from PROTMODE.ASM, it is trivial to form the query, "Which selectors are transparent?" (Figure 11.)

Figure 11: Are any selectors transparent?

  for (i=0; i<0xFFFF; i++)
      if (lar(i) && (i == D16AbsAddress
                           (MK_FP(i,0)) >> 4))
         printf("%04X", i);
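Why the comparison against the base address shifted right by 4 identifies a transparent selector follows from real-mode address arithmetic: a real-mode segment value names the paragraph at segment * 16. The sketch below states that relationship in portable C. The function names are invented for illustration, and the test is written in the opposite direction from Figure 11 (selector * 16 against the base), which gives the same answer whenever the base is paragraph-aligned.

```c
/* Physical address that a real-mode segment:offset pair refers to. */
unsigned long real_mode_address(unsigned seg, unsigned off)
{
    return ((unsigned long) seg << 4) + off;
}

/* A selector is "transparent" if treating its value as a real-mode
   segment number yields the segment's actual physical base, so that
   real-mode far pointers using that value still work in protected mode. */
int is_transparent(unsigned sel, unsigned long base)
{
    return real_mode_address(sel, 0) == base;
}
```

With the values from Figure 9, is_transparent(0x40, 0x400) is true (the BIOS data area), while is_transparent(0x44, 0x400) is false even though both selectors map the same memory.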

Examining the Protected-Mode Descriptor Tables

We have already used an indirect method to examine the DOS/16M memory map: Loop over all possible selectors and see if they're legal. We can also directly examine the GDT, IDT, and LDT.

PROTMODE.ASM contains a functional interface to the SGDT instruction. SGDT expects a pointer to 6 bytes of storage (a FWORD PTR), into which it copies the contents of the CPU's GDT register (GDTR). The GDTR holds the 24-bit physical base address and 16-bit limit of the GDT, corresponding to the C structure in Figure 12. (Note that this, like most structures in this article, requires byte alignment; in IC, _struct_alignment = 1; in batch compilers such as Microsoft C, use #pragma pack(1).) This structure, along with sgdt( ), is used in Figure 13 to get the physical base address (0x100010) and limit (0xFFF8) of the GDT.

Figure 12: The GDTR represented in C

  typedef struct {
      unsigned      limit, lo;
      unsigned char hi, reserved;
      } GDTR;

Figure 13: Finding the location and size of the GDT

  # GDTR g;
  # sgdt(&g)
  # g
           struct at 2F1C {
           limit = 65528 (0xFFF8);
           lo = 16 (0x10);
           hi = '\020' (0x10);
           reserved = '\0';}
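The base address 0x100010 is assembled from the hi and lo fields: hi supplies bits 16 through 23 and lo supplies bits 0 through 15. As a sketch that avoids any dependence on structure packing, the function below decodes the same 6-byte image from a raw byte array; the name decode_gdtr( ) is invented for illustration. The (long) MK_FP(g.hi, g.lo) expression in Figure 15 arrives at the same value on a 16-bit compiler.

```c
/* Decode the 6-byte image stored by SGDT on the 286: a 16-bit limit
   followed by a 24-bit physical base, both little-endian. The sixth
   byte is reserved and ignored here. Illustrative helper only. */
void decode_gdtr(const unsigned char raw[6],
                 unsigned *limit, unsigned long *base)
{
    *limit = (unsigned) raw[0] | ((unsigned) raw[1] << 8);
    *base  = (unsigned long) raw[2]
           | ((unsigned long) raw[3] << 8)
           | ((unsigned long) raw[4] << 16);
}
```

Fed the bytes F8 FF 10 00 10 00, which correspond to the field values displayed in Figure 13, this yields a limit of 0xFFF8 and a base of 0x100010.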

Now we need to map this into our address space. A protected-mode descriptor table is an array of 8-byte segment descriptors. Each descriptor contains the 24-bit physical base address and 16-bit limit for the segment, as well as an access-rights byte. There is also a 2-byte field used in 32-bit protected mode on the 386. All this can be expressed in C, as shown in Figure 14. After typing or loading this structure definition into IC, we can create a pointer to the GDT (Figure 15).

Figure 14: The protected-mode descriptor represented in C

  typedef struct {
     unsigned      limit;    //size minus 1
     unsigned      addr_lo;  //physical base addr - paragraph.byte
     unsigned char addr_hi;  //physical base addr - megabyte
     unsigned char access;   //see ACCESS_RIGHTS below
     unsigned      reserved; //for 386 (32-bit)
     } DESCRIPTOR;

Figure 15: Mapping the GDT into our address space

# DESCRIPTOR *gdt; //GDT is array of DESCRIPTOR
# gdt = D16SegAbsolute((long) MK_FP(g.hi, g.lo), g.limit + 1)
      address 0C08:0000

Now that we have a pointer to the GDT, let's make it read-only to make sure we don't mess anything up (though, if you were working in a protected-mode environment that didn't have convenient functions for changing selector attributes, you might actually want to write to the GDT!):

# D16SegProtect(gdt, 1);

If the table-indicator bit of a selector (the third bit from the right) shows that it belongs to the GDT, then the top 13 bits of the selector can be used as an index into the GDT. Take the example of the GDT pointer itself. In Figure 16, we locate our new descriptor for the GDT within the GDT. (Got it?)

Figure 16: Finding the GDT selector within the GDT

# gdt[FP_SEG(gdt) >> 3]
      struct at 0C08:0C08 {
         limit = 65527 (0xFFF7);
         addr_lo = 16 (0x10);
         addr_hi = '\020' (0x10);
         access = 'Q' (0x91);
         reserved = 0;}

Figure 16 confirms what we already know about the GDT: Its physical base address is 0x100010 and its limit is 0xFFF7. We could now dispense with D16SegAbsolute( ) and D16SegLimit( ), and write portable protected-mode code.

To get a pointer to the GDT generally requires that you use some special facility within your protected-mode environment. We used D16SegAbsolute( ) here, which obviously won't work outside DOS/16M. However, once you do have a pointer to the GDT, you can write completely portable protected-mode code. For example, I will snarf a lot of this code for a forthcoming DDJ article that features a GDT browser for OS/2.
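As a rough illustration of what such portable code could look like, the sketch below looks up a selector's descriptor in a descriptor table that has already been mapped into the program's address space, and decodes the base, limit, and access-rights byte from the raw 8 bytes. The SegInfo type and lookup_descriptor( ) are hypothetical names, and the table is treated as a plain byte array so that the code does not depend on structure packing.

```c
/* Decoded view of one 286 segment descriptor (illustrative type). */
typedef struct {
    unsigned      limit;    /* size minus 1 */
    unsigned long base;     /* 24-bit physical base address */
    unsigned char access;   /* access-rights byte */
} SegInfo;

/* Find the 8-byte descriptor that sel indexes within table and decode it.
   table_limit is the table's own limit (highest legal offset).
   Returns 1 on success, 0 if the descriptor lies beyond the table. */
int lookup_descriptor(const unsigned char *table, unsigned long table_limit,
                      unsigned sel, SegInfo *out)
{
    unsigned long off = (unsigned long) (sel >> 3) * 8;  /* index * 8 */
    const unsigned char *d;

    if (off + 7 > table_limit)
        return 0;
    d = table + off;
    out->limit  = (unsigned) d[0] | ((unsigned) d[1] << 8);
    out->base   = (unsigned long) d[2]
                | ((unsigned long) d[3] << 8)
                | ((unsigned long) d[4] << 16);
    out->access = d[5];
    return 1;   /* bytes 6-7 are reserved on the 286 and are skipped */
}
```

Applied to a table holding the descriptor shown in Figure 16, a lookup of selector 0x0C08 would report limit 0xFFF7, base 0x100010, and access 0x91.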

What about the "access rights" value that the CPU in protected mode uses to ensure proper use of selector 0x0C08? We can use the C bitfield in Figure 17 to display the individual fields that make up the access-rights value 0x91. The C bit field structure is wonderfully nonportable, so if using the structure in Figure 17, you should check your compiler's ordering of bit fields and make sure the structure is byte aligned.

Figure 17: The protected-mode access-rights byte in C

  typedef struct access {
  unsigned accessed     :1;   //has segment been accessed?
  unsigned read_write   :1;   //if data 1=write; if code 1=read
  unsigned conf_exp     :1;   //expansion direction
  unsigned code_data    :1;   //0 = data, 1 = code
  unsigned xsystem      :1;   //0 = system descriptor
  unsigned dpl          :2;   //protection level: 0..3
  unsigned present      :1;   //is segment in memory?
  } ACCESS_RIGHTS;

  # *((ACCESS_RIGHTS*) &gdt[FP_SEG(gdt) >>3].access)
                struct access at 0C08:0C0D {
            accessed : 1 = 1;        //it's been used
            read_write : 1 = 0;      //it's read-only
            conf_exp : 1 = 0;        //it's not a stack
            code_data : 1 = 0;       //it's data
            xsystem : 1 = 1;         //it's not a system descriptor
            dpl : 2 = 0;             //protection level 0
            present : 1 = 1;}        //it's present in memory

While DOS/16M (and, consequently, IC) doesn't make much use of the LDT, this table is crucial in other protected-mode environments. Getting the LDT selector is simple:

  
DESCRIPTOR far *ldt;   
ldt = MK_FP(sldt(), 0);

If this pointer is not valid within your address space (!verr(sldt())), you can instead look up your LDT's descriptor within the GDT:

  
gdt[sldt( ) >> 3]

and then map the absolute address you find there into your address space.

Now that we have these structures, we can write a function, sel( ), to display selector attributes. Note that Listing Four, page 120 (SEL.C), contains no references to DOS/16M. sel( ) can be used to examine the selector for any pointer, such as sel( )'s own function pointer. The attributes display indicates that this is readable code running at protection level zero:

  
# sel(sel)   
SEL=0A68 ADDR=472BB0    
	LIMIT=43FF ACCESS=0ar-c-p
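Because the ordering of C bit fields is compiler-dependent, the same attribute display can also be produced with explicit masks. The following is a sketch rather than the code from Listing Four: the AR_ names and access_string( ) are invented, and the 'p' shown for a present segment is inferred from the sample output above.

```c
/* Bit masks for the 286 access-rights byte (illustrative names) */
#define AR_ACCESSED    0x01u   /* has segment been accessed? */
#define AR_READ_WRITE  0x02u   /* data: writable; code: readable */
#define AR_CONF_EXP    0x04u   /* code: conforming; data: expand-down */
#define AR_CODE        0x08u   /* 0 = data, 1 = code */
#define AR_NONSYSTEM   0x10u   /* 0 = system descriptor */
#define AR_PRESENT     0x80u   /* is segment in memory? */

/* Format an access-rights byte as a file-attribute-style string:
   protection level digit, then accessed, read/write,
   conforming/expand-down, code/data, system, present.
   out must have room for 8 characters. */
void access_string(unsigned acc, char out[8])
{
    int code = (acc & AR_CODE) != 0;

    out[0] = (char) ('0' + ((acc >> 5) & 3u));      /* bits 5-6: level */
    out[1] = (acc & AR_ACCESSED)   ? 'a' : '-';
    out[2] = (acc & AR_READ_WRITE) ? (code ? 'r' : 'w') : '-';
    out[3] = (acc & AR_CONF_EXP)   ? (code ? 'f' : 'e') : '-';
    out[4] = code ? 'c' : 'd';
    out[5] = (acc & AR_NONSYSTEM)  ? '-' : 's';
    out[6] = (acc & AR_PRESENT)    ? 'p' : '-';
    out[7] = '\0';
}
```

An access-rights byte of 0x9B (present, level 0, accessed, readable code) formats as 0ar-c-p, matching the display above; the read-only data value 0x91 from Figure 17 formats as 0a--d-p.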

All these data structures are described in the Intel literature on 286 and 386 protected mode. Seeing them come to life in IC, though, is a great aid to understanding protected mode.

One of the tricks of protected-mode programming is to acquire an in-depth knowledge of these data structures and then, when programming, to forget about them. The operating environment takes care of maintaining the GDT, the descriptors within the GDT, and the access-rights bytes within the descriptors. The CPU will take care of using these data structures to maintain the integrity of the system. You're better off not thinking too closely about them, but it does seem to help to have been familiar with them at some point or other. Having an interactive environment like IC is a great aid to gaining this familiarity.

Product Information

Instant-C
Rational Systems Inc.
220 N. Main St.
Natick, MA 01760
508-653-6006
Requires DOS 2.0 or above, an 80286 or 80386 CPU, and at least 1 Mbyte of memory. Works with medium and large memory models.
Price: $795

_EXAMINING INSTANT-C_ by Andrew Schulman

[LISTING ONE]


;       protmode.asm -- 286 protected-mode instructions
;       requires MASM 5.0 or higher or TASM
;       masm -ml protmode;
;       or, tasm -ml protmode;

        dosseg

        .286p
        .model large
        .code

        public  _lsl, _lar, _verr, _verw, _sgdt, _sidt, _sldt

;       extern unsigned far lsl(unsigned short sel);
;       input:   selector
;       output:  if valid and visible at current protection level,
;                   return segment limit (which is 0 for 1-byte seg!)
;                else
;                   return 0
;
_lsl    proc
        enter   0, 0
        sub     ax, ax
        lsl     ax, [bp+6]
        leave
        ret
_lsl    endp

;       extern unsigned short far lar(unsigned short sel);
;       input:   selector
;       output:  if valid and visible at current protection level,
;                   return access rights (which will never be 0)
;                else
;                   return 0
;
_lar    proc
        enter   0, 0
        sub     ax, ax
        lar     ax, [bp+6]
        shr     ax, 8
        leave
        ret
_lar    endp

;       extern BOOL far verr(unsigned short sel);
;       input:   selector
;       output:  valid for reading ? 1 : 0
;
_verr   proc
        enter   0, 0
        mov     ax, 1
        verr    word ptr [bp+6]
        je      short verr_okay
        dec     ax
verr_okay:
        leave
        ret
_verr   endp

;       extern BOOL far verw(unsigned short sel);
;       input:   selector
;       output:  valid for writing ? 1 : 0
;
_verw   proc
        enter   0, 0
        mov     ax, 1
        verw    word ptr [bp+6]
        je      short verw_okay
        dec     ax
verw_okay:
        leave
        ret
_verw   endp

;       extern void far sgdt(void far *gdt);
;       input:   far ptr to 6-byte structure
;       output:  fills structure with GDTR
;
_sgdt   proc
        enter 0, 0
        les   bx, dword ptr [bp+6]
        sgdt  fword ptr es:[bx]
        leave
        ret
_sgdt   endp

;       extern void far sidt(void far *idt);
;       input:   far ptr to 6-byte structure
;       output:  fills structure with IDTR
;
_sidt   proc
        enter 0, 0
        les   bx, dword ptr [bp+6]
        sidt  fword ptr es:[bx]
        leave
        ret
_sidt   endp

;
;       extern unsigned short sldt(void);
;       input:   none
;       output:  Local Descriptor Table register (LDTR)
;
_sldt   proc
        sldt  ax
        ret
_sldt   endp

        end




[LISTING TWO]

/* PROTMODE.H */
typedef enum { FALSE, TRUE } BOOL;
#ifdef InstantC
unsigned far lsl(unsigned short sel)        {extern;}
unsigned short far lar(unsigned short sel)  {extern;}
BOOL far verr(unsigned short sel)           {extern;}
BOOL far verw(unsigned short sel)           {extern;}
void far sgdt(void far *gdt)                {extern;}
void far sidt(void far *idt)                {extern;}
unsigned short sldt(void)                   {extern;}
#else
extern unsigned far lsl(unsigned short sel);
extern unsigned short far lar(unsigned short sel);
extern BOOL far verr(unsigned short sel);
extern BOOL far verw(unsigned short sel);
extern void far sgdt(void far *gdt);
extern void far sidt(void far *idt);
extern unsigned short sldt(void);
#endif



[LISTING THREE]

/* BROWSE.C */

#ifdef InstantC
#loadobj "protmode.obj"
#endif
#include <stdio.h>
#include "dos16.h"      // DOS/16M declarations such as D16AbsAddress()
#include "protmode.h"

void browse()
{
    unsigned long addr;
    unsigned i, acc;
    for (i=0; i<0xFFFF; i++)        // for all possible selectors
        if (acc = lar(i))           // if a valid selector
        {
            addr = D16AbsAddress(MK_FP(i,0));
            printf("%04X %06lX LAR=%02X LSL=%04X PL=%02X %s %s %s %s\n",
                i,                              // selector
                addr,                           // physical base addr
                acc,                            // access-rights byte
                lsl(i),                         // segment limit
                i & 3,                          // protection level
                verr(i) ? "VERR" : "    ",      // readable?
                verw(i) ? "VERW" : "    ",      // writeable?
                i & 4 ? "LDT" : "GDT",          // which table?
                i == addr >> 4 ? "TRANS" : ""); // transparent?
        }
}




[LISTING FOUR]

/* SEL.C */

void sel(void far *fp)
{
    extern DESCRIPTOR far *gdt;
    extern DESCRIPTOR far *ldt;
    unsigned seg = FP_SEG(fp);
    unsigned index = seg >> 3;
    DESCRIPTOR far *dt = (seg & 4) ? ldt : gdt; // table indicator: bit 2 set = LDT

    ACCESS_RIGHTS *pacc = (ACCESS_RIGHTS *) &dt[index].access;
    printf("SEL=%04X ADDR=%02X%04X LIMIT=%04X ACCESS=%d%c%c%c%c%c%c\n",
        seg, dt[index].addr_hi, dt[index].addr_lo, dt[index].limit,
        // display access rights as if they were file attributes:
        pacc->dpl,
        pacc->accessed ? 'a' : '-',
        pacc->read_write ? ((pacc->code_data) ? 'r' : 'w') : '-',
        pacc->conf_exp ? ((pacc->code_data) ? 'f' : 'e') : '-',
        pacc->code_data ? 'c' : 'd',
        pacc->xsystem ? '-' : 's',
        pacc->present ? 'p' : '-');
}


[FIGURE 1]

        # char *p;
        # #include <malloc.h>
        MALLOC.H included
        # p = malloc(10240)
            address 03B8:01A6: ""
        # #include "dos16.h"
        DOS16.H included
        # D16AbsAddress(p)
            2940246 (0x2CDD56)
        # #include <dos.h>
        DOS.H included
        # int handle;
        # _dos_open("\\msc\\inc\\dos.h", 0, &handle)
            0
        # handle
            7
        # unsigned bytes;
        # _dos_read(handle, p, 10240, &bytes)
            0
        # bytes
            5917 (0x171D)
        # _dos_close(handle)
            0
        # printf("%s\n", p)     // display file



[FIGURE 2]

        # char *s;  // oops: forgot to initialize
        # atoi(s)
         ## 492: Invalid address 0AA8:4582 at __CATOX+000E
        # #reset


[FIGURE 3]

        # char *p;
        # p = malloc(2013)
            address 03B8:01A6: ""
        # p[2013] = 'x'
            'x' (0x78)


[FIGURE 4]

        # char *q;
        # q = D16SegAbsolute(D16AbsAddress(p), 2013);
            0C08:0000

[FIGURE 5]

        # q[2013]
         ## 492: Invalid address 0BF8:002A in command line
        # D16AbsAddress(p) == D16AbsAddress(q)
            1
        # D16SegLimit(p) == D16SegLimit(q)
            0
        # D16SegLimit(p)
            24575 (0x5FFF)
        # D16SegLimit(q)
            2012 (0x7DC)
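
The difference between Figures 3 and 5 comes down to the limit check that protected mode applies to every offset. As a rough sketch in portable C (this is not Instant-C or DOS/16M code; the function name is made up for illustration, and it models only a one-byte access to an ordinary expand-up segment), the test amounts to:

```c
#include <stdint.h>

// Hypothetical helper, for illustration only.
// Returns 1 if a one-byte access at `offset` falls inside a segment
// whose limit (size minus 1) is `limit`, and 0 otherwise.
// Models expand-up segments only.
int offset_within_limit(uint16_t offset, uint16_t limit)
{
    return offset <= limit;
}
```

With the limits reported in Figure 5, offset 2013 passes the check against p's limit of 24575 but fails it against q's limit of 2012, which is consistent with only q[2013] being rejected.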


[FIGURE 6]


        LSL  (load segment limit) -- fetch size of a segment
        LAR  (load access rights) -- fetch access rights of segment
        VERR (verify read) -- can segment be peeked?
        VERW (verify write) -- can segment be poked?
        SGDT (store GDT) -- fetch base address of GDT
        SIDT (store IDT) -- fetch base addr of IDT
        SLDT (store LDT) -- fetch selector to LDT


[FIGURE 7]

        # char *p;
        # p = D16MemAlloc(10240)
            address 0C08:0000
        # lsl(FP_SEG(p))
            10239 (0x27FF)

[FIGURE 8]

        unsigned i;
        for (i=0; i<0xFFFF; i++)      // for all possible selectors
            if (lar(i))               // if a valid selector
                printf("%04X\n", i);  // print selector


[FIGURE 9]

        0038 000000 LAR=93 LSL=FFFF PL=00 VERR VERW GDT
        003C 000000 LAR=93 LSL=FFFF PL=00 VERR VERW LDT
        0040 000400 LAR=93 LSL=0FFF PL=00 VERR VERW GDT TRANS
        0044 000400 LAR=93 LSL=0FFF PL=00 VERR VERW LDT
        0048 034FE0 LAR=93 LSL=FFFF PL=00 VERR VERW GDT
        004C 034FE0 LAR=93 LSL=FFFF PL=00 VERR VERW LDT
        0050 110010 LAR=93 LSL=200F PL=00 VERR VERW GDT
        0054 110010 LAR=93 LSL=200F PL=00 VERR VERW LDT
        0058 032050 LAR=81 LSL=0067 PL=00           GDT
        005C 032050 LAR=81 LSL=0067 PL=00           LDT
        0060 FA0000 LAR=93 LSL=FFFF PL=00 VERR VERW GDT
        0064 FA0000 LAR=93 LSL=FFFF PL=00 VERR VERW LDT
        0068 100010 LAR=82 LSL=FFF8 PL=00           GDT
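
The PL and GDT/LDT columns above are read straight out of the selector value. A minimal sketch in portable C of that split follows; the struct and function names are assumptions introduced here, not part of Instant-C or DOS/16M:

```c
#include <stdint.h>

// Hypothetical names, for illustration only.
typedef struct {
    unsigned index;   // descriptor-table index (bits 3..15)
    unsigned in_ldt;  // table indicator (bit 2): 0 = GDT, 1 = LDT
    unsigned rpl;     // requested privilege level (bits 0..1)
} SELECTOR_FIELDS;

// Split a 16-bit protected-mode selector into its three fields.
SELECTOR_FIELDS split_selector(uint16_t sel)
{
    SELECTOR_FIELDS f;
    f.index  = sel >> 3;
    f.in_ldt = (sel >> 2) & 1;
    f.rpl    = sel & 3;
    return f;
}
```

For selector 003C, for example, this yields index 7 with the table-indicator bit set, matching the LDT label on that line; the same shift by 3 is what Figure 16 uses to index the descriptor table.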


[FIGURE 10]

        for (i=0; i<0xFFFF; i++)
            if (lar(i) && (i & 3))
                printf("%04X PL=%02X\n", i, i & 3);

[FIGURE 11]

        for (i=0; i<0xFFFF; i++)
            if (lar(i) && (i == D16AbsAddress(MK_FP(i,0)) >> 4))
                printf("%04X ", i);
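
The comparison in this loop can be isolated as a small predicate. The sketch below is portable C with a made-up function name; it takes the physical base address as a parameter instead of calling D16AbsAddress():

```c
#include <stdint.h>

// Hypothetical helper, for illustration only.
// A selector is "transparent" when its value equals the real-mode
// paragraph address (physical base divided by 16) of its segment.
int is_transparent(uint16_t sel, uint32_t base)
{
    return (uint32_t)sel == (base >> 4);
}
```

Using the values from Figure 9, selector 0040 with base 000400 satisfies the test (0x400 >> 4 is 0x40), while selector 0044, which has the same base, does not.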


[FIGURE 12]

        typedef struct {
            unsigned      limit, lo;
            unsigned char hi, reserved;
            } GDTR;


[FIGURE 13]

        # GDTR g;
        # sgdt(&g)
        # g
           struct  at 2F1C {
              limit = 65528 (0xFFF8);
              lo = 16 (0x10);
              hi = '\020' (0x10);
              reserved = '\0';}
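
The lo and hi fields together form one 24-bit physical address. A minimal sketch of the arithmetic in portable C (the function name is an assumption made for illustration):

```c
#include <stdint.h>

// Hypothetical helper, for illustration only.
// Combine the low 16 bits and high 8 bits of a 286 base address
// into a single 24-bit physical address.
uint32_t base_address(uint16_t lo, uint8_t hi)
{
    return ((uint32_t)hi << 16) | lo;
}
```

For the values displayed above (lo = 0x10, hi = 0x10), the result is 0x100010, the same base that Figure 9 shows for selector 0068.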


[FIGURE 14]

        typedef struct {
            unsigned      limit;    // size minus 1
            unsigned      addr_lo;  // physical base addr - paragraph.byte
            unsigned char addr_hi;  // physical base addr - megabyte
            unsigned char access;   // see ACCESS_RIGHTS below
            unsigned      reserved; // for 386 (32-bit)
            } DESCRIPTOR;


[FIGURE 15]

        # DESCRIPTOR *gdt;      // GDT is array of DESCRIPTOR
        # gdt = D16SegAbsolute((long) MK_FP(g.hi, g.lo), g.limit + 1)
            address 0C08:0000


[FIGURE 16]

        # gdt[FP_SEG(gdt) >> 3]
           struct  at 0C08:0C08 {
              limit = 65527 (0xFFF7);
              addr_lo = 16 (0x10);
              addr_hi = '\020' (0x10);
              access = 'Q' (0x91);
              reserved = 0;}


[FIGURE 17]

        typedef struct access {
            unsigned accessed   : 1;  // has segment been accessed?
            unsigned read_write : 1;  // if data 1=write; if code 1=read
            unsigned conf_exp   : 1;  // if data 1=expand down; if code 1=conforming
            unsigned code_data  : 1;  // 0 = data, 1 = code
            unsigned xsystem    : 1;  // 0 = system descriptor
            unsigned dpl        : 2;  // protection level: 0..3
            unsigned present    : 1;  // is segment in memory?
            } ACCESS_RIGHTS;

        # *((ACCESS_RIGHTS *) &gdt[FP_SEG(gdt) >> 3].access)
           struct access at 0C08:0C0D {
              accessed : 1 = 1;       // it's been used
              read_write : 1 = 0;     // it's read-only
              conf_exp : 1 = 0;       // it's not a stack
              code_data : 1 = 0;      // it's data
              xsystem : 1 = 1;        // it's not a system descriptor
              dpl : 2 = 0;            // protection level 0
              present : 1 = 1;}       // it's present in memory
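
Because the layout of C bit-fields is left to the compiler, the same decoding can also be written with shifts and masks. The following is a sketch in portable C, not code from the article; the struct and function names are assumptions, and the bit positions follow the field order of ACCESS_RIGHTS above:

```c
#include <stdint.h>

// Hypothetical names, for illustration only.
typedef struct {
    unsigned accessed, read_write, conf_exp, code_data,
             xsystem, dpl, present;
} ACCESS_FIELDS;

// Decode an access-rights byte without relying on bit-field layout.
ACCESS_FIELDS decode_access(uint8_t a)
{
    ACCESS_FIELDS f;
    f.accessed   =  a       & 1;  // bit 0
    f.read_write = (a >> 1) & 1;  // bit 1
    f.conf_exp   = (a >> 2) & 1;  // bit 2
    f.code_data  = (a >> 3) & 1;  // bit 3
    f.xsystem    = (a >> 4) & 1;  // bit 4: 0 = system descriptor
    f.dpl        = (a >> 5) & 3;  // bits 5..6
    f.present    = (a >> 7) & 1;  // bit 7
    return f;
}
```

Applied to the access byte 0x91 from Figure 16, this gives the same field values as the display above; applied to 0x93, the value on most lines of Figure 9, only read_write changes, to 1.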









