UNDOCUMENTED DOS
Uncover the mysteries behind undocumented DOS calls to break the 32-Mbyte drive partition barrier
Rahner James
Rahner James is an independent consultant living near Sacramento, Calif. He can be reached by calling 916-722-1939 or through CompuServe 71450, 757.
MS-DOS is a fairly well-documented operating system. A person only needs to browse through the computer section of their local bookstore to get an idea about how much has been written about MS-DOS. Powerful manuals for powerful programmers developing powerful tools for powerful users.
After readers have shaken off the giddiness from all of that power going to their heads, they begin to notice evidence of a mystery -- there are gaps in the numbering sequence for system calls. At first glance, it's as if Rose Mary Woods went to work for Microsoft after her White House experience. At second glance, an interested party can see that DOS does indeed respond to the numbers that fill the gaps, but the purpose of those numbers may be unclear. This article examines one of those undocumented system calls, DOS function 52h, which returns a pointer to an internal buffer that contains pointers to various DOS structures. Once I've covered the system call, I'll show you how it can be used to create drive partitions greater than 32 Mbytes. I have also included a subroutine that can be added to any DOS block device driver to allow for the larger drive partitions.
Using Undocumented Features
The first reaction to the idea of using any undocumented commands should probably be one of skepticism. Why would anyone use an undocumented feature in their program? If the feature is undocumented, then the manufacturer has no obligation to keep it the same from version to version. Any program that uses such a call might have to be changed every time a new version of DOS is released! In theory, this is true -- but when marketing realities meet with theory, the bigger word wins.
The fact is that most undocumented calls were written to allow upper layers of the operating system to access the lower layers, while still maintaining a standard interface. Because it is easier and more economical to continue using old system call structures, there is a high probability of little or no change to the major undocumented calls.
The undocumented system calls usually deal with low-level operating system data structures. Normally, when a programmer wants low-level partition information that cannot be found through the standard calls, he or she writes a routine to read the partition boot record from the device, and then gleans the required data from the record. Unfortunately, a DOS block device is not required to have a partition boot record, and any assumption about the data structures on the first logical sector is bound to be wrong in several cases. Any data structure that is left up to the discretion of the developer should be considered more troublesome than a data structure that has been defined and used by Microsoft -- if the developers are given structural leeway by Microsoft, it should be assumed that they will take it.
MS-DOS Function 52h
DOS function 52h is probably the most useful of the undocumented system calls because it contains a great deal of information. The purpose of this function call is to get the internal buffer pointers. The call returns a pointer to a table of pointers and data. This table describes most of the DOS internal structures that are associated with the storage subsystem. Figure 1 describes the calling sequence used in calls to DOS function 52h.
The function returns a far pointer to an internal DOS buffer. The structure of this buffer differs depending upon the major version of DOS. Although I have not tested it, function 52h is probably not supported in Version 1.xx.
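A note on the returned pointer: the call leaves its result in ES:BX, and every "far pointer" in the tables that follow is stored in memory as an offset word followed by a segment word. The Python sketch below is only a model of that arithmetic, not DOS code. The function names and the sample ES:BX value are made up for illustration.

```python
import struct

def far_to_linear(segment, offset):
    """Real-mode address: segment * 16 + offset (the 8086 has 20 address lines)."""
    return ((segment << 4) + offset) & 0xFFFFF

def read_far_ptr(memory, linear):
    """Read a far pointer stored little-endian as offset word, then segment word."""
    offset, segment = struct.unpack_from("<HH", memory, linear)
    return segment, offset

# Hypothetical value of ES:BX after the call, chosen only for illustration.
es, bx = 0x0253, 0x0026
table_address = far_to_linear(es, bx)

# Four bytes as they would sit in memory for the far pointer 5678:1234.
stored = bytes([0x34, 0x12, 0x78, 0x56])
segment, offset = read_far_ptr(stored, 0)
```

Keeping the segment and offset separate, rather than collapsing them at once into a linear address, mirrors how the assembly listing walks the structures with LES.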
Table 1 shows the buffer structure for MS-DOS, Version 2.xx.
Table 1: The buffer structure for MS-DOS, Version 2
Offset  Size   Description
------------------------------------------------------------------------
-2      word   Segment value of the first memory block available
               through the Allocate Memory call.
0       dword  Far pointer to the first MS-DOS Disk Parameter Block.
               This structure is defined after this table as the Disk
               Parameter Structure.
4       dword  Far pointer to a linked list of MS-DOS open file tables.
               This structure is defined after this table as the Open
               File Table List.
8       dword  Far pointer to the beginning of the CLOCK$ device driver.
0Ch     dword  Far pointer to the beginning of the CON: device driver.
10h     byte   Contains the number of logical drives presently
               supported by the system.
11h     word   This contains the size of the largest logical sector
               size supported by the system.
13h     dword  Far pointer to the first sector buffer structure used by
               the logical disks. The size of each sector buffer is
               equal to the largest logical sector size plus a sixteen
               byte header. This header is defined after this table as
               the Sector Buffer Structure. The number of these buffers
               is defined by the 'BUFFERS=xx' entry in the CONFIG.SYS
               file.
17h            This is the start of the first device driver (NUL).
The buffer structure for Version 3.xx is identical to 2.xx up to offset 10h. The buffer structure for 3.xx, beginning with byte 10h, is shown in Table 2.
Table 2: The buffer structure for MS-DOS, Version 3, beginning with byte 10h
Offset  Size   Description
------------------------------------------------------------------------
10h     word   Contains the size of the largest logical sector size
               supported by the system.
12h     dword  Far pointer to first sector buffer structure used by the
               logical disks. This is defined after this table as the
               Sector Buffer Structure.
16h     dword  Far pointer to the drive path and seek information
               table. This structure is defined after this table as the
               Drive Path Table.
1Ah     dword  Far pointer to a table of FCBs. This table is only valid
               if the "FCBS=xx" command is used in CONFIG.SYS.
1Eh     word   Size of FCB table.
20h     byte   Contains the number of logical drives presently
               supported by the system.
21h     byte   The value of "LASTDRIVE=xx" in CONFIG.SYS. The default
               value is 5.
22h            This is the start of the first device driver (NUL).
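Because three fields move between the two layouts, code that reads them has to switch on the major DOS version. The following Python sketch models that lookup over a byte string that starts at the address returned in ES:BX. The offsets come from Tables 1 and 2; the dictionary, the function name, and the fake table contents are illustrative assumptions.

```python
import struct

# Offsets from Tables 1 and 2, relative to the pointer returned in ES:BX.
LAYOUT = {
    2: {"num_drives": 0x10, "sector_size": 0x11, "buffer_ptr": 0x13},
    3: {"num_drives": 0x20, "sector_size": 0x10, "buffer_ptr": 0x12},
}

def read_storage_fields(table, major_version):
    """Return the fields whose position depends on the DOS major version."""
    layout = LAYOUT[major_version]
    (num_drives,) = struct.unpack_from("<B", table, layout["num_drives"])
    (sector_size,) = struct.unpack_from("<H", table, layout["sector_size"])
    buf_off, buf_seg = struct.unpack_from("<HH", table, layout["buffer_ptr"])
    return {
        "num_drives": num_drives,
        "sector_size": sector_size,
        "first_buffer": (buf_seg, buf_off),
    }

# A fake Version 3.xx table with made-up contents, just to exercise the code.
fake_v3 = bytearray(0x22)
struct.pack_into("<H", fake_v3, 0x10, 512)              # largest sector size
struct.pack_into("<HH", fake_v3, 0x12, 0x0000, 0x0400)  # first buffer at 0400:0000
fake_v3[0x20] = 3                                       # three logical drives
fields = read_storage_fields(fake_v3, 3)
```

In both layouts the buffer pointer sits two bytes after the sector size word, which is what lets the assembly routine in Listing One share one code path after picking the right offset.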
Disk Parameter Structure
Each disk parameter block entry contains basic information about the logical disk drives attached to the DOS system. The structure of the Disk Parameter Block entry is similar to the structure of the BIOS Parameter Block (BPB) described in the DOS Technical Reference Manual. The use of the Disk Parameter Block structure to obtain information about the physical constraints of the disk storage subsystem has several advantages.
In several of the disk-related applications that I have worked on over the years, the programmer reads the BIOS Parameter Block off logical partition sector 0 in order to determine the physical characteristics of the disk partition. The assumption was that all block devices followed the IBM partition structure definition. Unfortunately, that definition is not a standard, and turns out to be version- and implementation-dependent. In contrast, the disk parameter structure has not changed since MS-DOS, Version 2.00 (although no testing has been done on Version 4.00), and contains all of the necessary information (short of the physical disk drive characteristics).
Another advantage of using the disk parameter structure is that some of the other DOS calls that are used to obtain logical drive information access the drive to check for readiness. In some cases, the normal state for the drive is to be not ready. If the device is not ready, repeated retries delay the return from the call, and extra programming is necessary in order to handle the unintentional error that is returned.
In addition to these informational advantages, the disk parameter structure offers a feature that is not available through any other system call -- the "media flag." The media flag indicates whether the device has been read. If the programmer sets the flag to -1, DOS forces a reread of basic device information from the device. This approach is particularly useful if the programmer modifies the FAT and wants DOS to update its internal structures with the updated table.
The structure for the disk parameter block is shown in Table 3.
Table 3: The structure for the disk parameter block
Offset  Size   Description
------------------------------------------------------------------------
0       byte   Disk unit number, basically this is the drive number
               zero based (0=A, 1=B, etc.). If this and the next byte
               are 0FFh then this entry is the end of the linked list
               and invalid.
1       byte   Disk unit number passed to the block device driver
               responsible for this logical drive.
2       word   The logical sector size of this drive in bytes.
4       byte   Number of sectors per cluster - 1. The number of sectors
               per cluster must be a power of two.
5       byte   Allocation shift. It is the shift value used to
               calculate number of sectors from number of clusters
               without having to use division. Number of sectors =
               number of clusters << allocation shift.
6       word   Reserved sectors. This is the number of reserved sectors
               at the beginning of the logical drive. The reserved
               sectors may include partition information.
8       byte   Number of FATs. This is usually two.
9       word   Number of root directory entries. The root is the only
               directory that has a structural limitation on the number
               of entries. This is usually 32 factored into some
               multiple of the logical sector size.
0Bh     word   First data sector. This is the first sector that
               contains file data. This is also defined as cluster two
               since cluster zero and one have special definitions.
               Conceivably, there could be a hidden area between the
               end of the root directory and the first data sector.
0Dh     word   Last cluster number. This is the number of clusters in
               the data area + 1. If this is less than 0FF6h then the
               FAT uses 12-bit entries; otherwise, it uses 16-bit
               entries.
0Fh     byte   FAT size. This is the size of one FAT in logical sectors.
10h     word   First root sector. This is the sector number of the
               first root directory entry.
12h     dword  Far pointer to the block device driver.
16h     byte   Media descriptor. This is the media descriptor byte as
               described in the MS-DOS Technical Reference Manual.
17h     byte   Media flag. If this is 0, the drive has been accessed.
               If it is -1 (or is set to -1), MS-DOS will build all its
               internal data structures concerning the drive when it is
               next accessed.
18h     dword  Far pointer to the next disk parameter block.
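To make the layout in Table 3 concrete, here is a minimal Python sketch that unpacks a 1Ch-byte disk parameter block and derives the values the table describes (sectors per cluster, FAT entry width, and the shift-based cluster-to-sector conversion). The field names are my own, and the sample bytes are made-up values resembling a 360K floppy; they are not read from a real system.

```python
import struct
from collections import namedtuple

# One field per row of Table 3; far pointers are split into offset and segment.
DPB = namedtuple("DPB", [
    "drive", "unit", "sector_size", "cluster_mask", "alloc_shift",
    "reserved_sectors", "num_fats", "root_entries", "first_data_sector",
    "last_cluster", "fat_size", "first_root_sector",
    "driver_off", "driver_seg", "media_descriptor", "media_flag",
    "next_off", "next_seg",
])
DPB_STRUCT = struct.Struct("<BBHBBHBHHHBHHHBBHH")  # little-endian, unpadded

def parse_dpb(raw):
    return DPB._make(DPB_STRUCT.unpack_from(raw))

def sectors_per_cluster(dpb):
    return dpb.cluster_mask + 1          # offset 4 holds the count minus one

def fat_bits(dpb):
    return 12 if dpb.last_cluster < 0x0FF6 else 16

def clusters_to_sectors(dpb, clusters):
    return clusters << dpb.alloc_shift   # shift instead of multiply

# Illustrative values only (hypothetical driver and link pointers).
sample = DPB_STRUCT.pack(0, 0, 512, 1, 1, 1, 2, 112, 12, 355, 2, 5,
                         0x0000, 0x0070, 0xFD, 0xFF, 0xFFFF, 0xFFFF)
dpb = parse_dpb(sample)
```

The unpacked size works out to 1Ch bytes, which agrees with the last field in the table starting at offset 18h.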
Open File Table List
DOS stores all of its information about open files in a linked list of tables. Each node in the linked list is a table of file entries that contain information about each open file. The first table always contains the five predefined I/O handles. The linked list of tables is not particularly useful, because the information that it contains can be gleaned from other, more standard system calls. The list is included here for the sake of completeness (though even that could be questioned since it is not completely defined). The structure for the Open File Table List is shown in Table 4, and the structure for an Open File Table entry is shown in Table 5.
Table 4: The structure for the Open File Table List
Offset  Size   Description
------------------------------------------------------------------------
0       dword  Far pointer to the next table in the list. If the offset
               of this pointer is 0FFFFh, then the table is the final
               entry and invalid.
4       word   Number of table entries. Each table entry is 53 bytes
               long. There will be at least one entry in each table
               except the terminal entry.
6              The open file table entries begin here.
Table 5: The structure for an Open File Table entry
Offset  Size      Description
------------------------------------------------------------------------
0       word      The number of file handles that refer to this table
                  entry.
2       byte[5]   Not defined.
6       dword     Far pointer to the device driver header if this is a
                  character device. If it is a block device, this will
                  be a far pointer to the Disk Parameter Block.
0Ah     byte[21]  Not defined.
20h     byte[11]  This is the filename as it would appear on the disk,
                  space filled and with no period to separate the name
                  from the extension.
2Bh     word      This is the PSP segment number of the file's owner.
2Dh     dword     Not defined.
31h     word      Not defined.
33h     word      Not defined.
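A walk over this list can be sketched as follows. The Python below runs against a simulated block of memory, not a live DOS system, and it reads only the three fields that Table 5 actually defines (handle count, filename, and owner PSP). The function name, the memory layout, and the sample file are assumptions made for the illustration.

```python
import struct

ENTRY_SIZE = 53   # bytes per open file entry, per Table 4

def walk_open_files(mem, seg, off):
    """Yield (handle_count, name, owner_psp) for every entry in the list."""
    while True:
        base = (seg << 4) + off
        next_off, next_seg, count = struct.unpack_from("<HHH", mem, base)
        if next_off == 0xFFFF:            # terminal table: final and invalid
            return
        for i in range(count):
            entry = base + 6 + i * ENTRY_SIZE
            (handles,) = struct.unpack_from("<H", mem, entry)
            name = bytes(mem[entry + 0x20:entry + 0x2B]).decode("ascii")
            (psp,) = struct.unpack_from("<H", mem, entry + 0x2B)
            yield handles, name, psp
        seg, off = next_seg, next_off

# Simulated memory: one table at 0010:0000 followed by the terminal table.
mem = bytearray(0x1000)
struct.pack_into("<HHH", mem, 0x100, 0x0000, 0x0020, 1)  # next -> 0020:0000
struct.pack_into("<H", mem, 0x106, 1)                    # one handle
mem[0x126:0x131] = b"README  TXT"                        # 11 bytes, no period
struct.pack_into("<H", mem, 0x131, 0x1234)               # hypothetical PSP
struct.pack_into("<HHH", mem, 0x200, 0xFFFF, 0xFFFF, 0)  # terminal table
open_files = list(walk_open_files(mem, 0x0010, 0x0000))
```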
Sector Buffer Structure
The number of sector buffers that DOS uses is predefined at boot time. Each sector buffer consists of a 16-byte header followed by an area equal to the size of the largest logical sector. The header contains information about the origin of the data in the buffer, and a link to the next sector buffer in line. The fact that this linked buffer list is not hardcoded, and that this system call's pointer is the primary reference to this area, creates some interesting possibilities (as you'll soon see). The structure for the header of the sector buffer is shown in Table 6.
Table 6: The structure for the header of the sector buffer
Offset  Size   Description
------------------------------------------------------------------------
0       dword  Far pointer to the next sector buffer. The buffers are
               filled in order of their appearance in this linked list.
               The last buffer has a 0FFFFFFFFh in this location and is
               a valid buffer.
4       byte   Drive number. This is the drive number that the data
               currently in this buffer was read/written from/to. If
               this buffer has not been used, this byte will be set to
               0FFh.
5       byte   Data type flags. The bit fields show which area of the
               drive the data is associated with. The bits are defined
               as follows:

               Bit  Description
               ------------------------------------
               1    File Allocation Table data
               2    Directory or subdirectory data
               3    File data

6       word   Logical sector number. This is the logical sector number
               that this structure has buffered.
8       word   Access number.
0Ah     dword  Far pointer to the disk parameter block.
0Eh     word   Unused.
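The header layout in Table 6 can be exercised with a short Python model. As with the other sketches, this runs over simulated memory. It assumes the bit numbers in the table count from the least significant bit (bit 0), which the table does not state, and the names are illustrative.

```python
import struct

HEADER = struct.Struct("<HHBBHHHHH")   # 16 bytes, per Table 6
FLAG_NAMES = {1: "FAT", 2: "directory", 3: "file data"}   # bit number -> area

def walk_buffers(mem, seg, off):
    """Yield a summary of every sector buffer, including the last one."""
    while True:
        base = (seg << 4) + off
        (next_off, next_seg, drive, flags, sector,
         access, dpb_off, dpb_seg, _unused) = HEADER.unpack_from(mem, base)
        kinds = [name for bit, name in FLAG_NAMES.items() if flags & (1 << bit)]
        yield {"at": (seg, off), "drive": drive, "sector": sector, "kinds": kinds}
        if (next_seg, next_off) == (0xFFFF, 0xFFFF):
            return                     # the last buffer is valid, so stop after it
        seg, off = next_seg, next_off

# Two made-up buffers: one holding a FAT sector from drive C, one never used.
mem = bytearray(0x1000)
HEADER.pack_into(mem, 0x100, 0x0000, 0x0040, 2, 0x02, 1, 0, 0, 0, 0)
HEADER.pack_into(mem, 0x400, 0xFFFF, 0xFFFF, 0xFF, 0, 0, 0, 0, 0, 0)
buffers = list(walk_buffers(mem, 0x0010, 0x0000))
```

Note the difference from the Open File Table List: here the node carrying the terminator is itself a usable buffer, so it is reported before the walk ends.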
Drive Path Table
The Drive Path Table contains the default path, drive head location, and various flags and pointers. As with the Open File Table List, all of the information in the Drive Path Table is also available through other documented system calls. The number of table entries is equal to the number of valid logical drives plus one. The last entry contains a zero in the flags variable and does not have any useful data. The structure for a Drive Path Table entry is shown in Table 7.
Table 7: The structure for a Drive Path Table entry
Offset  Size      Description
------------------------------------------------------------------------
0       byte[64]  Contains the current default pathname in ASCIIZ
                  format. It contains the drive letter, colon delimiter
                  and the initial '\'. For example, the string for the
                  second table entry with default path of 'my_path'
                  would be B:\MY_PATH.
40h     dword     Reserved, set to 0.
44h     byte      Flags variable. All valid entries contain a 40h; the
                  last entry contains 0.
45h     dword     Far pointer to the Drive Parameter Block.
49h     word      Current block or track/sector number for this drive.
4Bh     dword     Far pointer. Unknown purpose.
4Fh     word      Unknown storage.
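Reading the default paths out of this table might look like the Python sketch below. It assumes the entries are packed back to back at the 51h-byte size implied by the offsets in Table 7, which the article does not state outright, and it stops at the entry whose flags byte is 0. The table contents are invented for the example.

```python
import struct

ENTRY = struct.Struct("<64sIBHHHHHH")   # 51h bytes, per the offsets in Table 7

def default_paths(mem, base=0):
    """Return the ASCIIZ default path of each valid entry, in drive order."""
    paths = []
    while True:
        (path, _reserved, flags, _dpb_off, _dpb_seg,
         _position, _ptr_off, _ptr_seg, _unknown) = ENTRY.unpack_from(mem, base)
        if flags == 0:                   # last entry carries no useful data
            return paths
        paths.append(path.split(b"\0", 1)[0].decode("ascii"))
        base += ENTRY.size               # assumed stride between entries

# Two valid entries and the terminating entry, all with made-up contents.
mem = bytearray(3 * ENTRY.size)
ENTRY.pack_into(mem, 0, b"A:\\", 0, 0x40, 0, 0, 0, 0, 0, 0)
ENTRY.pack_into(mem, ENTRY.size, b"B:\\MY_PATH", 0, 0x40, 0, 0, 0, 0, 0, 0)
paths = default_paths(mem)
```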
As you can see, DOS function 52h holds a considerable amount of information. Virtually everything that the programmer needs to know about the disk is accessible through this call.
Creating Expanded Partition Sizes
Since the advent of the inexpensive hard disk, the DOS 32-Mbyte disk partition limitation has been a real problem. Until the joint Microsoft/Compaq "BIGFOOT" development on Version 3.31 and the later 4.00 release, Microsoft had not offered any remedies to this deficiency. To fill the marketing requirement, several companies have released hard disk partitioning software packages that allow DOS to access partitions up to 1 gigabyte.
In theory, MS-DOS, Versions 2.xx and 3.xx, can access disk partitions up to 4 gigabytes simply by increasing the sector size from 512 bytes to some larger power of two. In reality, a block device driver written to test this theory simply dies. The artificial limitation stems from the buffer initialization at boot time, when somewhere in the recesses of DOS, an assumption is made that the basic sector size is going to be 512 bytes. Because all of the buffers are statically allocated according to that assumption, no sector size larger than 512 bytes is allowed. Obviously, since there are programs that allow larger partitions, there must be a way to circumvent the assumption. In fact, there are a couple of ways.
One of the easier routines that you can build allows a DOS block device driver to access up to 2 gigabytes per partition and uses two of the items provided by function 52h. The first item is the maximum sector size variable -- the routine simply needs to change that variable to the maximum sector size supported by the driver. The second item provided by function 52h is the linked list of sector buffers. If a sector is read into a buffer that is too small to hold that sector, catastrophic results and bit death usually occur. To avoid the overwriting of useful data, the sector buffer linked list must be redone.
Expanding the Partition Horizon
I have written a small routine in 8086 assembly language that can be included in any MS-DOS block device driver that supports logical sector sizes above 512 bytes (see Listing One). These days DOS block device drivers are commonplace, so a thorough explanation of how this type of device driver works is unnecessary. But a few relevant points should be reiterated before jumping right into the routine.
Normal block device drivers have a basic data block size that is some power of two. The basic block size, usually one 512-byte sector, is defined in the BIOS Parameter Block provided by the driver as part of the initialization process. Because the BPB storage variable for the number of blocks in the partition is 16 bits, a partition cannot be larger than 65,535 logical blocks. This limitation means that in order to increase the storage capacity beyond 32 Mbytes, the block size must be increased beyond 512 bytes.
Most controllers do not support sector sizes other than 512 bytes. Rather than expecting the hardware to supply larger logical sectors, the easiest and most generally applicable approach is simply to read more 512-byte sectors. For example, to get a block size of 1024 (partition size of 64 Mbytes), read two sequential 512-byte sectors. To get a block size of 32,768 (partition size of two gigabytes), read 64 sequential 512-byte sectors. A well designed block device driver should easily accommodate this method.
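The arithmetic behind those figures is worth spelling out. The short Python sketch below is a calculation only, not driver code. It takes the 16-bit block count and the 512-byte physical sector from the text and works out the partition ceiling and the number of physical sectors per logical block; the names are illustrative.

```python
MAX_BLOCKS = 0xFFFF        # largest value of the 16-bit block count in the BPB
PHYSICAL_SECTOR = 512      # bytes per sector on most controllers

def partition_limit(block_size):
    """Largest partition, in bytes, for a given logical block size."""
    return MAX_BLOCKS * block_size

def sectors_per_block(block_size):
    """How many 512-byte sectors the driver reads for one logical block."""
    return block_size // PHYSICAL_SECTOR

limits = {size: (partition_limit(size), sectors_per_block(size))
          for size in (512, 1024, 32768)}
```

With 512-byte blocks the ceiling is 33,553,920 bytes, one block short of 32 Mbytes. With 32,768-byte blocks it is 2,147,450,880 bytes, just under two gigabytes, at a cost of 64 physical sectors per logical block.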
After the block device driver initializes all the devices and their partition structures, the BPB table should be complete. The largest block size can be found by scanning the valid BPB array entries. This maximum block size is required by the MODIFY_DOS routine in a word-size variable called BIGGEST_SECTOR.
After getting the vector from DOS function 52h, a comparison is made between the driver's largest block size and the largest block size supported by DOS at the moment. If DOS already supports a block size equal to or larger than the driver's block size, no changes are needed.
If the driver's block size is larger, the routine changes two parameters. First, it replaces DOS's largest block size variable with the driver's larger value. Second, the routine works its way down DOS's block buffer list and "revectors" each buffer to a new storage location. Care should be taken that a buffer is not allocated twice.
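The steps in the last two paragraphs can be modeled in Python over a simulated memory image. This is a sketch of the idea under stated assumptions, not a translation of Listing One: the function signature, the addresses, and the rule that a buffer sitting above the driver's paragraph has already been relocated are simplifications of what the assembly routine does with CS and its end-of-driver pointer.

```python
import struct

def modify_dos(mem, size_addr, driver_para, free_para, biggest_sector):
    """Raise the maximum sector size and relink the buffers; return new end paragraph."""
    (current,) = struct.unpack_from("<H", mem, size_addr)
    if current >= biggest_sector:
        return free_para                     # DOS already copes; change nothing
    struct.pack_into("<H", mem, size_addr, biggest_sector)
    paras = (biggest_sector + 16) >> 4       # sector plus 16-byte header
    cell = size_addr + 2                     # buffer pointer follows the size word
    while True:
        off, seg = struct.unpack_from("<HH", mem, cell)
        if off == 0xFFFF:                    # end of the list
            return free_para
        old = (seg << 4) + off
        mem[old + 4] = 0xFF                  # mark the buffer as unused
        if seg + (off >> 4) > driver_para:   # already relocated: leave it alone
            cell = old
            continue
        new = free_para << 4
        mem[new:new + 4] = mem[old:old + 4]  # new buffer inherits the old link
        mem[new + 4] = 0xFF
        struct.pack_into("<HH", mem, cell, 0, free_para)  # previous -> new
        free_para += paras
        cell = new                           # continue from the new buffer

# Made-up image: size word at 0500h, two 512-byte buffers at 0060:0 and 0090:0.
mem = bytearray(0x20000)
struct.pack_into("<H", mem, 0x500, 512)
struct.pack_into("<HH", mem, 0x502, 0x0000, 0x0060)
struct.pack_into("<HH", mem, 0x600, 0x0000, 0x0090)
struct.pack_into("<HH", mem, 0x900, 0xFFFF, 0xFFFF)
new_end = modify_dos(mem, 0x500, driver_para=0x1000,
                     free_para=0x1100, biggest_sector=2048)
```

Because the walk continues from each newly linked buffer, a second call with the same arguments finds the size already updated and returns without touching the list, which is one way to avoid allocating a buffer twice.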
Tricks of the Trade
I should mention that the routine uses some of the features in Microsoft's MASM, Version 5.10. I make use of the USES directive in the procedure declaration. This directive PUSHes any registers listed after the word USES and POPs them at the end of the routine. USES is a handy directive for the mental defective (a group which includes myself) who are forever PUSHing and POPping in the wrong order or the wrong registers.
I also make use of the local label (@@) for short jumps. The syntax for a forward jump to a @@ label is @F. The syntax for a backward jump is @B. I generally use these labels for tight loops and comparison skips to differentiate them from labels referenced farther away in the program. This type of label should be used either only inside macros or only outside them; mixing the two invites subtle bugs. As an example, say a macro called DELAY has been coded as shown below:
DELAY   macro
        jmp     short @F
@@:
        endm
This macro was used in the following code segment:
(0)           cmp     bl, 8
(1)           jb      @F
(2)           out     0a1h, al
(3)           delay
(4)           sub     bl, 8
(5)   @@:     mov     INT_NUMBER, bl
In this case, the comparison skip goes to line 4 rather than line 5 as you might expect.
Final Note
Several other undocumented MS-DOS system calls exist. Many of them provide pointers to the data structures shown in this article. All of them need only a debugger and a little patience to discover. In finding an application for the information uncovered, the reader is only constrained by imagination and the bouncer at the door.
Copyright © 1989, Dr. Dobb's Journal

[Listing One]
; *********************************************************************
; DATA STRUCTURES
; *********************************************************************
CALL52_2XX      struc
dpb_ptr         dd      ?       ; Far ptr to first disk parameter block
open_file_ptr   dd      ?       ; Far ptr to Open File table
clock_ptr       dd      ?       ; Far ptr to CLOCK$ device driver
con_ptr         dd      ?       ; Far ptr to CON: device driver
num_drives_2xx  db      ?       ; Number of logical drives
sector_sz_2xx   dw      ?       ; Maximum sector size
buffer_ptr_2xx  dd      ?       ; Far pointer to first sector buffer
NUL_driver_2xx  db      ?       ; Beginning of the NUL device driver
CALL52_2XX      ends

CALL52_3XX      struc
                dd      4 dup(?) ; These are defined in previous structure
sector_sz_3xx   dw      ?       ; Maximum sector size
buffer_ptr_3xx  dd      ?       ; Far pointer to first sector buffer
path_ptr_3xx    dd      ?       ; Far pointer to drive path table
FCB_ptr_3xx     dd      ?       ; Far pointer to FCB table
FCB_sz_3xx      dw      ?       ; Size of the FCB table
num_drives_3xx  db      ?       ; Number of logical drives
lastdrive       db      ?       ; LASTDRIVE number
NUL_driver_3xx  db      ?       ; Beginning of the NUL device driver
CALL52_3XX      ends
; *********************************************************************
; MODIFY_DOS
;
; Changes MS-DOS's internal maximum sector size and buffer linked list
; to allow MS-DOS to access partitions above 32-megabytes
;
; Given:
; BIGGEST_SECTOR set to largest logical sector of all partitions
; DOS_VERSION = major version of MS-DOS
; OLD_FILE_END -> initial end of driver
;
; Returns:
; TRASH_POINTER set to new end of program
; MS-DOS's internal maximum sector setting changed
; MS-DOS's internal linked buffer list relocated
; *********************************************************************
modify_dos      proc    uses ax ds di si
        local   sect_para_size:word
        local   buffer_paragraph:word

        mov     ax, offset old_file_end ; Get pointer to old end of file
        add     ax, 15                  ; Change it to a segment number so we
        mov     cl, 4                   ; only have to deal with a 16-bit number
        shr     ax, cl
        mov     bx, cs
        add     ax, bx                  ; AX -> paragraph start
        mov     buffer_paragraph, ax    ; Save it for later
        mov     ax, biggest_sector      ; AX = size of buffer we may need to
                                        ; allocate
        add     ax, 10h                 ; The buffer header too
        shr     ax, cl                  ; Change bytes/sector to paras/sector
        mov     sect_para_size, ax      ; Save for later
        mov     ah, 52h                 ; Our undocumented system call
        int     21h                     ; Set ES:BX -> DOS list of lists
        mov     di, buffer_paragraph    ; DI -> beg. of initialization trash
        mov     si, sector_sz_2xx       ; SI = offset to version 2.xx sector size
        cmp     dos_version, 2          ; See if version 2.xx
        je      @F                      ; Skip if it is
        mov     si, sector_sz_3xx       ; SI = offset to version 3.xx sector size
@@:     mov     cx, es:[bx+si]          ; CX = the largest block size supported
        cmp     cx, biggest_sector      ; See if we need to change anything
        jnc     done                    ; Jump if we don't need to
        mov     cx, biggest_sector      ; Update DOS with the new largest
        add     bx, si                  ; BX -> sector size variable
        mov     es:[bx], cx             ; Update maximum sector size w/our sector
        add     bx, 2                   ; Buffer pointer is right after for
                                        ; both versions
        mov     dx, cs
@@:     mov     si, bx                  ; DS:SI = ES:BX
        mov     ax, es
        mov     ds, ax
        les     bx, dword ptr es:[bx]   ; ES:BX -> next buffer in list
        cmp     bx, -1                  ; Are we at the end
        je      done                    ; Quit if we are
        mov     byte ptr es:[bx+4], -1  ; Set the nasty buffer flag
        mov     ax, bx                  ; Calculate absolute memory placement
        mov     cl, 4
        shr     ax, cl
        mov     cx, es
        add     ax, cx
        cmp     ax, dx                  ; See if we are above or below the line
        ja      @B                      ; Loop again if already done this buffer
        les     bx, dword ptr es:[bx]
        xchg    si, bx                  ; xchg es:bx, ds:si
        push    es
        mov     ax, ds
        mov     es, ax
        pop     ds
        mov     word ptr es:[bx], 0     ; Previous buffer -> this one
        mov     es:[bx+2], di
        mov     es, di                  ; ES:0 -> the new buffer
        xor     bx, bx
        mov     es:[bx], si             ; This one -> what previous did
        mov     es:[bx+2], ds
        mov     byte ptr es:[bx+4], -1  ; Tell DOS this hasn't been used yet
        add     di, sect_para_size      ; DI -> next buffer start
        jmp     @B                      ; ES:BX -> the new buffer, so the
                                        ; buffer that follows it is examined
                                        ; (and the end of list is detected)
                                        ; at the top of the loop
done:   mov     cs:trash_pointer, di    ; New end of program
        ret
modify_dos      endp