
Porting Unix to the 386: The Initial Root Filesystem


MAY91: PORTING UNIX TO THE 386: THE INITIAL ROOT FILESYSTEM

Bill was the principal developer of 2.8BSD and 2.9BSD and was the chief architect of National Semiconductor's GENIX project. Lynne established TeleMuse, a market research firm specializing in the telecommunications and electronics industry. They can be contacted via e-mail at [email protected]. (c) 1991 TeleMuse.


In previous installments of this series, we've tantalized you with the preliminaries to our porting project. We've discussed our initial plan of the port, a bootstrap of the system off MS-DOS, the standalone utilities that help us test out the basic protected mode mechanisms of the 386, and cross-tools for generating the BSD utility programs that will run off our BSD operating system kernel. In our analogy of climbing K2, most of our equipment has checked out in use, and the route along the ridge to the peak looks clear, with a chance of good weather. In fact, at the end of this installment, we will finally complete the preliminaries, and leave our base camp to start the major ascent.

We now examine the initial root filesystem required for our 386BSD operating system kernel. Earlier in this series, we discussed the cross-tools used to create 386BSD utilities, but we did not mention how we got these utilities onto our target machine. We could load them as files onto MS-DOS -- unfortunately, 386BSD has no ability (initially) to decipher the organization of files on an MS-DOS disk. (Some programmers who have spent time with the FAT, clusters, and their ilk might consider this to be more of a blessing than a curse.) Again, keep in mind that the primary operating system focus in this port is UNIX and not MS-DOS, and that we are working on a research project, not a commercial release.

We now embark on making a usable filesystem in order to hold the programs and files used by our newly ported system. The filesystem is a special data structure, together with the functions that operate on it, that describes the storage of files on some means of bulk storage. It literally is a subsystem for reading, writing, creating, and destroying programs and data files on a medium. Some programs and data files will need to be used by our operating system kernel immediately when it begins to run; the rest will be made accessible as the system is configured for use by the configuration programs, which will be run only after the system is completely underway. The first group of files will contain the programs that allow us to add (or mount) new filesystems, creating hierarchical or "tree-based" filesystems. Because trees grow from their roots, this filesystem will be known as the "root," or bottom-most of the filesystems.

The kernel is the "heart" of UNIX, running programs inside of processes that it creates for that purpose, and satisfying program requests (system calls) as needed. Later, when we describe the formulation of the operating system kernel program (hereafter called the "kernel") and its initialization, we will use this initial root filesystem. Thus, in starting our major ascent, we will begin the actual job of porting the kernel program.

The Role of the Root Filesystem

The utility programs on the root establish a primary environment to craft an arrangement of filesystems, introduce special systems functionality via the server processes (daemons), and configure devices for operation. In addition, the root filesystem possesses the utility tools used to fix, or reload if necessary, other filesystems. These tools are often used to fix the root itself if it is not badly damaged. By virtue of its small size and lack of actively modified files, the root usually survives intact when a system crash occurs. This is all the better for us because we need it to run the system and summarily fix all ills. Should it get destroyed, however, we must completely reload it by some means; that's why some systems have "back up" root filesystems--just in case this actually happens. (With 386BSD, we eventually allow for root filesystem recovery and installation of the first root filesystem by means of a floppy root filesystem, which contains the tools to load the entire system over the network, via a serial port, or from a floppy or cartridge tape dump.)

The root filesystem is a small but essential portion of disk storage. It provides enough functionality for the system to expand its resources to use storage other than the root itself, and configure operations based on arrangements mandated by current conditions. The root filesystem is also the starting point for all filename translations and path searches. As a result, a smaller root with fewer files to search through will generally improve file operations performance.

A Brief Review of the Root

We will now briefly review the organization and location of various files and their responsibilities in the UNIX tree-structured filesystem. This is in no way intended to replace the more authoritative descriptions of the UNIX file tree (see The Design and Implementation of the 4.3BSD UNIX Operating System, by Leffler, et al., Addison-Wesley, 1989 for more information on this topic) but will outline what needs to be present in the root for minimal operation with our 386BSD system.

In Example 1, the root directory, we can see the base of the root filesystem containing all of the top-level directories and files in our 386BSD system. This listing, generated by the UNIX ls command (ls -l), shows file attributes, link count, ownership, file size, modification date, and filename. Three kinds of files are present here (as indicated by the first character of attributes): directories (d), symbolic links (l), and regular files (-). Files in the root serve the functions of installation, booting, system initialization, device configuration, basic utilities, system operations, and so on.

Example 1: The root directory as listed by the UNIX ls command (ls -l), showing file attributes, link count, ownership, file size, modification date, and filename.

  drwxr-xr-x    2 root      1536 Feb 3 10:18 bin/
  -rwxr-xr-x    1 root     20480 Sep 4 21:02 boot*
  drwxr-xr-x    2 root      1024 Feb 22 13:32 dev/
  drwxr-xr-x    2 root      1536 Mar 5 18:31 etc/
  drwxr-xr-x    2 root       512 Dec 7 12:41 lib/
  drwxr-xr-x    2 root      4096 Dec 7 12:41 lost+found/
  drwxr-xr-x    2 root       512 Aug 16 1990 mnt/
  drwxr-xr-x    2 root       512 Dec 6 12:11 root/
  drwxr-xr-x    2 root      1024 Dec 8 12:45 sbin/
  drwxr-xr-x    2 root       512 Sep 19 09:18 stand/
  lrwxr-xr-x    1 root        12 Jun 4 1990 sys@ --> /usr/src/sys
  drwxrwxrwx    2 root       512 Mar 5 18:31 tmp/
  drwxr-xr-x    2 root       512 Jan 26 22:12 usr/
  drwxr-xr-x    2 root       512 Jan 27 23:12 var/
  -rwxr-xr-x    1 root    319488 Feb 22 08:57 vmunix*
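
As a minimal sketch of how the first character of the attributes field distinguishes the three kinds of files, the following Python fragment classifies three lines copied from Example 1. The names classify and KINDS are our own illustrative choices and are not part of any 386BSD tool; the parsing assumes the exact column layout shown in the example.

```python
# Map the first character of the ls -l attribute field to a file kind.
# Only the three kinds that appear in Example 1 are covered.
KINDS = {"d": "directory", "l": "symbolic link", "-": "regular file"}


def classify(line):
    """Return (name, kind) for one line of the Example 1 listing."""
    fields = line.split()
    mode = fields[0]
    # For a symbolic link the name precedes the arrow; otherwise it is last.
    if "-->" in fields:
        name = fields[fields.index("-->") - 1]
    else:
        name = fields[-1]
    return name, KINDS.get(mode[0], "other")


listing = [
    "drwxr-xr-x    2 root      1536 Feb 3 10:18 bin/",
    "-rwxr-xr-x    1 root     20480 Sep 4 21:02 boot*",
    "lrwxr-xr-x    1 root        12 Jun 4 1990 sys@ --> /usr/src/sys",
]

for entry in listing:
    print(classify(entry))
```

The trailing /, *, and @ characters are markers added by the listing itself (directory, executable, and symbolic link, respectively) and are left on the names here.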

Installation: /stand

The /stand directory in our BSD root filesystem contains standalone programs to be loaded using the standalone /boot program and run directly on the machine, sans the presence (and possible interference) of the operating system. This permits us to run programs to test, format, or diagnose device behavior. Other programs (/stand/cat, /stand/ls, /stand/icheck) allow us to diagnose problems with the root filesystem, independent of the operating system. In addition, standalone disk bootstrap programs (/stand/bootwd, /stand/bootfd) reside here, to be installed by other programs (/sbin/disklabel) onto the disk media.

Booting: /boot and /vmunix

Two other standalone programs are worthy of mention here: /boot, the universal bootstrap used to load the system off any media, and /vmunix, the operating system kernel proper. According to our early porting plan, we actually use these files last, because we load our system off of MS-DOS instead of from the BSD root filesystem. For those unfamiliar with UNIX, the only use of the operating system's executable file after bootup is to provide information on symbolic references inside of the kernel, which are of use to a very few nonessential programs. In other words, although our system would continue to run if this file were overwritten or deleted, it probably would not boot again afterward.

Initialization: /sbin/init, /dev/console, and /bin/sh

When the system starts operation, it first executes the program in the file /sbin/init, which initializes the system and prepares it for operation. The system, as it starts, is completely mute otherwise. In the minimal case, the system is started "single-user" -- init manages to configure the system to execute commands from the console device (on a PC, the keyboard and display). This resembles the command mode that MS-DOS systems provide on a standard boot-to-command interpreter. In this case, init opens the console (/dev/console) and executes the command interpreter or shell (/bin/sh). Thus, the minimum files we need (in addition to the booting files mentioned earlier) are /sbin/init, /dev/console, and /bin/sh. If any of these are lacking or damaged, UNIX cannot run, and we will not get a prompt from the command interpreter. Of course, in order to do something useful, we'll also need the files that correspond to the commands we intend to run from the interpreter. This is the minimum required to get our kernel running.
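
The minimum set of files just described lends itself to a simple check on the cross-host before a filesystem image is built. The following is only a sketch under our own assumptions: the function missing_startup_files and the idea of a "staging" directory tree are hypothetical and are not part of the tools used in the port.

```python
import os
import tempfile

# The three files named above as the minimum for single-user startup.
REQUIRED = ("sbin/init", "dev/console", "bin/sh")


def missing_startup_files(root):
    """Return the required paths that are absent under the tree at root."""
    return [p for p in REQUIRED
            if not os.path.lexists(os.path.join(root, p))]


# Demonstration on a throwaway staging tree that contains only sbin/init.
with tempfile.TemporaryDirectory() as staging:
    os.makedirs(os.path.join(staging, "sbin"))
    open(os.path.join(staging, "sbin", "init"), "w").close()
    missing = missing_startup_files(staging)
    print(missing)
```

The check only confirms that the names exist; it says nothing about whether init or the shell is a valid executable for the target.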

Although many PCs frequently run UNIX with a sole user, /sbin/init can also prepare the system for multiuser or multitasking operation. In this case, init runs the command interpreter on a file of commands (/etc/rc) commonly referred to as a "shell script." This in turn calls other shell scripts for network, device, and server process invocation.

Server processes, which provide a variety of services on UNIX systems, are often referred to as "daemons," as they attempt to do work invisibly. This is a play on Maxwell's daemon, who would merrily put hot molecules in one box and cool molecules in another box, thus (?) violating the Second Law of Thermodynamics. (The proof fails when the daemon acquires so much energy in rapid collisions from highly vibrating molecules that it must radiate the energy as heat, thus perturbing the system. See Feynman's Lectures on Physics, Volume 1, for more information. You just can't get something for nothing, can you?)

Among other things, the system may now perform housekeeping functions: fixing any broken filesystems it can, erasing temporary files and other garbage, adding filesystems (both on the computer and over the network), and connecting the system into the world.

Traditionally, all these commands provide little output as they are launched -- which can occasionally confuse more than reassure. One popular computer author, unfamiliar with UNIX, complained of feeling quite uncomfortable when a UNIX workstation flashed him the message, "starting standard daemons." Perhaps he thought he needed the help of a system exorcist!

In multiuser operation, the system depends on all the functionality that has been configured, including indications of service availability through the appearance of a login prompt. We find it amusing when our 386 PC laptop prompts us for a login account name and password, as if we are competing with hundreds of users for access to the machine! On the other hand, our little 386 laptop, running 386BSD, has about as much disk space and memory and is three times the speed of the PDP-11/70 that the University of California used to run 50 to 70 students at a clip. UNIX regards little PCs and systems with hundreds of terminals in the same way -- a login prompt per customer.

Configuration: /dev and /etc

Hardware devices on UNIX are accessed through special filenames in /dev. For our filesystem to work correctly, we must have the appropriate device files already made. Otherwise, the utility programs will not be able to access the devices, even if the operating system has drivers that work with the underlying hardware. These files are special because they are made with a special program (/sbin/mknod) which creates an association between the file and a software driver in the kernel. A shorthand script program (/dev/MAKEDEV) provides a way to make these files symbolically. With 386BSD, we must make the console (/dev/console) and the root filesystem's device (/dev/wd0a) before we run; it's wise to make other special files at this time, too.
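
The association between a special file and a kernel driver is recorded as a file type plus a pair of device numbers. As a rough illustration (run on any present-day UNIX-like host, not on 386BSD itself), the sketch below decodes those fields with Python's os and stat modules. The function name describe_device is ours, and the device numbers in the demonstration are made up; real values depend on the kernel's driver table.

```python
import os
import stat


def describe_device(st_mode, st_rdev):
    """Return (kind, major, minor) for a device special file, else None."""
    if stat.S_ISCHR(st_mode):
        kind = "character"
    elif stat.S_ISBLK(st_mode):
        kind = "block"
    else:
        return None  # an ordinary file, directory, or other non-device
    return kind, os.major(st_rdev), os.minor(st_rdev)


# Hypothetical mode and device numbers, chosen only for illustration.
example = describe_device(stat.S_IFCHR | 0o600, os.makedev(0, 0))
print(example)
```

On a live system the same function can be fed from a real file, for example st = os.stat("/dev/console") followed by describe_device(st.st_mode, st.st_rdev).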

Besides configuring device filenames, we need to specify device configuration for disk drives (/etc/fstab), terminal lines (/etc/ttys), and printers (/etc/printcap) to describe device characteristics. One criticism of all UNIX systems has been the need to wade through a plethora of ad hoc configuration files for device and program use. Most of the system configuration files in this project, however, can be found within the /etc directory.

Utilities: /bin and /sbin

The basic utilities needed for operation of UNIX are found in the two directories: /bin and /sbin. /sbin contains supervisory commands not generally useful to ordinary users, but important for system operation and system management. /bin contains basic commands useful to all UNIX users -- kind of a core group. Both of these directories are kept short and small to minimize the size of the root and the time it takes to search for a command. All other commands (hundreds, usually) are found in the additional filesystems that become active when UNIX is brought up multiuser. To this end, it is important to note that /sbin/mount and /sbin/umount are used to mount and unmount those additional filesystems.

Operation: /tmp and /var

Once in operation, the /tmp directory is used to store temporary files from editors, formatters, compilers, and assemblers, as needed. /var is a directory that holds various short-term data, such as usage accounting, security logs, incoming electronic mail, crash dumps, printer spooling, and runtime program databases. Frequently, these two directories are separately mounted filesystems, especially on systems where these kinds of files take up much space.

Other Directories: /lib, /mnt, /usr, /root, and /sys

Finally, we have a group of files that don't fit any of the above categories. /lib contains object libraries and runtime start-off routines to allow C and other languages to run on the system. /usr is an empty directory used as a mount point to attach a much larger filesystem to -- one that contains everything else not on the root in the way of utilities, object libraries, include files, documentation, and system source. /mnt, also an empty directory, is used as a mount point for optional filesystems to be attached to when needed. /root contains the home directory for the superuser account (root), keeping it separate from the actual root directory of the system. /sys is our sole example here of a symbolic link -- a file type that provides a shortcut within the filesystem to another location in the filesystem tree. In this case, /sys hides a reference to /usr/src/sys, so when the filesystem associated with /usr is mounted, a reference to a file like "/sys/i386/i386/locore.s" is satisfied with a reference to the file "/usr/src/sys/i386/i386/locore.s".
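
The behavior of the /sys link before and after /usr is mounted can be imitated in a scratch directory. This is a sketch only: a temporary directory stands in for the root, and creating the target directory stands in for mounting the /usr filesystem.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "usr", "src", "sys")
    link = os.path.join(root, "sys")

    os.makedirs(os.path.join(root, "usr"))  # the empty mount point
    os.symlink(target, link)                # a link may precede its target

    before = os.path.exists(link)           # False: the link dangles
    os.makedirs(target)                     # stand-in for mounting /usr
    after = os.path.exists(link)            # True: the link now resolves
    stored = os.readlink(link)              # the path text kept in the link

    print(before, after)
```

The link itself holds nothing but the path text, which is why its size in Example 1 is 12 bytes, the length of "/usr/src/sys".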

Filesystem Creation

Normally, we would use our ported system to create root filesystems, but we again run into a "chicken-and-egg" problem, because we need a finished system to create the archetypal root filesystem from which we make all others. So, in the typical "break the egg" and "cook the chicken" way we resolve all minor paradoxes, we make the first filesystem on our cross-host by special means. We either find a cross-host with identical key data structure characteristics (byte order, structure field alignment, and structure packing) or write a transformation program to turn our cross-host's filesystem format (via stretching, swapping, and shrinking) into a 386BSD-compatible form. The result is a file of bytes that contains an image of what the filesystem should contain on the PC's disk drive.
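
To give a flavor of the "swapping" part of such a transformation, here is a minimal sketch that rewrites one record from big-endian to the little-endian byte order of the 386. The three-field record layout is hypothetical and does not correspond to any real filesystem structure; it is not the transformation program used in the port.

```python
import struct

# Hypothetical on-disk record: three unsigned 32-bit fields.
FIELDS = "III"


def to_little_endian(record):
    """Re-encode a big-endian record in little-endian byte order."""
    values = struct.unpack(">" + FIELDS, record)
    return struct.pack("<" + FIELDS, *values)


big = struct.pack(">" + FIELDS, 1, 512, 0x12345678)
little = to_little_endian(big)
print(big.hex(), little.hex())
```

A real converter must know the layout of every structure, because only the numeric fields may be swapped. File data and names embedded in directories have to pass through untouched, and alignment or packing differences would require fields to be moved as well.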

If we were starting this project now, we might consider a novel alternative method using the BSD NFS (Network File System) code. We would then run our 386BSD kernel in a "diskless" fashion, passing all file operations over the network to be satisfied by an NFS server host. We could use any NFS server to provide access to our initial root filesystem. Oddly enough, this would hide not only the cross-host's filesystem format, but the cross-host's operating system as well. Conceivably, one could even use a non-UNIX cross-host. All of this is made possible by NFS's file abstraction mechanism, which converts filesystem data to a common external representation via its internal XDR (eXternal Data Representation) library.

Filesystem Downloading

With our filesystem image in a file, we can download it using either Kermit, NCSA Telnet, or some other file transfer utility that can copy a binary image from our cross-host to the PC under MS-DOS. In the early stages, before the kernel successfully ran processes, small filesystems of a few hundred Kbytes (principally /dev/console, /sbin/init, /bin/sh, and /bin/ls) could be downloaded as needed over the serial ports using Kermit. As success with the kernel increased, so did the size of the root filesystem, because the focus of the project moved from minimal operation to proving the kernel by means of increasingly larger utilities. This affected us in three ways: serial link downloading took too long; our MS-DOS partition limited the size of the filesystem that we could copy across; and even a single byte change in a single file required a complete filesystem download to effect the modification.
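
A back-of-the-envelope calculation shows why serial downloading stops being practical as the image grows. The figures below are assumptions made for illustration only: the article does not state the line speed or exact image sizes, and protocol overhead from Kermit is ignored.

```python
def transfer_seconds(size_bytes, bits_per_second, bits_per_byte=10):
    """Best-case serial transfer time.

    bits_per_byte=10 assumes 8 data bits plus one start and one stop bit.
    """
    return size_bytes * bits_per_byte / bits_per_second


# Assumed figures: a 300-Kbyte image and a 5-Mbyte image at 9600 bits/s.
small = transfer_seconds(300 * 1024, 9600)
large = transfer_seconds(5 * 1024 * 1024, 9600)
print(round(small), "seconds;", round(large / 60), "minutes")
```

Under these assumptions the small image needs a little over five minutes and the larger one roughly an hour and a half, and that cost is paid again for every change to any file in the image.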

Having downloaded the filesystem, we used the copyfs program (see "Initial Utilities: Three PC Utilities" in DDJ, February 1991) to install it in a partition on the hard disk, separate from MS-DOS. The BSD kernel disk driver was also modified to relocate what it considered the beginning of the disk to this point so we could share the disk with two systems. Copyfs would place the image of the filesystem onto the absolute disk storage blocks without any translation, making the image real.
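
The essence of this installation step is a byte-for-byte copy to a fixed offset. The sketch below is not the copyfs program from the earlier article, which ran under MS-DOS against absolute disk blocks; it only imitates the idea using an ordinary file as a stand-in for the disk. The 512-byte block size, the starting block, and the name install_image are all assumptions.

```python
import os
import tempfile

BLOCK = 512  # assumed disk block size


def install_image(disk_path, image_path, start_block):
    """Copy the image, untranslated, into the disk file at start_block."""
    with open(image_path, "rb") as src, open(disk_path, "r+b") as dst:
        dst.seek(start_block * BLOCK)
        while True:
            chunk = src.read(64 * BLOCK)
            if not chunk:
                break
            dst.write(chunk)


with tempfile.TemporaryDirectory() as work:
    disk = os.path.join(work, "disk.img")
    image = os.path.join(work, "rootfs.img")
    with open(disk, "wb") as f:
        f.write(b"\0" * (16 * BLOCK))      # blank 16-block "disk"
    with open(image, "wb") as f:
        f.write(b"R" * (4 * BLOCK))        # 4-block "filesystem image"
    install_image(disk, image, start_block=8)
    with open(disk, "rb") as f:
        contents = f.read()

print(len(contents))
```

Everything before the starting block is left alone, which is what allows the first part of the disk to remain in use by another system.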

Filesystem Debugging

At this stage, it is considered good practice to check the filesystem on the PC. We used the standalone utilities (/boot, /stand/cat, /stand/ls, /stand/icheck) to verify that the filesystem was correct for use with the kernel. Even before having an operational system, we can validate our filesystem with our standalone system (see "Initial Utilities: The Standalone System," DDJ, March 1991), because it has the ability to interpret the filesystem data structures. /boot can be used to check for the presence of files and directories by attempting to boot from a file. For example, one can try to boot from /stand/ls, with the proviso that "/" be a directory that has the "stand" directory in it, and that "stand", in turn, contain "ls"--an executable file. If the given file cannot be opened, /boot will tell us why. ls, like its user-mode utility counterpart, lists the contents of a directory on a disk, so we can check to see if the contents are correct. Similarly, cat shows the contents of an ASCII file, so we can check to see that the ASCII files present have the appropriate contents and that fence-post or data translation problems have not corrupted the files. Finally, /stand/icheck, the largest standalone program, can exhaustively check for filesystem consistency to make certain that all of the filesystem's data structures are undamaged. We can verify this by running the same icheck program on the cross-host, ensuring that the filesystem is identically consistent on both the cross-host and the target system.

These validation techniques independently test file contents separate from filesystem data structures, or "meta data," on the off-chance that we are somehow corrupting the contents of files when we create the filesystem. It's important to realize that programs that check the filesystem have no way to check contents of files. Thus, the file contents may be completely mangled in ways that could still leave the filesystem in a correct state!
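
One way to check contents independently of the meta data is to compare a digest of each file as built on the cross-host with a digest of what is read back on the target. This is a sketch of that idea only, not what we did at the time (we inspected files with cat); the use of SHA-256, the function names, and the sample file contents are all illustrative assumptions.

```python
import hashlib


def digest(data):
    """Hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def changed_files(expected, actual):
    """Names whose contents differ between two {name: bytes} mappings."""
    return sorted(
        name for name in expected
        if name not in actual or digest(expected[name]) != digest(actual[name])
    )


# Made-up contents: the second file has lost its final byte in transit.
cross_host = {"/etc/rc": b"echo starting\n", "/etc/ttys": b"console\n"}
target = {"/etc/rc": b"echo starting\n", "/etc/ttys": b"console"}
print(changed_files(cross_host, target))
```

A consistency checker would pass both filesystems, since a file that is one byte short is still a perfectly well-formed file.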

What's in a Filesystem?

As we stated earlier in this article, a filesystem is a data structure designed to implement the abstraction of files and directories. As such, there are dozens of types of filesystems possible. Berkeley UNIX currently offers three flavors of filesystems: UFS, NFS, and MFS.

  • UFS, like many other filesystems, manages to impress its underlying files and directories on bulk storage media such as moving-head magnetic disks. In particular, UFS uses placement algorithms to schedule head movement and rotational delay to improve average filesystem effectiveness.
  • NFS, the Network Filesystem originally designed by Sun Microsystems, funnels program requests for files over a network connection, which is then satisfied by a server machine's own filesystems. Consequently, these files can be located quite a distance away from the actual computer whose program is referencing a file.
  • MFS, a memory-based filesystem, stores temporary files in the processor's virtual memory storage areas for rapid access to transient data. It evolved from RAM-based disks used on many MS-DOS systems and uses virtual memory to provide a way to keep active files present in RAM while gradually moving inactive portions back to the disk.

Why Do We Need a Root Filesystem?

Traditionally, the UNIX filesystem is used to hold the operating system and its bootstrap as ordinary files. This makes it convenient to create and install new versions of the operating system with the very same tools used to develop ordinary user programs. This arrangement also makes it possible to choose alternative versions of the operating system, and to run newer systems under development, or fall back to back-up versions if for some reason the default system is damaged and unusable. This flexibility presents a problem--how do you load the operating system which makes use of the filesystem if it's already in the filesystem itself?

As part of the bootstrap process, the computer loads bootstrap programs with an ever-increasing ability to manipulate the hardware and access files from the UNIX filesystem. In 386BSD, the ROM BIOS starts the process by reading the first block of disk storage off the disk, and then executes its contents as an ordinary program. This tiny program has the sole responsibility of reading in a program 15 times its size and located on the next successive blocks on the disk drive. In turn, this larger program has the responsibility of deciphering the UNIX filesystem located adjacent to it on this disk drive, and extracting the next bootstrap program from the file "/boot" in the filesystem. This final bootstrap program can be arbitrarily large (bounded by physical memory) and can load programs from all possible devices on the computer. This bootstrap can also determine which device to load the operating system from, the configuration of the processor prior to boot, and power-fail or crash-recovery steps. It can also decide whether the system should automatically reboot itself or pause and await manual intervention to remove an obstacle inhibiting automatic reboot; it can be interrupted by an operator if he wishes to change his mind and insist on alternative actions. Thus, the bootstrap can be used to load other standalone programs that might be used for disk formatting, recovery, or installation, as well as loading the operating system itself (the file /vmunix). In a sense, when the bootstrap is loading, you might call the filesystem it is using the "boot filesystem"!
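
The first two stages of this chain amount to simple arithmetic on disk blocks. The sketch below assumes a 512-byte block, which the text does not state, and uses a byte string as a stand-in for the disk; the name split_boot_area is ours.

```python
BLOCK = 512          # assumed size of one disk block
STAGE2_BLOCKS = 15   # "a program 15 times its size"


def split_boot_area(disk):
    """Return the first-block program and the larger program that follows."""
    stage1 = disk[:BLOCK]
    stage2 = disk[BLOCK:BLOCK * (1 + STAGE2_BLOCKS)]
    return stage1, stage2


# Fake disk: block 0 filled with 1s, the next 15 blocks with 2s, then data.
disk = bytes([1]) * BLOCK + bytes([2]) * (BLOCK * STAGE2_BLOCKS) + bytes([3]) * BLOCK
stage1, stage2 = split_boot_area(disk)
print(len(stage1), len(stage2))
```

Under that assumption the two stages together occupy 16 blocks, or 8 Kbytes, and only the second stage is large enough to understand the filesystem and find /boot.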

A similar chicken-and-egg problem occurs when we decide to run the initialization process (/sbin/init) to initialize the subsequent user program operation of the system. Because UNIX systems only know how to execute a program from a file in a filesystem, we need a filesystem from which to execute files. Thus, the root filesystem is the first filesystem accessible, via a kind of "virgin birth." All other filesystems will be explicitly attached to it via the UNIX mount command, which grafts the base (or root) of the filesystem to be mounted onto an existing directory in the root (the "mount point").
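
The effect of mounting on filename translation can be modeled with a small table that maps mount points to filesystems. This is a toy model with hypothetical names (resolve, the mounts dictionary, and the filesystem labels); the kernel's actual lookup works directory by directory rather than by string prefix.

```python
def resolve(path, mounts):
    """Return (filesystem, path within it) for an absolute path.

    mounts maps mount points to filesystem labels and must contain "/".
    The longest mount point that covers the path wins.
    """
    best = "/"
    for point in mounts:
        covers = path == point or path.startswith(point.rstrip("/") + "/")
        if covers and len(point) > len(best):
            best = point
    rest = path[len(best):].lstrip("/")
    return mounts[best], "/" + rest


mounts = {"/": "root"}
print(resolve("/usr/src/sys", mounts))   # still on the root: an empty /usr

mounts["/usr"] = "usr-fs"                # the effect of mounting on /usr
print(resolve("/usr/src/sys", mounts))
print(resolve("/bin/sh", mounts))
```

Every translation begins at the root entry, which is why the table cannot be empty: something has to be there before the first mount command can even be found and run.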

Non-UNIX systems have an entirely different perspective regarding bootstrapping. Usually, the given system is kept on a special, dedicated location on the disk, frequently adjacent to bootstrap code. Sometimes, the equivalent of the UNIX /sbin/init program is also found in this special location. Therefore, these programs require special installation onto the disk, and the system does not require the concept of a "root" filesystem, because it does not require a filesystem to become active.

Note also that we have one file-naming convention in UNIX, so that even devices are named just like ordinary UNIX files (/dev/wd0a or /dev/console, for example). This is different from MS-DOS or VMS, where two namespaces are present at any time: the device namespace (A: or DKOA:) and the file pathname (\foo\bar\bletch or [foo.bar]bletch;2). With UNIX, the filesystem is a central concept, along with the global way in which it is used and reused to provide a sole namespace for file objects. In a sense, the originators of UNIX felt this concept to be so important that in follow-on work (such as Plan 9, see DDJ January 1991), the filesystem is even more central to the system, by becoming a way of expressing interprocessor, window system, and program communications metaphors.

The Filesystem Metaphor and its Importance in Future Work

With all modern systems, we now use the filesystem metaphor underlying the basic syntax and semantics of the UNIX filesystem. As a result, the same file specification syntax known to all UNIX applications programs can be used to transparently access files embedded in archival storage systems, remotely manipulate files on remote systems of entirely heterogeneous design, store files on fail-safe redundant media, or a combination of these. We could even design a database filesystem where the filename directory path would describe a database query, with the "leaf" files themselves being the database records. The foresight of the originators of the early hierarchical filesystems (and the Multics Project) is now apparent, as these ideas come to fruition in a variety of research and commercial applications. As we continue to struggle with the complexity of our software systems, the use of powerful metaphors that unify many mechanisms within one becomes increasingly critical to the design and implementation of any complex system.


Copyright © 1991, Dr. Dobb's Journal

