Ram Disk Driver for Unix

OCT90: RAM DISK DRIVER FOR UNIX

Reduce overhead and improve performance

Jeff is a special projects engineer for Banyan Systems and can be reached at 28 Grant Street, Milford, MA 01757.


Disk operations are generally slow and expensive compared to other operating system functions. In Unix, performance is boosted by caching the most recently used disk blocks within a pool of RAM buffers.

This pool, called the "buffer cache," utilizes cache hits to complete disk requests without having to access the disk. Unfortunately, the buffer cache is static in size, and eventually some buffers must be written out to disk before they can be reused for another disk block.

In some respects, the Unix buffer cache is a RAM disk. The difference is that the buffer cache manages the reuse of disk blocks, while the RAM disk simply reports the file system as full when all buffers have been allocated.

This article describes the implementation of a RAM disk driver for Unix. A 386 system with four megabytes of RAM running Unix System V/386 Release 3.2 was used throughout to develop the driver.

The Driver Implementation

The RAM disk driver differs from traditional Unix disk drivers in several ways. The first, of course, is the hardware. In reading or writing to a physical disk, the driver must first position the read/write head, then transfer the data blocks. When all is done, typically many milliseconds later, the hard disk presents an interrupt to Unix and the request can be completed. In contrast, RAM disks do not experience the positioning delays associated with mechanical hardware, which eliminates the need for interrupts.

The RAM disk driver developed for this article (see Listing One) supports both a block-mode and a character-mode interface. Block mode is used by the kernel to read and write mounted Unix file systems. Character mode provides a raw interface to the disk, allowing the application to bypass the Unix buffer cache and manipulate the disk directly. Nine entry points to the driver are provided: rdinit, rdopen, rdclose, rdstrategy, rdread, rdwrite, rdintr, rdioctl, and rdprint.

During the Unix boot, the kernel calls rdinit, which determines whether a RAM disk has been defined, and, if it has, allocates the memory required to represent the disk. The sptalloc function is used to allocate this memory. Using sptalloc, memory is allocated in page-size units (4096 bytes in 386 Unix), and if successful at obtaining the number of pages requested, sptalloc returns a kernel virtual address referencing this memory. All page table manipulation is handled by sptalloc, including page linkage and the locking down of pages as directed by the PG_P parameter passed to sptalloc. Notice the DONT_SLEEP parameter in the sptalloc call. Initialization routines in Unix are not permitted to sleep because they can prevent the system from successfully booting.
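The size arithmetic behind this allocation can be checked in user space. The following is a minimal sketch, not kernel code: the function names and the PAGE_BYTES and SECTOR_BYTES constants are hypothetical stand-ins for the 4096-byte page and 512-byte sector sizes described in this article.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's page and sector sizes. */
#define PAGE_BYTES    4096L   /* page size on 386 Unix, per the article */
#define SECTOR_BYTES  512L    /* sector size assumed for the RAM disk   */

/* Bytes occupied by a RAM disk made of the given number of pages. */
long rd_pages_to_bytes(long pages)
{
    assert(pages >= 0);
    return pages * PAGE_BYTES;
}

/* 512-byte sectors available in a RAM disk of the given number of pages. */
long rd_pages_to_sectors(long pages)
{
    assert(pages >= 0);
    return pages * (PAGE_BYTES / SECTOR_BYTES);
}
```

For the 256-page disk built in this article, these helpers give 1,048,576 bytes and 2,048 sectors, which is the sector count later handed to mkfs.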

After space has been carved out for the RAM disk, rdinit fills in rd_cfg, the configuration structure which represents the state, virtual address, and size of the RAM disk. Of these, only the size field must be defined during driver compile time.

The driver supports three states: RD_UNDEFINED, RD_CONFIGURED, and RD_OPEN. RD_UNDEFINED is the initial state. It indicates the RAM disk has not been set up. RD_CONFIGURED is set by rdinit after a successful call to sptalloc. RD_OPEN is logically ORed to the state field each time the disk is opened. RD_OPEN could be used at a later time to implement a critical region lock for the RAM disk, allowing only one open at a time.
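The state handling amounts to a few bit operations. The sketch below models it in user space; the helper functions are hypothetical and exist only for illustration, while the three state values mirror the ones named above.

```c
#include <assert.h>

/* State bits, mirroring the three states named in the text. */
#define RD_UNDEFINED   0x0000
#define RD_CONFIGURED  0x0001
#define RD_OPEN        0x0002

/* Mark the disk configured, as rdinit does after a successful allocation. */
int rd_mark_configured(int state)
{
    return state | RD_CONFIGURED;
}

/* OR in the open bit; repeated opens leave the state unchanged. */
int rd_mark_open(int state)
{
    assert(state & RD_CONFIGURED);   /* only a configured disk may be opened */
    return state | RD_OPEN;
}

/* Clear only the open bit, leaving RD_CONFIGURED intact. */
int rd_mark_closed(int state)
{
    return state & ~RD_OPEN;
}

/* Nonzero if the disk is currently open; a lock could be built on this. */
int rd_is_open(int state)
{
    return (state & RD_OPEN) != 0;
}
```

An exclusive-open policy of the kind suggested above could test rd_is_open before calling rd_mark_open and refuse the second open.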

rdopen is called either when a mount is being performed or when an application attempts an open on the device node associated with the RAM disk. Application programs use the 8 bits of the minor number to inform the driver which disk should service the request. In this implementation, rdopen recognizes only one RAM disk, and rejects the request if it specifies any disk other than zero. Rejected requests are handled by updating the error field in the current u area (defined in "sys/user.h") and returning to the caller. An error is returned in the same fashion if the state maintained by the configuration structure rd_cfg indicates the disk has not been initialized. With request verification out of the way, rdopen turns on the RD_OPEN state bit.
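The order of these checks can be modeled outside the kernel. In the sketch below, rd_check_open is a hypothetical function that returns the error code instead of storing it in the u area; ENODEV and ENOMEM are the values the driver in Listing One assigns for the two failure cases.

```c
#include <assert.h>
#include <errno.h>

#define RD_MAX         1        /* only disk 0 exists in this sketch */
#define RD_CONFIGURED  0x0001   /* state bit set once memory is allocated */

/*
 * Hypothetical user-space model of the validation done at open time.
 * Returns 0 on success, or the errno value the driver would report.
 */
int rd_check_open(unsigned minor_number, int state)
{
    assert(minor_number <= 0xFFu);      /* minor numbers are 8 bits wide */

    if (minor_number >= RD_MAX)
        return ENODEV;                  /* no such RAM disk */
    if ((state & RD_CONFIGURED) == 0)
        return ENOMEM;                  /* disk was never set up */
    return 0;
}
```

Raising RD_MAX is the first change needed to accept more than one disk, although the rest of the driver would also have to keep one configuration structure per disk.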

rdclose is simple since there is no mechanical hardware associated with a RAM disk. rdclose simply turns off the RD_OPEN state bit. You may think this is an error because we are turning off the open bit even though multiple opens may have been performed on the device. Not to worry; if a file has been opened multiple times in Unix and a close request is then made by one instance, Unix simply decrements an internal file open count. When that count reaches zero, the close function is called.
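The last-close behavior is easy to see in a small model. The sketch below is hypothetical: struct dev_model and its two functions only imitate the bookkeeping described above, assuming that every open reaches the driver while only the final close does.

```c
#include <assert.h>

/* Hypothetical model of one device with a kernel-side open count. */
struct dev_model {
    int open_count;      /* opens outstanding, tracked by the "kernel" */
    int driver_opens;    /* times the driver's open routine ran  */
    int driver_closes;   /* times the driver's close routine ran */
};

/* Every open is passed through to the driver. */
void model_open(struct dev_model *d)
{
    d->open_count++;
    d->driver_opens++;
}

/* Only the close that drops the count to zero reaches the driver. */
void model_close(struct dev_model *d)
{
    assert(d->open_count > 0);
    if (--d->open_count == 0)
        d->driver_closes++;
}
```

After two opens and one close the driver's close routine has not yet run, so clearing RD_OPEN unconditionally inside it is safe.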

rdstrategy is the driver function that actually services the read/write requests. A Unix buffer header (described in sys/buf.h) is passed as a parameter containing all necessary information to service the request. Pertinent information such as the requested block, number of bytes to transfer, direction of transfer, and the target device are filled in by the kernel prior to calling rdstrategy.

Rather than trusting all information presented by the buffer header, rdstrategy performs some simple error checking; without error checking, a simple request exceeding its limits could crash the system. rdstrategy begins by converting the requested block number into a byte offset which is used to reference the start of the request. The number of bytes requested is then added to this start location to determine if the request will go off the end of the disk. If the operation is a read request, this behavior is tolerated so end of file (EOF) can be detected by the application. The request is adjusted so all data up to and including the last block is read. However, write requests are rejected immediately.
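The trimming rule can be written as a pure function and tested away from the kernel. This is a sketch under stated assumptions: rd_clip_request and its parameters are hypothetical names, and the buffer header fields are replaced by plain arguments.

```c
#include <assert.h>

#define SECTOR_BYTES 512L

/*
 * Hypothetical helper: decide how many bytes of a request can be served.
 * Returns 0 and sets *xfer and *resid on success, or -1 if the request
 * must be rejected (a write that runs past the end of the disk).
 */
int rd_clip_request(long blkno, long bcount, long disk_bytes, int is_read,
                    long *xfer, long *resid)
{
    long start;

    assert(blkno >= 0 && bcount >= 0 && disk_bytes >= 0);
    start = blkno * SECTOR_BYTES;

    if (start + bcount <= disk_bytes) {   /* request fits entirely */
        *xfer = bcount;
        *resid = 0;
        return 0;
    }
    if (!is_read)                         /* writes are never trimmed */
        return -1;

    /* Read past the end: serve what exists, report the rest as residual. */
    *xfer = (start < disk_bytes) ? disk_bytes - start : 0;
    *resid = bcount - *xfer;
    return 0;
}
```

On a one-megabyte disk, a 1024-byte read starting at sector 2047 transfers 512 bytes and leaves a residual of 512, which is how the caller learns it has reached the end of the device.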

Now that the error checking is out of the way, rdstrategy can transfer the data into or out of the RAM disk. Since this is nothing more than a copy operation, bcopy, which is supplied by the kernel, can be used. bcopy is used to copy a specified number of bytes from one location of kernel memory to another.
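A user-space analogue makes the point concrete. In the sketch below, memcpy stands in for the kernel's bcopy, and struct ram_disk with its two functions is a hypothetical model of a disk that is nothing more than a flat byte array addressed by sector.

```c
#include <assert.h>
#include <string.h>

#define SECTOR_BYTES 512

/* Hypothetical in-memory disk: a flat byte array addressed by sector. */
struct ram_disk {
    unsigned char *base;    /* start of the disk's memory  */
    long           bytes;   /* total capacity in bytes     */
};

/* Read: copy nbytes from the disk, starting at sector blkno, into buf. */
void rd_copy_out(const struct ram_disk *rd, long blkno, void *buf, long nbytes)
{
    long start = blkno * SECTOR_BYTES;

    assert(start >= 0 && nbytes >= 0 && start + nbytes <= rd->bytes);
    memcpy(buf, rd->base + start, (size_t)nbytes);
}

/* Write: copy nbytes from buf onto the disk, starting at sector blkno. */
void rd_copy_in(struct ram_disk *rd, long blkno, const void *buf, long nbytes)
{
    long start = blkno * SECTOR_BYTES;

    assert(start >= 0 && nbytes >= 0 && start + nbytes <= rd->bytes);
    memcpy(rd->base + start, buf, (size_t)nbytes);
}
```

The two functions differ only in the direction of the copy, just as the read and write branches of rdstrategy do.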

Upon completion of the copy, the residual count and error fields of the buffer header are updated to indicate the request has been serviced successfully. iodone is then called to return the buffer to the process responsible for initiating the request.

An application or system administration utility references the character interface through rdread and rdwrite. These routines are virtually identical except for the direction of the data transfer.

Unix provides two support routines, physck and physio, which are used to transform the request into a buffer header suitable for rdstrategy. In a manner similar to rdstrategy, physck verifies that the submitted request is within bounds of the RAM disk. If the request is a write and it exceeds the bounds of the disk, physck returns an error. However, in order to handle EOF correctly, read operations must be trimmed back and the transfer count adjusted so the request will not exceed the last valid disk block. The physio function performs all housekeeping chores such as extracting information from the user area to build a buffer header, preventing user buffers from being paged out, and finally calling rdstrategy to get the request serviced.

The ioctl Unix interface is a repository for miscellaneous driver functions. The RAM disk driver supports only one ioctl function, which returns information about the size of the specified RAM disk. The size information is taken out of the rd_cfg structure and returned to the process initiating the request. The buffer location passed into rdioctl is not in the same memory map as the rd_cfg structure. To get around this problem, Unix provides the copyout function. copyout allows a driver to copy data out of kernel space and into data structures residing in user space. Example 1 illustrates how an application program would query the RAM disk about its size.

Example 1: How an application queries the RAM disk's size

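  #include "sys/types.h"
  #include <stdio.h>
  #include <fcntl.h>              /* O_RDONLY */

  #define RD_GETSIZE 1            /* must match the driver's definition */

  main ( )
  {
     int fd;
     struct rd_size {             /* same layout as the driver's rd_getsize */
        daddr_t sector_count;
        long    b_count;
     } ram_disk_size;

     /* Open the character (raw) node created for RAM disk 0. */
     if ( (fd = open ("/dev/rdsk/rd0", O_RDONLY)) < 0)
     {
        printf ("Could not open RAM disk to do ioctl.\n");
        exit (1);
     }
     /* The driver fills in the structure through copyout. */
     if ( ioctl (fd, RD_GETSIZE, &ram_disk_size) < 0)
     {
        printf ("Could not determine size of RAM disk.\n");
        exit (2);
     }
     printf ("The RAM disk consists of %ld sectors occupying %ld bytes.\n",
        (long) ram_disk_size.sector_count, ram_disk_size.b_count);
     exit (0);
  }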

An interrupt handler is not required for the RAM disk, but one is supplied (rdintr) just in case some spurious interrupt is dispatched to the RAM disk. In that case, a warning message reporting the spurious interrupt is sent to the console via cmn_err.

Information or messages that refer to the RAM disk are dealt with by rdprint. For example, if the RAM disk has no free blocks left to allocate and a request for RAM disk blocks arrives at the file system, it calls rdprint, passing along a text string stating that the RAM disk is out of space.

Adding the RAM Disk to Unix

After the driver has been compiled, copy the object file to /etc/conf/pack.d/rd/Driver.o. The kernel must now be rebuilt in order to pull in support for the RAM disk driver. There are numerous ways to rebuild a kernel and complete necessary installation procedures. The method described here utilizes the Installable Driver Package (IDP), which is used on most Unix System V/386 Release 3.2 systems. Throughout the build process, rd is the prefix used to uniquely identify the RAM disk driver.

To begin, a system device file must be created that describes what hardware resources the RAM disk requires. Example 2(a) shows the file used for this article.

Example 2: Examples for developing Unix RAM disk

  Example 2(a):

       rd     Y     1     0     0     0     0     0     0     0

  Example 2(b):

       rd     ocrwiI      icbo        rd    0     0     1      2     -1

  Example 2(c):

       /etc/conf/bin/idinstall -a -m -k rd

  Example 2(d):

       rd rdsk/rd0              c     0
       rd dsk/rd0               b     0

  Example 2(e):

       cd /
       # Make a filesystem on the RAM disk.
       /etc/mkfs /dev/dsk/rd0 2048:150
       /etc/mountall /etc/fstab

Because hardware resources are not required for this driver, the interrupt level, interrupt vector, I/O address, and memory address fields all contain a zero. The first three fields tell the build process the name of the driver, confirm that the driver is to be linked into the kernel, and report the number of devices supported, respectively. Copy this file to /etc/conf/sdevice.d/rd prior to relinking the kernel.
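To make the field layout concrete, here is a minimal sketch that parses one system device file entry. It is not part of the build tools: struct sdev_entry and both functions are hypothetical, and the field names for columns four through ten (interrupt level, type, vector, and the start/end I/O and memory addresses) are my reading of the format and should be treated as assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical view of one system device file entry. */
struct sdev_entry {
    char          name[16];    /* driver name, e.g. "rd"                  */
    char          configure;   /* 'Y' to link the driver into the kernel  */
    int           units;       /* number of devices supported             */
    int           ipl;         /* interrupt level (assumed column)        */
    int           itype;       /* interrupt type (assumed column)         */
    int           vector;      /* interrupt vector                        */
    unsigned long sioa, eioa;  /* start/end I/O address                   */
    unsigned long scma, ecma;  /* start/end memory address                */
};

/* Parse a ten-field entry; returns 0 on success, -1 if fields are missing. */
int sdev_parse(const char *line, struct sdev_entry *e)
{
    int n;

    assert(line != NULL && e != NULL);
    n = sscanf(line, "%15s %c %d %d %d %d %lx %lx %lx %lx",
               e->name, &e->configure, &e->units, &e->ipl, &e->itype,
               &e->vector, &e->sioa, &e->eioa, &e->scma, &e->ecma);
    return n == 10 ? 0 : -1;
}

/* Nonzero if the entry claims any interrupt, I/O, or memory resource. */
int sdev_uses_hardware(const struct sdev_entry *e)
{
    return e->ipl || e->vector || e->sioa || e->eioa || e->scma || e->ecma;
}
```

Fed the line from Example 2(a), sdev_parse reports a configured driver with one unit, and sdev_uses_hardware returns zero.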

Next, an entry must be added to the master device file that describes the driver interface. This is shown in Example 2(b).

The first field says this description applies to the RAM disk driver. Next, "ocrwiI" indicates the RAM disk supports an open, close, read, write, ioctl, and init interface. The third field describes the driver as installable, capable of supporting character and block devices, and containing only one entry in the system device file described previously. The fourth field is the handler prefix used to distinguish the interface entry points. Fields five and six are for block and character major numbers. Setting these fields to zero lets the build process assign these numbers dynamically. Fields seven and eight specify the minimum and maximum number of units that can be declared in the system device file. The last field is for Direct Memory Access (DMA), and, since this device doesn't use any, this field is assigned -1.
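Because the flag letters are case sensitive, a small lookup helps when checking an entry by hand. The sketch below is hypothetical and covers only the six function flags discussed in this article; the table and function names are mine.

```c
#include <assert.h>
#include <string.h>

/* Function-flag letters named in the text, with the routine each implies. */
static const struct { char flag; const char *routine; } rd_flag_table[] = {
    { 'o', "open"  }, { 'c', "close" }, { 'r', "read" },
    { 'w', "write" }, { 'i', "ioctl" }, { 'I', "init" }
};

/* Return the routine name for a flag letter, or NULL if it is not listed. */
const char *mdev_routine(char flag)
{
    size_t i;

    for (i = 0; i < sizeof rd_flag_table / sizeof rd_flag_table[0]; i++)
        if (rd_flag_table[i].flag == flag)
            return rd_flag_table[i].routine;
    return NULL;
}

/* Nonzero if the flag string from the master device file contains flag. */
int mdev_has(const char *flags, char flag)
{
    assert(flags != NULL && flag != '\0');
    return strchr(flags, flag) != NULL;
}
```

Note that the final letter of the flag string in Example 2(b) is an uppercase I for init, which is distinct from the lowercase i for ioctl.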

I've found the easiest way to add this information to the master device file is to create a file in your local directory called "Master," containing the line in Example 2(b). Then use the idinstall command listed in Example 2(c) to append the description to the master device file.

This command line tells idinstall to add the description in the file Master to the master device file and to leave the file Master in place when it finishes.

After running idinstall, the major numbers assigned by idinstall can be extracted from the master device file in /etc/conf/cf.d/mdevice and used to confirm the major numbers assigned to the block and character device files by idmknode during the next system boot.

To have the boot process automatically generate the RAM disk device nodes, a node file is needed. Example 2(d) describes two nodes, one for the block and one for the character device.

Given the configuration in Example 2(d), the block device will be addressed as /dev/dsk/rd0 with minor number 0, while the character device is referenced as /dev/rdsk/rd0 using minor number 0. Modifying the driver to support multiple RAM disks would require additional entries in Example 2(d) as well as unique minor numbers. To complete the node setup, copy the file described in Example 2(d) to /etc/conf/node.d/rd.

Before proceeding, it makes sense to make a backup copy of the kernel just in case the new driver has introduced a bug and prevents the system from booting. Then, if there are problems, the old kernel can be used.

At this point, all support files are in place and the kernel can be rebuilt by issuing /etc/conf/bin/idbuild. If all goes well (meaning all external references were resolved) the current kernel can be shut down and the new one brought up for driver testing.

Automating the Installation

RAM disks represent a form of volatile storage. Because everything written to the disk is lost during a system shutdown, the disk must be initialized every time the system is restarted. Rather than manually issuing the proper commands each time, a system administrator could automate the procedure by adding special commands to the Unix startup scripts.

To begin with, a mount point must be established so the user community can reference the RAM disk. To create /ramdisk as the mount point, use the following command: mkdir /ramdisk.

The mount point needs to be created only once as long as you don't delete it after unmounting the file system. Next, append the following line to /etc/fstab: /dev/dsk/rd0 /ramdisk.

The fstab file contains all file systems to be mounted during the Unix boot. The last step involves building the file system on the RAM disk so the boot procedure can find something to mount. The file system can be created by adding a mkfs command to /etc/rc2.d/S01MOUNTFSYS. Example 2(e) illustrates a modified S01MOUNTFSYS file.

It's important to note that the mkfs command is executed before directing /etc/mountall to read and mount all file systems listed in /etc/fstab. In Example 2(e), the mkfs command is instructed to build a file system on the RAM disk using 2048 sectors (each sector is 512 bytes in size) and allow a maximum allocation of 150 files. Overhead required by the file system is minimal. However, the actual number of inodes and filesystem blocks will be reduced to 144 and 2024, respectively.

Although the procedure above will automate the installation of the RAM disk, it should be noted that this is a static configuration. For example, reconfiguring the RAM disk to increase its size from one megabyte to two requires the mkfs command buried in /etc/rc2.d/S01MOUNTFSYS to be modified so the additional space provided can be utilized.

Improving the Performance of Unix

The RAM disk has several practical uses in Unix. For example, many programs rely on temporary files and tend to create them in /tmp. Instead of using part of the root file system for /tmp, try mounting the RAM disk on /tmp. This can be done by simply editing /etc/fstab and replacing the /ramdisk mount point with /tmp.

Another possibility is to reduce overhead associated with loading files. This is a matter of identifying popular files and copying them to the RAM disk. Make sure the PATH environment variables are updated so the RAM disk is searched for the file first.

If your system allows for more than one swap device, a significant performance gain can be had by making the RAM disk a primary swap device. The swap device is the area of disk the kernel uses when it becomes necessary to page out memory. In a heavily loaded system, the RAM disk could keep the system from thrashing itself to death because the swap area no longer has to wait for a slow disk.

Conclusion

Writing a RAM disk driver is a great way to get your feet wet in Unix device drivers. The driver developed here supports only one RAM disk, one megabyte in size. It's a trivial exercise left to the reader to change the size of the RAM disk or extend the driver so it may support additional disks. This driver should port to other flavors of Unix with minimal effort.

_RAM DISK DRIVER FOR UNIX_ by Jeff Reagen

[LISTING ONE]


/* The following is a RAM disk driver developed for Unix Sys V/386
 * release 3.2. -- Author: Jeff Reagen          05-02-90.
*/
#include "sys/types.h"
#include "sys/param.h"
#include "sys/immu.h"
#include "sys/fs/s5dir.h"
#include "sys/signal.h"
#include "sys/user.h"
#include "sys/errno.h"
#include "sys/cmn_err.h"
#include "sys/buf.h"

#define RD_SIZE_IN_PAGES 0x100L           /* 256 4K pages => 1 MB     */
#define RD_MAX           1                /* Max RAM Disks            */
#define RAMDISK(x)    ((int)((x) & 0x0F)) /* Ram disk number from dev */
#define DONT_SLEEP    1                /* sptalloc parameter       */

/* For ioctl routines.
*/
#define RD_GETSIZE   1                 /* return size of RAM disk     */
struct rd_getsize {                       /* Structure passed to rdioctl */
   daddr_t sectors;
   long    in_bytes;
};

/* Valid states for the RAM disk driver.
*/
#define RD_UNDEFINED    0x0000           /* Disk has not been setup */
#define RD_CONFIGURED    0x0001           /* Configured disk */
#define RD_OPEN          0x0002           /* Indicates disk has been opened */

/*   The RAM disk is created iff the size field has been defined. Since
 *   sptalloc only allocates pages, make sure the size is
 *   some multiple of page size (4096).
*/
struct ram_config {
   int       state;     /* current state                */
   caddr_t   virt;      /* virtual address of RAM disk  */
   long      size;      /* RAM disk size in units of 4K */
};

struct ram_config rd_cfg = {RD_UNDEFINED, (caddr_t)0, RD_SIZE_IN_PAGES};

extern caddr_t sptalloc();

/* rdinit - initialize the RAM disk.
 */
rdinit (dev)
   dev_t   dev;
{
   /* Has a RAM disk been defined? */
   if (rd_cfg.size == 0)
   {
      /* Just return silently - ram disk is not configured. */
      return 0;
   }

   /* Last parameter 1 in sptalloc calls prevents sleep if no memory. */
   if ((rd_cfg.virt = sptalloc (rd_cfg.size, PG_P,0,DONT_SLEEP)) == NULL)
   {
      cmn_err (CE_WARN,"Could not allocate enough memory for RAM disk.\n");
      return 0;
   }
   rd_cfg.state |= RD_CONFIGURED;

   return 0;
}

/*  rdopen
 */
rdopen (dev)
   dev_t dev;
{
   int rdisk;

   rdisk = RAMDISK(dev);

   if ( rdisk >= RD_MAX)
   {
      /* RAM disk specified does not exist. */
      u.u_error = ENODEV;
      return;
   }

   /* Make sure ram disk has been configured. */
   if ( (rd_cfg.state & RD_CONFIGURED) != RD_CONFIGURED)
   {
      /* disk has not been configured! */
      u.u_error = ENOMEM;
      return;
   }

   /* RAM disk successfully opened. */
   rd_cfg.state |= RD_OPEN;
}

/*  rdclose - close the RAM disk.
 */
rdclose (dev)
   dev_t   dev;
{
   rd_cfg.state &= ~RD_OPEN;
   return;
}

/*  rdstrategy - the entire synchronous transfer operation happens here.
 */
rdstrategy (bp)
   register struct buf *bp;
{
   register long  req_start;     /* start of transfer */
   register long  byte_size;     /* Max capacity of RAM disk in bytes. */
   int            disk;          /* RAM disk being requested for service. */

   disk = RAMDISK(bp->b_dev);

   /* Validate disk number. */
   if (disk >= RD_MAX)
   {
         /* Disk does not exist. */
         bp->b_flags |= B_ERROR;
         bp->b_error = ENODEV;
         iodone(bp);
         return;
   }

   /* Validate request range. Reads can be trimmed back... */
   byte_size = rd_cfg.size * NBPP;
   req_start = bp->b_blkno * NBPSCTR;
   bp->b_resid = 0;            /* Number of bytes remaining after transfer */

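   /* Check for requests exceeding the upper bound of the disk. */
   if (req_start + bp->b_bcount > byte_size)
   {
      if (!(bp->b_flags & B_READ))
      {
         /* Write - always fails */
         bp->b_resid = bp->b_bcount;
         bp->b_flags |= B_ERROR;
         iodone (bp);
         return;
      }
      if (req_start >= byte_size)
      {
         /* Read starting at or past the end - nothing to transfer (EOF). */
         bp->b_resid = bp->b_bcount;
         bp->b_flags &= ~B_ERROR;
         iodone (bp);
         return;
      }
      /* Read straddling the end - trim it and adjust the residual count. */
      bp->b_resid = req_start +  bp->b_bcount - byte_size;
      bp->b_bcount = byte_size - req_start;
   }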

   /* Service the request. */
   if (bp->b_flags & B_READ)
   {
      bcopy (rd_cfg.virt + req_start, bp->b_un.b_addr, bp->b_bcount);
   }
   else
   {
      bcopy (bp->b_un.b_addr, rd_cfg.virt + req_start, bp->b_bcount);
   }
   bp->b_flags &= ~B_ERROR;    /* Make sure an error is NOT reported. */
   iodone(bp);
   return;
}

/*  rdread - character read interface.
*/
rdread (dev)
   dev_t   dev;
{
   /* Validate request based on number of 512 bytes sectors supported. */
   if (physck ((daddr_t)rd_cfg.size << DPPSHFT, B_READ))
   {
      /* Have physio allocate the buffer header, then call rdstrategy. */
      physio (rdstrategy, (struct buf *)NULL, dev, B_READ);
   }
}

/*  rdwrite - character write interface.
*/
rdwrite (dev)
   dev_t   dev;
{
   /* Validate request based on number of 512 bytes sectors supported. */
   if (physck ((daddr_t)rd_cfg.size << DPPSHFT, B_WRITE))
   {
      /* Have physio allocate the buffer header, then call rdstrategy. */
      physio (rdstrategy, (struct buf *)NULL, dev, B_WRITE);
   }
}

/*   rdioctl - returns size of RAM disk.
 */
rdioctl (dev, command, arg, mode)
   dev_t   dev;
   int     command;
   int     *arg;
   int     mode;
{
   struct  rd_getsize sizes;

   if ( RAMDISK(dev) >= RD_MAX || !(rd_cfg.state&RD_CONFIGURED) )
   {
      u.u_error = ENODEV;
      return;
   }

   switch (command) {
      case RD_GETSIZE:
         sizes.sectors = rd_cfg.size << DPPSHFT;
         sizes.in_bytes = rd_cfg.size * NBPP;
         /* Now transfer the request to user space */
         if (copyout (&sizes, arg, sizeof (sizes)) )
         {
            u.u_error = EFAULT;
         }
         break;

      default:
         /* Error - do not recognize command submitted. */
         u.u_error = EINVAL;
         return;
   }
}

/*   rdintr - the RAM disk does not generate hardware interrupts,
 *   so this routine simply prints a warning message and returns.
 */
rdintr ()
{
   cmn_err (CE_WARN, "RAM disk took a spurious hardware interrupt.\n");
}

/*  rdprint - send messages concerning the RAM disk to the console.
 */
rdprint (dev, str)
   dev_t  dev;
   char     *str;
{
   cmn_err (CE_NOTE, "%s on Ram Disk %d.\n", str, RAMDISK (dev));
}
