
The KernelGraphics Interface

By Andreas Beck, July 01, 1998

The General Graphics Interface (GGI) project brings safe, fast, and portable graphics to a variety of platforms and operating systems. Andreas describes KGI, the kernel-level component of the Linux version of GGI.

Dr. Dobb's Journal July 1998: The KernelGraphics Interface

Andreas studies physics at the University of Düsseldorf, Germany. He can be reached at [email protected].

Sidebar: The EvStack Kernel Enhancement

Determining how an operating system should handle graphics is an exercise in tradeoffs. If you are interested in the fastest possible graphics performance, the only solution is for your application to work directly with the graphics hardware without regard to security. However, if you are willing to sacrifice a little bit of speed to gain portability and a degree of safety, GGI could help you a lot.

The GGI (General Graphics Interface) project (http://www.ggi-project.org/) is intended to bring safe, fast, and portable graphics to a variety of platforms and operating systems. GGI consists of user-level libraries of basic graphics functions and kernel-level drivers that handle the low-level graphics routines. The Kernel Graphics Interface (KGI) is the kernel console interface upon which the Linux implementation of GGI is based. Figure 1 shows how GGI and KGI are related. In this article, I describe the motivation, architecture, and implementation of KGI.

GGI is not confined to Linux, nor to KGI as the display subsystem. LibGGI is a lightweight graphics library that runs on a variety of platforms and graphics subsystems, such as the X Window System (tested on Solaris, AIX, IRIX, Linux, and others), SVGAlib (Linux), and other native graphics interfaces like the Sun framebuffer device. Ports for more targets (such as Microsoft Windows) are in the works.

The Problem

The job of an operating system is to arbitrate access to hardware to preserve the stability of the system, prevent software from damaging the hardware, and provide the software with an abstracted view of the hardware.

Few operating systems do this properly for graphics cards. Graphics support is either placed entirely in the kernel (like NT) or is left to user-mode applications with special permissions (like traditional Linux SVGAlib or X applications).

From a security point of view, there is nothing wrong with placing all graphics functionality in the kernel. The problem is that it vastly increases the kernel size at the expense of stability. Video drivers become more difficult to write and especially to debug -- and errors in the drivers impact system stability.

On the other hand, the SUID root approach used by X and SVGAlib presents some security hazards. In general, you want to avoid running any applications as SUID root, since buggy or malicious code can easily be manipulated to break into, or simply break, a system.

A malicious, or merely carelessly programmed, graphics application can easily hang the system by causing a bus lockup (possible with many graphics cards due to bad programming), leaving the console in graphics mode (making it hard to use the system), or locking out virtual console switching. Worst of all, a malicious application might even be capable of damaging hardware by programming unsuitable clocks, thus overloading the RAMDAC and/or monitor. While most modern monitors have protection circuitry for this, RAMDACs are usually without defense.

X circumvents this problem somewhat by being a client-server system, which protects the privileged server from malicious or buggy user code. Yet even then, it is still possible to abuse the X server, for instance, to read any file on your system (see http://www.rootshell.com/).

SVGAlib is a bigger problem, because its applications must be SUID root. Consider the binary-only releases that are necessary for commercial games but must run SUID root. Would you trust all vendors not to spy on your system? Would you always check PGP signatures to make sure you don't have a hacked copy with some Trojan Horse? Even worse, normal users can't develop SVGAlib applications since root access is necessary to give appropriate permissions to the executable so it can be tested.

The Solution

KGI tries to address these problems by moving only the critical part -- the actual programming of the graphics hardware -- to the kernel. This reduces the security problems to those that any UNIX device exposes: inappropriate file system permissions and bugs in the driver.

KGI does not do the actual drawing in the kernel. It's not necessary, and doing so would increase the possibility of errors that are even more serious when they happen in a kernel context. The KGI driver is designed to be a thin layer around the hardware functionality. It only abstracts functions that are fairly standard between different cards.

Functions for setting up modes and some common accelerated drawing functions are available via a standard command API, while card-specific quirks are exported in a private command area that is called by a card-specific user-mode counterpart.

Implementation Considerations

Speed is the main problem with a graphics interface that is at least partially running in kernel mode. If you needed to make a kernel mode call every time you called a basic function like drawing a pixel, the system would crawl.

Fortunately, almost all available cards have some notion of a framebuffer, a portion of the onboard Video RAM (VRAM) mapped into the CPU's address space. Accessing the VRAM is normally considered a safe operation. Some hardware accelerator registers are mapped to VRAM, but these can normally be excluded by the kernel code via the MMU of the host CPU.

From user mode, the KGI driver API exposes two things: a command interface that requires a user-to-kernel transition (under Linux, an ioctl call on /dev/graphic), and a memory-mapped linear framebuffer, a contiguous region of the process's address space that represents the VRAM contents.

Not every graphics card has a linear framebuffer. However, as those of you who are familiar with DJGPP may know, there is an elegant solution for this: the MMU. If the card exports a banked-style buffer (for example, a 64K window at 0xA0000, as old Trident 8900s did), the window is mapped at the appropriate place in a virtual memory area as big as a linear buffer of the card would be. The rest of that area is marked as swapped out, so touching it raises a page fault. When that happens, the driver is notified, moves the card's window accordingly, and corrects the mapping.

There are some speed problems with this, because the MMU trap is expensive compared to just setting the bank with an "out" instruction. At the same time, due to the design of most such cards, we cannot export the banking register to user space anyway, because of security considerations (it is normally on an indirect register that also hosts CRTC timing, and so on). On the other hand, this approach leaves bank-crossing-detection to the MMU and thus saves unnecessary (sometimes nontrivial) checking code.
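
The banked-window scheme described above can be modeled in a few lines of C. This is only a sketch under stated assumptions: the names banked_card, bank_for_offset, and handle_fault are hypothetical, and the "fault handler" is an ordinary function rather than real kernel page-fault code. It shows the arithmetic that decides which bank the window must show and how rarely the expensive switch happens when accesses stay within one bank.

```c
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; they model the idea, not KGI's code. */
enum { WINDOW_SIZE = 64 * 1024 };   /* 64K aperture, as in the 0xA0000 example */

struct banked_card {
    uint32_t current_bank;   /* bank the hardware window currently shows */
    unsigned bank_switches;  /* number of (expensive) bank changes so far */
};

/* Bank that contains a byte offset into the virtual linear framebuffer. */
uint32_t bank_for_offset(size_t linear_offset)
{
    return (uint32_t)(linear_offset / WINDOW_SIZE);
}

/* Stand-in for the fault handler: called when user code touches a part
 * of the linear area that the window does not currently back.
 * Returns the offset inside the window that now maps the access. */
size_t handle_fault(struct banked_card *card, size_t linear_offset)
{
    uint32_t wanted = bank_for_offset(linear_offset);

    if (wanted != card->current_bank) {
        /* A real driver would program the banking register here and
         * move the page mapping to wanted * WINDOW_SIZE. */
        card->current_bank = wanted;
        card->bank_switches++;
    }
    return linear_offset % WINDOW_SIZE;
}
```

Note that the caller never checks for bank crossings itself; in the real scheme, the MMU does that by faulting only when an access leaves the mapped window.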

Now, we have a decent and fairly fast interface for all common tasks. All really primitive things that are not worth the overhead to call into the kernel (DrawPixel, very short lines, and so on) are performed via the MMAPed VRAM. More complex and administrative functions are performed via the command interface.
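
To illustrate why primitives like DrawPixel need no kernel call, here is a minimal sketch of a pixel write into a mapped linear framebuffer. The structure linear_fb and the function fb_put_pixel are hypothetical names, not part of KGI or LibGGI, and the sketch assumes packed pixels of 1, 2, or 4 bytes. In real use, base would come from mmap on the graphics device; here it can be any buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a mapped linear framebuffer. */
struct linear_fb {
    uint8_t *base;       /* start of the mapping (from mmap in real use) */
    size_t   stride;     /* bytes per scanline */
    unsigned width;      /* visible pixels per line */
    unsigned height;     /* visible lines */
    unsigned bytes_pp;   /* bytes per pixel: 1, 2, or 4 in this sketch */
};

/* Write one pixel; returns 0 on success, -1 if clipped or unsupported. */
int fb_put_pixel(const struct linear_fb *fb, unsigned x, unsigned y,
                 uint32_t color)
{
    uint8_t *p;

    if (x >= fb->width || y >= fb->height)
        return -1;                       /* outside the visible area */

    p = fb->base + (size_t)y * fb->stride + (size_t)x * fb->bytes_pp;

    switch (fb->bytes_pp) {
    case 1:
        *p = (uint8_t)color;
        break;
    case 2: {
        uint16_t v = (uint16_t)color;
        memcpy(p, &v, sizeof v);         /* avoids unaligned stores */
        break;
    }
    case 4:
        memcpy(p, &color, sizeof color);
        break;
    default:
        return -1;                       /* depth not handled here */
    }
    return 0;
}
```

The whole operation is a bounds check, an address computation, and a store, which is why routing it through the command interface would cost far more than the work itself.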

One other catch is that you probably do not want to write any emulation code into the drivers for cards that do not have a particular function accelerated. Microsoft's DirectX handles this problem using capability bitmaps. Having capability bitmaps means that you can query to see if an acceleration function is available via some kind of a bitmap or test for a NULL pointer. In our opinion, this is too hard to extend, because you have to extend the bitmap or table with every new version, making lots of revision checks necessary to see if a particular capability is accessible in a given revision at all. So we chose another way to handle software fallback for our acceleration code.

An accelerated function call always returns either a status code that says "completed successfully" or an error code that suggests what to do instead and tells how long that suggestion is valid.

The suggestion can say:

CANNOT: This is returned for hardware-specific operations or context-sensitive operations (for instance, trying to set the frame for video overlay on a board without such capability).
USE_LOWER: This is used when it is most likely a good idea to use a set of simpler acceleration calls (for example, using multiple horizontal lines to draw a box), because the resulting calls would be accelerated.
USE_MMAP: This is returned when no simpler accelerator calls are supported. Thus, it is advisable not even to try them, but rather to directly draw on the MMAPed VRAM.

The expiry information tells how long the suggestion remains valid. This spares the library from calling the accelerator function every time when a function is only temporarily unavailable. The expiry can say:

NOW: Retry next time. It can't be done just now, because the accelerator is too busy or has some similar problem that is likely to go away the next time the function is called.
GC: Retry when the graphics context has changed (for example, if the accelerator cannot draw with a given raster operation).
MODE: Retry when the mode has changed (that is, if the accelerator cannot be enabled in a specific mode as in the VGA compatibility modes of many common accelerators).
ALWAYS: The accelerator never has this capability.
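
One way a user-mode library might track these answers is sketched below. Only the names CANNOT, USE_LOWER, USE_MMAP, NOW, GC, MODE, and ALWAYS come from the text above; the enum encodings, the accel_slot structure, and both functions are assumptions made for illustration. The sketch also assumes that a mode change invalidates answers that were scoped to the graphics context.

```c
#include <stdbool.h>

/* Hypothetical encodings of the suggestion and expiry values. */
enum kgi_suggest { SUGGEST_OK, SUGGEST_CANNOT, SUGGEST_USE_LOWER,
                   SUGGEST_USE_MMAP };
enum kgi_expire  { EXPIRE_NOW, EXPIRE_GC, EXPIRE_MODE, EXPIRE_ALWAYS };

struct accel_status {
    enum kgi_suggest suggest;
    enum kgi_expire  expire;
};

/* Per-function cache kept by the user-mode library. */
struct accel_slot {
    bool             disabled;  /* skip the kernel call while true */
    enum kgi_suggest fallback;  /* what to do instead */
    enum kgi_expire  until;     /* event that re-enables the call */
};

/* Record the driver's answer for one accelerated function. */
void slot_record(struct accel_slot *s, struct accel_status st)
{
    if (st.suggest == SUGGEST_OK || st.expire == EXPIRE_NOW) {
        s->disabled = false;    /* success, or worth retrying next call */
        return;
    }
    s->disabled = true;
    s->fallback = st.suggest;
    s->until    = st.expire;
}

/* Called on a graphics-context or mode change to re-arm expired slots. */
void slot_event(struct accel_slot *s, enum kgi_expire event)
{
    if (!s->disabled || s->until == EXPIRE_ALWAYS)
        return;                 /* nothing to re-arm, or never re-armed */
    if (s->until == event ||
        (event == EXPIRE_MODE && s->until == EXPIRE_GC))
        s->disabled = false;    /* assumption: mode change resets GC too */
}
```

With a cache like this, a function that fails with an expiry of GC costs one kernel call per graphics-context change rather than one per drawing operation.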

The advantage of handling software fallback this way over a DirectX-style bitfield is that this is extensible in a compatible way on both kernel and user sides. A newer KGI driver will know some new command codes that older libraries won't know about. So, you could lose a bit of extra acceleration with older libraries, but it's better than being incompatible.

A newer library may use some command codes that are not supported by older drivers. This triggers a "default" case that deals with the commands and always returns ENOSUP_ALWAYS_LOWER or ENOSUP_ALWAYS_MMAP (depending on whether or not the driver has a reasonable base set of accelerated commands). This return code causes the library to permanently disable the accelerator call after the first try and use an emulation routine instead. Again, you may lose a bit of potential acceleration if your kernel isn't up to date with the library, but it still works.
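
The compatibility behavior just described might look roughly like the following. The two ENOSUP_ALWAYS_* names appear in the text, but their numeric values, the command codes, and every other identifier here are hypothetical. The driver side answers unknown commands in a default case, and the library side stops calling into the kernel after the first permanent refusal.

```c
/* Hypothetical command codes and return values; the article gives only
 * the names ENOSUP_ALWAYS_LOWER and ENOSUP_ALWAYS_MMAP, not values. */
enum { CMD_DRAW_HLINE = 1, CMD_FILL_BOX = 2, CMD_DRAW_TRIANGLE = 3 };
enum { ACCEL_OK = 0, ENOSUP_ALWAYS_LOWER = -1, ENOSUP_ALWAYS_MMAP = -2 };

struct old_driver {
    int has_base_accel;  /* nonzero if a reasonable base set is accelerated */
};

/* Driver side: an older driver that predates CMD_DRAW_TRIANGLE. */
int driver_command(const struct old_driver *drv, int cmd)
{
    if (!drv->has_base_accel)
        return ENOSUP_ALWAYS_MMAP;      /* nothing simpler to fall back on */

    switch (cmd) {
    case CMD_DRAW_HLINE:
    case CMD_FILL_BOX:
        return ACCEL_OK;                /* would program the hardware here */
    default:
        return ENOSUP_ALWAYS_LOWER;     /* unknown code: use simpler calls */
    }
}

/* Library side: state for one accelerated entry point. */
struct lib_entry {
    int use_emulation;   /* set once the driver refuses permanently */
    int kernel_calls;    /* user-to-kernel transitions made so far */
};

/* Returns 1 if the hardware did the work, 0 if emulation was used. */
int lib_draw(struct lib_entry *e, const struct old_driver *drv, int cmd)
{
    if (!e->use_emulation) {
        e->kernel_calls++;
        if (driver_command(drv, cmd) == ACCEL_OK)
            return 1;
        e->use_emulation = 1;   /* both ENOSUP_ALWAYS_* codes are permanent */
    }
    /* The software emulation routine would run here. */
    return 0;
}
```

Because the refusal is remembered, an unsupported command costs a single wasted kernel call for the lifetime of the entry, not one per draw.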

Enhancements

While the scheme described earlier is enough for normal applications, there have always been some drawbacks to this approach:

Only relatively common acceleration commands are supported. Adding all the card specifics would result in an incredible number of commands. In addition to the number of commands increasing astronomically, it is very possible that different accelerator functions in two different drivers could end up using the same command codes, as drivers are developed independently.
There is no direct way to get at the acceleration registers, even if this is otherwise safe to do (which it is for a few very high-end cards).
Some cards have multiple memory areas for textures, overlays, and so on, which a single framebuffer mapping cannot expose.
There is no way to support display lists or similar things that would dramatically reduce the number of user-to-kernel transitions and, thus, overhead.

To overcome these limitations, KGI lets drivers export additional API functions that work around these problems:

Private commands. KGI reserves an area for private command codes. These are handled by a card-specific library in user space to make the best possible use of the card.
Mapping of card Memory-Mapped IO (MMIO) areas, or possibly allowing access to the card's ports if this is safe (so far, we have not found cards where port access is safe). Here, too, card-specific libraries are used to convert the card-specific API represented by the MMIO area to the common API.
Mapping of cards' additional memory areas like texture memory, YUV overlay planes, and so on.
PingPong buffers, which are simply filled with commands (all in user space) and then executed with a single command (one user-to-kernel transition). This operation can be done asynchronously with the program continuing to execute on the host CPU, while the accelerator is fed with commands using either DMA, accelerator-generated "accel-idle" or "accel-buffer-lowwater" interrupts, or host-generated timer interrupts. This allows for maximum throughput, as the host CPU can prepare the next drawing commands while the accelerator is still drawing the last batch.
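
The PingPong idea from the last item can be sketched as two command buffers that swap roles. Everything below is hypothetical (the buffer size, the structure, and the function names), and the kernel call itself is reduced to a counter. The point is only to show that queuing happens entirely in user space and that a whole batch costs one submission.

```c
#include <stddef.h>
#include <stdint.h>

enum { PP_CAPACITY = 256 };  /* hypothetical command words per buffer */

struct pingpong {
    uint32_t buf[2][PP_CAPACITY];
    size_t   used[2];       /* words queued in each buffer */
    int      filling;       /* index of the buffer the CPU is writing */
    unsigned submissions;   /* user-to-kernel transitions so far */
};

/* Hand the filled buffer to the accelerator and switch to the other one.
 * Returns the number of words submitted (0 if there was nothing to do). */
size_t pp_flush(struct pingpong *pp)
{
    size_t n = pp->used[pp->filling];

    if (n == 0)
        return 0;
    /* Real code would issue one kernel call here and, before reusing the
     * other buffer, wait until the accelerator has finished with it. */
    pp->submissions++;
    pp->filling ^= 1;
    pp->used[pp->filling] = 0;
    return n;
}

/* Queue one command word in user space; flush only when the buffer is full. */
void pp_emit(struct pingpong *pp, uint32_t word)
{
    if (pp->used[pp->filling] == PP_CAPACITY)
        pp_flush(pp);
    pp->buf[pp->filling][pp->used[pp->filling]++] = word;
}
```

In this model, queuing 300 command words causes a single submission when the first buffer fills, while the remaining words accumulate in the second buffer as the accelerator would be working through the first.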

Multiple APIs and Libraries

I have talked about having multiple APIs. How do you know which particular APIs are present and how to make use of them? How do you avoid a horrible mess where the applications must know all of the APIs?

This is one of the reasons for LibGGI, which consists of a basic stub library and a rather large collection of API libraries that bridge between the various hardware (or software -- LibGGI can also display in an X window) APIs and the LibGGI API. When setting up a mode, LibGGI asks the target (KGI in our case) for a list of the exported APIs, a set of strings that classify how you can access various card features. Figure 2 shows a typical API list. Table 1 explains the meanings of the strings, which are listed in increasing order of precedence.

The libraries are loaded in a way that allows more specific functions to overload the more generic ones, automatically yielding a startup configuration that always uses the best available function. In some cases (as with the ioctl API), these entries can be altered at run time if functions are not available.
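
The overloading described above can be pictured as a table of function pointers that each library fills in turn. This is a sketch under stated assumptions, not LibGGI's actual data structures: the operation names, the op_table and api_lib types, and the sample implementations are all hypothetical. Libraries are applied from most generic to most specific, and a later library replaces only the entries it actually provides.

```c
#include <stddef.h>

/* Hypothetical operation table; LibGGI's real structures differ. */
typedef const char *(*op_fn)(void);

enum { OP_PUTPIXEL, OP_DRAWLINE, OP_FILLBOX, OP_COUNT };

struct op_table {
    op_fn ops[OP_COUNT];
};

struct api_lib {
    const char *name;       /* illustrative label only */
    op_fn ops[OP_COUNT];    /* NULL where the library offers nothing */
};

/* Sample implementations that just report which one was chosen. */
const char *soft_putpixel(void) { return "software putpixel"; }
const char *soft_drawline(void) { return "software drawline"; }
const char *soft_fillbox(void)  { return "software fillbox"; }
const char *accel_fillbox(void) { return "accelerated fillbox"; }

/* Apply libraries in increasing order of precedence: each non-NULL
 * entry in a later library overrides the earlier, more generic one. */
void load_libs(struct op_table *t, const struct api_lib *libs, size_t n)
{
    size_t i;
    int op;

    for (i = 0; i < n; i++)
        for (op = 0; op < OP_COUNT; op++)
            if (libs[i].ops[op] != NULL)
                t->ops[op] = libs[i].ops[op];
}
```

Altering an entry at run time, as mentioned for the ioctl API, then amounts to storing a different pointer in the table.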

One problem remains. LibGGI can only make use of functions that are needed for implementing the LibGGI API. If you look at these functions, you will realize that they account for few of the functions a card can support.

We have decided to keep LibGGI small to save space for simple applications and things like embedded systems. For more complex functions, LibGGI allows the registration of extensions like LibGGI2d and Mesa-GGI, which add support for the APIs necessary for specific tasks.

Implementation Details

Additional goals with the design of KGI included:

Easy driver writing.
Modular design for cards that are made from similar components (S3 cards with different RAMDACs, clocks, and so on are a good example).
A simple way to enhance drivers for fairly compatible future generations of known cards.
Full abstraction from the operating system for easy portability.

These are achieved by using a modular design approach that makes every KGI driver consist of six basic modules:

Chipset module. This controls all functions related to mode setup, CRTC programming, RAM timing characteristics, interfacing RAMDAC and Clock, and so on.
Clock module. This controls the pixel clock generation. This is separated from the chipset driver, as there are cards (S3, for instance) that have the clock as a physically distinct chip, with the different cards made by combining basic chipset, clock, and RAMDAC chip in different ways.
RAMDAC module. The RAMDAC module is similar to the clock module, but controls the RAMDAC features like palette setting, VRAM-bus activation, RAMDAC-internal hardware cursors, Gamma correction, and the like.
Graphics (accelerator) module. Some chips have the acceleration engine detached from the main chipset, or use the same or a very similar acceleration engine on different chipset versions. Thus, separating acceleration programming from the other aspects of the card makes sense (for example, all newer S3 cards can be run with the S3 generic acceleration driver). Not all of the capabilities of very new cards would be used, but driver development is eased quite a bit, since you can try out your new chipset driver without having to write a graphics module.
Monitor module. What features are there in a monitor that will need a driver? At the very least, such things as timing limitations, ensuring that the image is centered on the screen, power-saving capabilities, and more. Being able to use any of these requires some knowledge about what the monitor supports. The monitor driver allows safe access to these features, and automatically chooses suitable timings.
Kernel module. This does the interfacing to the host OS. It implements access methods to the hardware, to PCI services, and so on. In theory, we should be able to run the same KGI driver on different operating systems by just linking with a different kernel module. (We have not yet tried this because we are currently restructuring the Linux console. Porting efforts now would result in a lot of duplicate work.)
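
As a rough illustration of why the split into modules is useful, the sketch below combines three of them to decide whether a requested pixel clock is safe. The structures, field names, and functions are hypothetical and much simpler than anything a real KGI driver would contain. The assumption is that each module knows its own limit, so a card built from a different clock chip or RAMDAC only needs a different descriptor plugged in.

```c
#include <stdint.h>

/* Hypothetical module descriptors; field names are illustrative only. */
struct clock_module   { uint32_t max_khz; };  /* fastest clock it can generate */
struct ramdac_module  { uint32_t max_khz; };  /* fastest clock it tolerates */
struct monitor_module { uint32_t max_khz; };  /* fastest clock it can display */

struct kgi_driver_sketch {
    const struct clock_module   *clock;
    const struct ramdac_module  *ramdac;
    const struct monitor_module *monitor;
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Highest pixel clock that every module in the chain can handle. */
uint32_t safe_pixel_clock_khz(const struct kgi_driver_sketch *d)
{
    return min_u32(d->clock->max_khz,
                   min_u32(d->ramdac->max_khz, d->monitor->max_khz));
}

/* Accept a requested clock only if no module would be overdriven. */
int clock_is_safe(const struct kgi_driver_sketch *d, uint32_t requested_khz)
{
    return requested_khz <= safe_pixel_clock_khz(d);
}
```

A check of this kind, done in the kernel, is what keeps a user program from programming clocks that overload the RAMDAC or monitor.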

Conclusion

What does Linux gain by using KGI? First, the graphics card is handled like any other device, which means that arbitration and access to critical registers occur in one central place -- the kernel.

Second, since the kernel is able to control the graphics card, we have a few new capabilities:

A real Secure Attention Key (SAK) that can kill off graphical applications safely because the kernel itself is able to reset the graphics card to a sane state.
Simple and safe resizing capabilities for VTs. For example, with KGI, you can implement VT100 ESC codes that were impossible to implement without these resizing capabilities.
Support for graphical consoles, thanks to the new EvStack kernel enhancement (see the accompanying text box entitled "The EvStack Kernel Enhancement"). This is immensely desirable for hardware that has no VGA-like text mode or for languages that require the ability to represent more than 256 characters.
The ability to operate the graphics card in MMIO mode, which means that the registers of the card are mapped to a programmable place somewhere in the memory address space, thereby freeing the VGA registers in IO space. As a result, Linux/KGI is multihead capable with cards that support that feature.

Third, together with LibGGI, you have a lightweight, portable, and fast graphics subsystem. (A single-disk demo that uses a mere 700-KB compressed image is available electronically; see "Resource Center," page 3, or my home page at http://sunserver1.rz.uni-duesseldorf.de/~becka/.) This is of special interest for embedded systems, which can now use Linux instead of relatively expensive and less open ("nice README, but where is the source?") solutions like QNX or Windows CE.

Finally, you will no longer have dangerous SUID root graphics applications. The GGI project has developed both a wrapper library that allows most SVGAlib applications to run without root permissions, and a replacement X server called Xggi.

Resources

The GGI homepage (http://www.ggi-project.org/) contains snapshots of the latest source, instructions on how to obtain them via CVS, links to GGI-relevant web sites, and up-to-date information about the project. Our mailing list is hosted at ggi[email protected]. Subscription information is found on the GGI web site. If you plan on subscribing, be prepared -- the list has high traffic.

Acknowledgments

Thanks to the GGI development team, especially Steffen Seeger, Jason McMullan, Emmanuel Marty, Ben Kosse, and Michael Krause for their work and for reviewing this article and correcting several glitches. I'd also like to thank S3, Cyrix, 3Dlabs, and the FLUG for providing the GGI development team with information and donations, and all the users and testers of GGI.
