Rating Real Time: Design Points


Aug01: Embedded Space

Ed is an EE, PE, and author in Poughkeepsie, New York. You can contact him at [email protected].


An automotive design team can produce a Formula One race car or a minivan, depending on the design point they're given. Although either vehicle can fetch groceries, only one handles speed bumps with aplomb. Conversely, power sliding doors won't cut it in the next Grand Prix.

When engineers and programmers sit down to design a project, they have in mind both a typical use and a typical user. That mindset, the project's design point, determines what's included and, perhaps more importantly, what's left out of the final product. Four tires, check. A Formula One passenger seat, nope.

As with automobiles, so too with software. The initial design point determines the project's far-distant future, because fundamental design decisions make some subsequent changes trivial and others exceedingly difficult. Several vendors discovered, to their evident surprise, the complexities of grafting a windowing user interface atop DOS on the good old IBM PC and its successors. Their projects met with mixed success, even if some survived long enough to encounter troubles with their own initial design points.

To get a better idea of why this happens, let's take a look at some design points. We'll begin with the silicon that makes it all possible.

Hardware Matters

At the most basic level, the hardware design point determines what you can accomplish. You could, I suppose, emulate a 32-bit CPU using nothing but the 8085 in that logic analyzer we met last month. Talk about an exercise in futility!

The Transmeta saga provides a contemporary example of how a particular hardware design point plays out. You'll recall that Transmeta began with the intent of building an Intel-compatible CPU based on a VLIW (Very Long Instruction Word) architecture, using firmware to interpret x86 instructions and optimize the resulting sequence of microinstructions (they pronounce this "Code Morphing"). Their goals were to achieve roughly equivalent performance with Intel processors for a given CPU clock frequency, with significantly reduced power consumption in smaller and simpler hardware.

The first version of their Crusoe chip, the TM3200, came as a serious surprise. They evidently based their hardware and firmware architecture on the assumption that most contemporary x86 software favored 32-bit opcodes. That turned out not to be the case for either Windows or its applications (the only code that matters in the consumer marketplace today), which ran with, shall we say, serene dignity.

The uniprocessor variant of Amdahl's Law tells us that when your CPU has a bunch of peppy instructions with a few dogs mixed in, them dogs gonna getcha. For example, if almost all instructions run in one clock cycle, but a few instructions soak up 100 cycles just 1 percent of the time, the average instruction requires two cycles. Those rare instructions add 100 percent to your cycles-per-instruction number and hack your performance in half!

Which is essentially what nailed Transmeta. Despite claiming that they could fix problems and add optimizations just by tweaking the x86 interpreter firmware, their 32-bit-specific hardware design point set the TM3200's performance. The fact that it remains a subpar performer indicates that firmware can only accomplish so much on a given hardware substrate.

Their later TM5400 and TM5600 Crusoe chips embody different implementations of the same VLIW architecture. Despite their hard-won knowledge, the overall performance remains well below expectation: Code Morphing evidently imposes far more overhead than they expected. This implies that the original VLIW design point doesn't match up well with the realities of high-speed x86 instruction execution.

Transmeta is not alone, however. If you take a good, hard look at the CPU chips and cores now promoted for embedded use, you'll recognize quite a few that began life as desktop engines or x86 killers. After their designers discovered how the desktop market values compatibility and performance above all other factors, the chips wound up in the embedded market, where they compete fiercely on speed and power consumption.

Homework assignment: Read up on the history of RISC CPUs, then write an essay describing code compaction and explaining why it took on added significance in this era of unlimited RAM. Extra credit: Chart the history of power conservation measures and explain why "sleep mode" remains so deadly on desktop PCs.

Kernel Concerns

Snuggled up against the hardware lies the operating system. Inside the OS kernel, the code that governs the most fundamental actions of the system, you'll find the original OS design point affecting all subsequent decisions.

When you begin laying out an OS kernel with embedded and real-time capabilities, you inevitably favor the faster over the slower, the specific over the general, and the simple over the complex. The resulting kernel emerges both smaller and simpler than a general-purpose OS, with many functions either moved elsewhere or simply omitted.

You will find that hardware registers and capabilities determine the size of buffers, the length of messages, and the number of status bits available. Fixed arrays and known-size buffers will replace intricate chains of pointers and complex allocation schemes. Nearly always, speed and predictability trump anything else.

In fact, for embedded systems below a certain level of complexity, you may decide you can get along just fine without a formal operating system at all. Interrupts can go directly to the routines that handle them, task dispatching reduces to an endless loop inside main(), and interprocess communication amounts to shared variables. It's been done, even if it's not mentioned in polite company.

Although it's hard to imagine in these days of gigahertz CPUs and gigabyte RAM, many embedded applications must process real-time events using only countable amounts of RAM and glacial clocks. Under those conditions, a true operating system may be an unaffordable luxury. Chilling thought, no?

Because embedded projects span such a range of capabilities, with real-time systems just a subset of those, you'll find suitable OS kernels defined more by what they leave out than by what's included. If the OS design point included everything, removing the kitchen sink may rip out some vital toilet plumbing.

For example, even though your project has no need of a file system, the OS may assume it boots from a disk. Or it may assume virtual memory is always available and always pages to disk. Or it may have another gotcha that's possible to work around after it smacks you upside the head at the least opportune moment.

Better, perhaps, to start with a minimal system and add components in a building-block manner. If you must nit-comb a larger system, you generally wind up removing functions until it stops working, then adding that last hunk back in again.

Or maybe not. It's a matter of matching the OS design point with yours.

Code Distillation

The OS kernel is not, by itself, the entire OS. There are at least three additional layers: Capabilities added to the kernel, interface routines providing access to them, and utilities that make use of everything else. Once again, the original OS design point determines how that code will perform.

A key difference between embedded and desktop developers lies in mindset. When you're writing for a system without all the modern conveniences (no sink, outdoor toilet), you tend to write smaller and tighter code because you know what's vital. Desktop code may not necessarily be bigger and slower, but that's the way to bet.

Knowing that programs have access to essentially unlimited amounts of memory and fast processors definitely simplifies software development at the cost of larger memory footprints, more complex OS internals, and substandard execution on anything less than current hardware. Programs and utilities written for embedded applications tend to assume the converse — limited memory, slower CPUs, and fewer services.

Now, while it's difficult to come up with reliable with-and-without figures for something like an OS kernel, we can compare near-OS code and utilities from the Linux arena. Keep in mind that your mileage will definitely vary!

Consider this data point: The uClibc run-time library sweats a statically linked version of the "Hello world!" Ur-program down to 2 percent of its usual size by omitting features and functions that desktop programmers take for granted and embedded programmers rarely use.

In general, static linking isn't a win because desktop programs assume the standard shared libraries are available through dynamic loading. Unless you omit all those programs, you must keep both the libraries and the OS loading facilities. However, for embedded systems that run only a very few, very carefully controlled programs, static linking pays off. As always, it depends on your system's design point.

The BusyBox project provides a second data point. BusyBox combines the myriad GNU command-line utilities required to actually get something done on a UNIX system into a single executable. That file may be a 400-KB hunk when it's statically linked with uClibc, but it replaces 100-odd separate files, each weighing around 40 KB.

Do the Math!

For both uClibc and BusyBox, the new code's design point included "small and simple" and excluded "general purpose." Combining similar features, reusing code, and eliminating seldom-used functions produced much of the compaction. The ensuing code became smaller and faster, even though it shares a common heritage with the usual sources.

Yes, a different design point can produce strikingly different results!

Reentry Checklist

The Protean nature of software might lead you to believe that mutating a desktop OS into a real-time, embedded contender requires just a few tweaks. With the notion of a design point in mind, next time we'll examine just what's involved.

I'd like more fundamental knowledge of how the Crusoe's hardware and firmware work, but after more than a year, the Transmeta web site (http://www.transmeta.com/) still says they'll make the Crusoe hardware-designer information package more widely available Real Soon Now, and they have not yet announced performance numbers. My opinions are based on experience I've had with VLIW machines and what I've read here and there on the Web.

Linux aficionados can get an idea of what an entirely different OS design point looks like by reading through the QNX manuals and white papers (http://www.qnx.com/). You'll see considerable cross-pollination above the kernel level, but striking differences below that line. As they point out, "POSIX" doesn't necessarily mean "UNIX."

The Linux Documentation Project provides kernel documentation at http://www.linuxdoc.org/. The background info on tweaking and tuning also shows off some of the initial kernel-design points. Of course, you can always peruse the kernel source itself from any distribution or fetch it online from http://www.kernel.org/.

A graph showing the BusyBox executable file size against time appears at http://busybox.lineo.com/. In this case, you can't call it "feature creep" because the design point specifically includes modularity, making it easy to configure exactly what you need and omit features that you won't use.

The uClinux project lives at http://www.uclinux.org/, which leads you to the uClibc code.

Search on +"Seymour Cray" +"virtual memory" for a dissenting voice on the benefits of virtual memory. He was talking about supercomputers, but don't ignore the key point. Great men create pithy sayings; Cray was no exception.

The IEEE provides a remailing service for members, so despite being a member for these many years, the ieee.org domain in my email address doesn't imply I either work for or represent them.

DDJ

