September 22, 2006
Migrating from 8-/16-bit to 32 bit: Lessons Learned the Hard Way

Kavitha Sundaram, Premier Evolvics
Kavitha Sundaram reviews the factors that need to be considered when transitioning from 8-/16-bit MCUs to 32-bit MPUs, highlighting Linux OS portability and the hidden costs involved in such migrations.
Processors are becoming more powerful, both in MIPS and in the bandwidth of the data they can handle. They now integrate most of the peripherals needed to make them resemble a system on a chip (SoC). As this complexity rises, the hardware becomes harder to comprehend in full, and it becomes necessary to work at higher levels of abstraction.

Developers become more dependent on tools, including IDEs, compilers, JTAG debuggers and all that "once fancy" stuff. The level of abstraction also has to rise, because getting a grip on the underlying processor architecture involves a longer and steeper learning curve.

Though it is attractive to jump from an existing 8-/16-bit design capability to a 32-bit architecture for its amazing price-performance figures, there are hidden pitfalls that you may encounter.

This article is excerpted from a paper of the same name presented at the Embedded Systems Conference Boston 2006. Used with permission of the Embedded Systems Conference. For more information, please visit www.embedded.com/esc/boston/

The transition from 8-bit to 16-bit was never clear-cut, since 16-bit architectures never superseded the 8-bit segment. Among other things, 16-bit CPUs are often downsized to handle 8-bit peripherals, which reduces data throughput because of the resulting I/O bottlenecks. There are also 16-bit digital signal processors, which have become more popular than their 16-bit MPU counterparts because signal processing applications need to handle wider data in computation.

We have now suddenly entered an era in which 32-bit processors are cheaper than their 8-bit counterparts. This market push, combined with ever-increasing feature sets, presents the developer with some difficult 32-bit system design choices, both in consumer applications that demand additional features and connectivity and in industrial applications.

There are a few cores that have gained wide use and are licensed across different vendors. Indeed, the ARM7 and ARM9 cores have been called the "8051 core" of this decade because of their popularity and their availability from different vendors.

There are some specific advantages in choosing such widely available architectures. A developer can switch between vendors for an upgrade or for additional features. There is usually not only a roadmap for the hardware from a specific vendor but also a migration path across different vendors.

Another advantage is the skill set available for working with these popular cores. This helps reduce the learning curve and initial startup delays. These skills can also be reused across different applications, so the initial setup costs for the tools can be amortized across the different products under development.

Architecture
In the transition to these more powerful CPUs, the first thing you need to look into is your familiarity with, or basic knowledge of, the 32-bit architecture chosen. It is a misconception to assume that once you have equipped yourself with all the necessary tools you can move forward without a basic knowledge of the architecture.

It is easy to get carried away by statistics on websites showing that a processor family holds a substantial share of the market. An architecture that is widely used in consumer electronics may not be a very good choice for the automobile sector, because requirements differ with the application. And when you delve into specific categories of industrial application, you might even be the first user of a specific peripheral or function.

The functional block diagram of the processor under consideration needs to be closely examined and understood. A 32-bit processor comes with new terminology compared to a 16-bit or an 8-bit processor, so you need to understand the abstract functionality, the constraints and the additional features. There are a host of peripherals that were never part of the 8-/16-bit design scheme. Security is an important issue even in portable and embedded devices, and media-related extensions, Internet connectivity and power management have also become important considerations.

When you choose a processor architecture, you should compare its features against those of the competing architectures and arrive at a decision based on your application requirements. Both the functional and non-functional requirements have to be addressed, and the supplementary specifications also need to be considered.

It is easy to be misled by the MIPS value or by industry-standard benchmarks for specific algorithms. Some cores are optimized for specific algorithms through their hardware, pipeline or cache mechanisms. So you may be surprised to find, after implementation, that the processor cannot meet the execution times the application demands. By then it is too late, and changing the processor is disastrous even if there is an upgrade path. It is therefore essential to identify the key functions the application needs to handle and to check the processor and its peripheral features against them.

ARM has evolved into a variety of cores, with the architectures currently available ranging from ARMv4 at the lower end to ARMv7. The 32-bit instruction set architecture, operating in a 32-bit address space, has been central to all the cores. The 16-bit Thumb instruction set was added to improve code density, since full 32-bit instructions are not always required. The right mix of 32-bit and 16-bit instructions, balancing code size against execution speed, has been the key to success in Thumb implementations. To cater to signal processing needs, the relevant signal processing instructions have been added. Later versions support Java and multiprocessing.

All these upgrades are driven by industry needs and requirements. Because of the convergence of applications, the mobile device is evolving much as the personal computer did in previous decades. The silicon needs to support the new challenges ahead while portability and power consumption remain the driving features. Different variants of the ARM cores have emerged targeting specific segments. ARM cores are also widely used as one of the cores in dual-core devices such as the OMAP from Texas Instruments. Dual cores can offer the benefits of both individual cores without compromise.

Development Boards
Once you have shortlisted candidate architectures for your application, you need to look into the development boards that the vendors offer. Be very careful about the features of the development board and the deliverables that come with its purchase. You could easily end up with the board alone, a few example programs and a monitor, and nothing else.

These boards can cost a few thousand dollars, and on receiving one you may be surprised to find that you need to buy the operating system, the compiler and the debuggers separately. This can involve another substantial investment, without which you will not even be able to start development.

Development boards can have technical issues that were not anticipated, either because of a lack of information or because the board was pushed to market to cash in on the early-bird advantage. These issues can be software related or hardware related. Some of the peripherals might not work in specific modes, or a power supply regulator might not be properly heat-sinked.

If the development board is made by a third party, you may need to shuttle between the vendor and the third party, only to finally get a mail stating what the issue might be and that the fix is being worked into the next hardware version of the board. You will be lucky if the issues are only software related, since you might then get a patch to rectify them.

Any timeline that is predicated on a ready-to-use development board normally turns out to be twice as long as anticipated. If the architecture or family is very mature this overrun could be lower, but there will still be surprises every now and then. The design team will have planned to execute and test most of the code on the development board as it is. In many situations, however, you need to build additional hardware or adapt your application to what is available on the development board. These activities are the main contributors to the time and resource overrun.

The other benefit of buying a development board is the board support package (BSP), which contains ready-to-use code and binaries cross-compiled for the specific platform. This definitely gives you an edge over starting everything from scratch. Embedded Linux has become possible because Linux can be stripped of its extra flesh and adapted to a smaller footprint.

Some of the device drivers are supplied as part of the board support package, and many variants of them are available in the Linux development community. The generic drivers are readily available and do not have serious implementation issues. It is also possible that in some instances a device driver is available for Windows but not for Linux.

Touch Screen Controller Drivers
Considerable effort needs to be expended to develop such drivers. Developing and testing a touch screen controller driver, including the necessary calibration routines, might consume as much as a man-month.

The other time-consuming issue is serial protocols. Linux provides very good support for many communication protocols, but this applies to standard protocols, and proprietary protocols may be difficult to implement. You will also come across proprietary protocols that use a 9-bit addressing feature on multi-drop RS-485 connections.

If you need to support this, you need to modify the kernel and the device driver, provided the processor supports this mode. Otherwise you might have to design a wrapper with an interim repeater that can seamlessly convert 9-bit into 8-bit words if the other communication partner is a legacy system. Similarly, for half-duplex RS-485 the direction pin control has to be specifically addressed, since most serial drivers are written for full-duplex RS-232 or RS-422.
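To make the repeater idea concrete, here is a minimal sketch of one way 9-bit multi-drop words could be carried over an 8-bit link. The framing is purely hypothetical: the marker values `ADDR_MARK` and `ESCAPE` and both function names are assumptions made for illustration, and a real proprietary protocol will define its own rules.

```python
# Hypothetical framing for carrying 9-bit multi-drop words over an 8-bit link.
# The marker values below are illustrative assumptions, not part of any standard.
ADDR_MARK = 0xFE  # precedes a byte whose 9th (address) bit was set
ESCAPE = 0xFD     # precedes a data byte that collides with a marker value


def encode_9bit(words):
    """Convert 9-bit words (0..0x1FF) into an 8-bit byte stream."""
    out = bytearray()
    for word in words:
        if not 0 <= word <= 0x1FF:
            raise ValueError("word does not fit in 9 bits")
        value = word & 0xFF
        if word & 0x100:
            # 9th bit set: an address word, so prefix it with the address marker
            out.append(ADDR_MARK)
            out.append(value)
        elif value in (ADDR_MARK, ESCAPE):
            # plain data byte that happens to look like a marker: escape it
            out.append(ESCAPE)
            out.append(value)
        else:
            out.append(value)
    return bytes(out)


def decode_9bit(stream):
    """Inverse of encode_9bit: rebuild the 9-bit words from the byte stream."""
    words = []
    it = iter(stream)
    for byte in it:
        if byte == ADDR_MARK:
            words.append(0x100 | next(it))  # restore the 9th bit
        elif byte == ESCAPE:
            words.append(next(it))          # escaped data byte, taken literally
        else:
            words.append(byte)
    return words
```

The cost of such a scheme is that the 8-bit stream is longer than the 9-bit one whenever an address or a colliding data byte occurs, so the repeater needs buffering and the link timing has to allow for it.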

Handling parity bits, multiple checksums and serial timeouts can also be time consuming. Some of these issues might look simple and straightforward but can take anywhere between 10 and 15 developer days to implement and test.
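As an illustration of the kind of per-frame checks involved, the following sketch computes an even parity bit and a simple 8-bit additive checksum. The frame layout (payload followed by a single checksum byte) and the function names are assumptions for the example; actual protocols often use different or multiple checks.

```python
def even_parity_bit(byte):
    """Return the bit that makes the total number of 1 bits even."""
    return bin(byte & 0xFF).count("1") % 2


def additive_checksum(payload):
    """8-bit two's-complement checksum: payload plus checksum sums to 0 mod 256."""
    return (-sum(payload)) & 0xFF


def frame_is_valid(frame):
    """Check a frame made of payload bytes followed by one checksum byte."""
    return len(frame) >= 2 and (sum(frame) & 0xFF) == 0


payload = bytes([0x01, 0x02, 0x03])
frame = payload + bytes([additive_checksum(payload)])
print(frame.hex(), frame_is_valid(frame))
```

Much of the quoted effort typically goes not into the arithmetic but into testing the failure paths, such as corrupted bytes, short frames and timeouts.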

Reference Designs
When you move to your own design after the hands-on trials with the development board, you will often need to rely on reference designs, either from the CPU vendor or supplied with the development board. Often the reference design is just the schematic of the development board itself. In this case you can be confident that you have a working model and can proceed with the reference design without much change. Of course, you need to be in a position to tailor the design to your requirements.

In some cases you will not have a tangible reference design from a vendor, and the design might instead come from an open community or from work groups. In these cases you need to check the schematic as a whole to ensure that it will work in the mode for which it was designed. You need to look into the datasheets of the peripherals, and at the voltage and timing specifications in particular.

Since most high-speed cores work at lower voltages while the external world still works at higher voltages, the conversion and interpretation of these signals becomes critical. Most reference designs do not consider real-world or implementation issues. Reference designs should be taken as just that, a reference, and not used as they stand.

They provide a quick start, but you need to ensure that everything has been addressed for your application requirements. Sometimes a part in the reference design is obsolete or has been phased out. In this case you need to work out an equivalent, either from the same vendor or from a different one, and update the schematic. Care should also be taken over additional buffering and sometimes over the glue logic. The reset modes in a reference design could also be different, and it may not be possible to implement them in the final design.

For some peripherals you need to establish the end requirement first before deciding whether to retain the peripheral schematic. To illustrate, in applications requiring TFT LCDs you need to look into the controller IC, with or without EDO RAM. Before you decide on the peripheral you need to establish the diagonal size of the LCD and also its pixel density. If these vary, or if there is a possibility of upgrading the features at a later date, you need to look for hardware that is compatible across the choices.
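A quick calculation shows why the panel choice drives the controller and memory decision. This is a rough sketch only: the resolutions, the 16 bits per pixel and the 60 Hz refresh rate are example values chosen for illustration, not figures from any particular design.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Memory needed to hold one full frame, in bytes."""
    return width * height * bits_per_pixel // 8


def refresh_bandwidth(width, height, bits_per_pixel, refresh_hz):
    """Bytes per second the display controller must fetch to refresh the panel."""
    return framebuffer_bytes(width, height, bits_per_pixel) * refresh_hz


# Example panel sizes only; substitute the candidates for your own product.
for name, (w, h) in {"QVGA": (320, 240), "VGA": (640, 480)}.items():
    size = framebuffer_bytes(w, h, 16)
    rate = refresh_bandwidth(w, h, 16, 60)
    print(f"{name}: {size} bytes per frame, {rate} bytes/s at 60 Hz")
```

Moving from QVGA to VGA at the same color depth quadruples both numbers, which is the sort of jump that can invalidate a controller or memory choice made for the smaller panel.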

Peripheral Modes
Peripheral examples typically incorporate only the features needed to demonstrate that the peripheral works in a particular mode. They do not give you options to check all the modes in which the peripheral can function. When you require a unique functionality from a peripheral, you need to narrow down and check whether the necessary interfaces and modes have been designed in or whether you need to develop them afresh. Sometimes the mode is passed as a function parameter so that the calling function can select it dynamically.

When you have a user interface system designed for a typical user, input is normally handled by a keyboard, usually the standard keyboard. In an industrial application you may have to restrict the input to a limited number of keys. Then the difficulties start.

You may need to change the navigation scheme itself, which can take considerable effort compared to writing the application. The TAB key is widely used in menu navigation, so you may need to change this in the key handling routine. Because of nested calls it is sometimes very difficult to track the program sequence, and hence very difficult to change it.

Some other settings, such as the baud rate, may have to be changed manually by the user at run time, which invites additional effort. Changing from master to slave, or incorporating multi-master operation at run time for specific communication peripherals, will require modification of the device drivers.

Typical examples of peripheral behavior requirements are (1) handling real-time clocks and their possible synchronization across a multi-node network, (2) invoking the power-down and idle modes of the processor during a power failure, or (3) switching between memory mode and I/O mode in a Compact Flash.

There might be other unique requirements as well, such as accessing a configuration memory like an EEPROM in a predefined sequence or switching off the LCD backlight when it is not in use. These are also specific functions that need to be implemented individually.

If your design includes an ADC, there could be plenty of modes, including a sequencer, simultaneous conversion, oversampling and averaging, and auto conversion on triggers, all of which need to be handled in the device drivers and the application code.
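Oversampling and averaging is one of the modes that often ends up in application code when the driver does not provide it. The following is a minimal sketch of the averaging step, assuming the raw readings arrive as a list of integers; the function name and the choice to discard an incomplete final group are illustrative assumptions.

```python
def oversample_and_average(samples, factor):
    """Average each consecutive group of `factor` raw readings into one value."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    usable = len(samples) - len(samples) % factor  # drop an incomplete tail group
    return [
        sum(samples[i:i + factor]) // factor
        for i in range(0, usable, factor)
    ]


raw = [100, 102, 98, 100, 200, 204, 196, 200, 7]
print(oversample_and_average(raw, 4))
```

The trade-off is that the effective output rate drops by the oversampling factor, so the ADC trigger rate has to be raised accordingly if the application still needs the original update rate.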

Boot Modes
You need to look into the boot modes the processor supports and decide which one is applicable. Most industrial applications require auto booting to avoid manual intervention, and this needs to be designed in.

In some systems you may have to initiate the auto booting sequence manually the first time, using a debug port or its equivalent. The hardware needed for this therefore cannot be left out even of the production units. Also, the application and OS sometimes have to be loaded by a different mechanism from the initial boot program.

For example, you might need Ethernet connectivity for loading, which might not be required in the end application. It might also be possible to boot the system from a network server. There is no rule of thumb for which mode to choose. You need to look into the options and decide which is suitable for the application. A repeater that is connected to and communicates with a server, for instance, can be booted from that same network server.

The boot modes and the procedure also change substantially across processor architectures. In some cases the processor has a boot program that takes care of these issues automatically, but there might be one or two pins brought out for the user to choose among the various options. There might be a default mode that is not favorable for the application, so you need to figure out how to invoke the boot mode you want. The reference design might use a different boot mode, so you need to be cautious about your selection and its implementation issues.

Flash Memory
When you work with self contained 8 and 16 bit processors you are not really worried about the Flash or the RAM. If you need to connect either the Program or the Data Memory the chip should have the necessary External Interface options. In these architectures the execution of the Program is directly from the Flash though sometimes you are given an option to execute a part of the code from RAM. This is normally in Digital Signal Processors where some specific critical code could be run from the RAM to speed up the execution cycle.

But in most 32-bit processors the program is normally moved from the flash to RAM after the initial boot. The RAM therefore needs to be fast and needs to be synchronized with the processor, so synchronous DRAMs (SDRAMs) have to be interfaced to the processor for executing the program.

You also need to decide whether to use native NAND/NOR flash or to additionally provide a Compact Flash (CF) slot. You can also have a hard disk or a Disk on Chip. The choice is based on the program size, inclusive of the application and the operating system. When you are not sure of the final size of the image you are going to generate, it is advisable to have a CF slot option.

You should also look into the time it takes to upgrade the software in any of the flash memories described above. It can be surprisingly long, as much as 30 to 60 minutes to program a 16MB flash memory. So for an industrial application, if you anticipate field upgrades in the initial phases of deployment, it is better to have a CF slot so that you can replace the card with one carrying the new software rather than reprogram in the field. After the application stabilizes and is frozen, you could switch from Compact Flash to the program flash IC. This will save costly downtime and give you flexibility.
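The programming times quoted above translate into a very low effective throughput, which is worth working out before committing to in-field reprogramming. The sketch below does the arithmetic, assuming 16MB means 16 MiB; the function names are illustrative, and the rates are effective end-to-end figures rather than raw flash write speeds.

```python
def flash_throughput_kib_s(size_mib, minutes):
    """Effective programming rate in KiB/s for a given image size and duration."""
    return size_mib * 1024 / (minutes * 60)


def upgrade_minutes(image_mib, throughput_kib_s):
    """Time in minutes to program an image at a given effective rate."""
    return image_mib * 1024 / throughput_kib_s / 60


for minutes in (30, 60):
    rate = flash_throughput_kib_s(16, minutes)
    print(f"16 MiB in {minutes} min is about {rate:.1f} KiB/s")
```

At these rates the download path is usually the bottleneck, so measuring the actual rate on your own link early gives a realistic estimate of field downtime per unit.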

Operating Systems
The next issue when you move from 8-/16-bit processors to 32-bit is the need for an operating system. On smaller devices you could have your own proprietary operating system, because it is little more than a sequence of instructions to the CPU and code controlling the peripherals.

When you migrate to 32-bit, you need to give due consideration to using an existing operating system rather than developing your own. If the business segment dictates volumes and you would not like to end up with royalty issues, you can consider developing your own operating system. But for a smaller project it is more or less re-inventing the wheel. It also brings in new issues of compatibility, upgrades, maintenance and optimization.

So when you decide to go in for an operating system that is commercially available off the shelf, the choices among the competing platforms have to be considered. Should you go in for Windows or for Linux? The choice depends on many factors, including the skill set available, the development cost, the unit cost and the timeline. If a royalty overhead on every shipped device is not acceptable, you need to look for a royalty-free OS.

Windows. There are still some ardent lovers of Windows who feel that open source is jinxed and that, to keep strict delivery schedules, Windows CE, the compact edition of Windows, is the better choice. Microsoft keeps promoting Windows Embedded to contain the growth of embedded Linux. Windows CE is still the choice if the application is graphics intensive, and development time for such applications is considerably lower on Windows CE. To address market needs, Microsoft offers different embedded editions: Windows CE, Windows XP Embedded and Windows Embedded for Point of Service (WEPOS). Microsoft details the deliverables and suggests a choice based on the application.

Linux. The use of Linux in consumer and industrial embedded applications is growing at a fast pace. The first attraction for the developer is that it is free and that access to the source code is provided. You might be able to make a quicker start, since everything can be downloaded from the Internet. Initially it is amazing to look at the widespread Linux community and its exchange of information. The forums, user groups and suggestions are really overwhelming. But when you move from the generic to the specific, the brakes come on and you need to shift to a lower gear.

The second important consideration is the reliability of the Linux system, which is essential in embedded applications. Linux, being powered by the open source community, is well maintained and its integrity is assured. The argument is that developers put in their best efforts when the work is left to their choice rather than being part of their employment objectives. Fixes are handled by the open source community and made available at the earliest opportunity. This is in contrast to the branded versions, where you need to wait for fixes and hope that they are free of charge.

Linux Kernel. The Linux kernel is robust and can be compiled to a very attractive footprint of less than 1MB. The kernel includes the memory manager and the process scheduler. The memory manager takes care of secure memory sharing and management across different programs. The process scheduler allots sufficient CPU time to processes. The kernel also incorporates file system support, while the shell and any windowing system run on top of it. When you compile natively everything looks fine and you are unlikely to face setbacks.

When you need to cross-compile for your target platform, you start encountering problems. Some of them can be solved directly with some additional effort, with reference to examples and illustrations, or through specific queries or FAQs on the vendor site. ARM and Linux go well together, but you still need to rely on the board support packages delivered with your development system purchase.

Both the stable and the development versions of the Linux kernel are available for download from the Internet. The stable series are even numbered, like 2.4 and 2.6, while the development series are odd numbered, like 2.5. It is always advisable to use stable versions, especially for embedded systems. The kernel version also includes a patch level to indicate revision status, as in 2.4.19, with a higher number indicating a later revision.
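The even/odd convention described above is simple enough to check in a build script. This is a small sketch that applies only to the 2.x numbering scheme discussed here and handles plain numeric version strings; the function name and the returned dictionary layout are assumptions for illustration.

```python
def classify_kernel_version(version):
    """Classify a 2.x-era kernel version string as 'stable' or 'development'."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError("expected at least major.minor")
    major, minor = int(parts[0]), int(parts[1])
    patch = int(parts[2]) if len(parts) > 2 else None
    # Even minor number: stable series. Odd minor number: development series.
    series = "stable" if minor % 2 == 0 else "development"
    return {"major": major, "minor": minor, "patch": patch, "series": series}


for v in ("2.4.19", "2.5", "2.6.17"):
    print(v, classify_kernel_version(v)["series"])
```

A check like this can guard an embedded build against accidentally picking up a development-series source tree.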

The Linux kernel is monolithic in nature meaning that all the core functions and the device drivers are a part of the kernel. The functions inside the kernel are invoked by system calls from the application. The kernel handles all interrupts and exceptions. The kernel also is responsible for switching between tasks and hence is the multi tasking master.

Building the kernel involves a minimum of three steps: building the dependencies, the image and the modules. You can either compile natively or cross-compile, based on your requirement. If the kernel cross-compiles successfully, it then needs to be downloaded to the target for execution and verification. It is always advisable to configure the kernel and cross-compile with TCP/IP support enabled.

GUI Applications
User Interfaces are typically devices with input and output capability and with mild time constraints. The output capability is determined by the LCD screen size and the actual contents displayed and refreshed.

If you have an application with a user interface, you need to look into the graphical windowing capabilities. Compared to Windows, you will still find the options to be slightly inferior. You have choices of GUI, but you must also consider whether to pursue one that is royalty free like the OS. When you develop an application and start cross-compiling, you again end up with enormous RAM disk images.

You need to optimize GUI applications by isolating the libraries you do not intend to use in the application. This requires a lot of effort from the design team, since the documentation is not clear and explicit. You need to adopt trial-and-error techniques to achieve your goal, which means several iterations before you are finished. The fonts, Unicode text formats and unused libraries need to be stripped. These can sometimes amount to as much as 35MB, which could be twice the size of the GUI application itself.
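A simple size report over the staged root filesystem is one way to decide where to start stripping. The sketch below lists the largest regular files under a directory; the function name and the idea of running it on a staging directory before the RAM disk image is built are assumptions for illustration, not a description of any particular tool.

```python
import os


def largest_files(root, count=10):
    """Return (size_in_bytes, relative_path) for the biggest regular files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                # Skip symbolic links so a library is not counted twice
                continue
            found.append((os.path.getsize(path), os.path.relpath(path, root)))
    return sorted(found, reverse=True)[:count]


if __name__ == "__main__":
    # "." is a placeholder; point this at your own staging directory.
    for size, path in largest_files("."):
        print(f"{size:>12}  {path}")
```

Fonts and locale data tend to show up near the top of such a list, which matches the items named above as candidates for removal.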

GUI applications hence need to be carefully designed, since the screens and the navigation create the first impression of the product. Sluggish navigation will spoil the credibility of the product. It is necessary to look at the real estate of the screen and to populate it with parameters in a way that allows reasonable execution and refresh rates. Care should also be taken to incorporate the right mix of text and graphics.

Linux Distributions
A lot of development effort is spent on setting up the development environment, on optimization, and on the build and rebuild process. So you may be tempted to get away from these chores by buying a distribution off the shelf. There is no rule of thumb for the decision to purchase a distribution. The embedded versions involve a substantial investment.

Also to be considered is whether the distribution supports a broad or a narrow range of targets. If you are used to one Linux distribution and then switch targets, you may need to invest again. Distributions definitely help in structured optimization, but you become dependent on or tied to the distributor's packages. Flexibility in configuration and automation of the process are the additional benefits you derive from purchasing a distribution. The documentation they provide also helps you understand the system better.

It may well happen that, after trying your hand at the open source model and failing to meet deadlines, you are compelled to try a reputable distribution to speed up the whole process. In this case the project is delayed and additional investment is incurred at the tail end of the project. So it is better to look at the project schedule and make investment decisions early in the product life cycle, so that you can comfortably adhere to the schedule.

Conclusion
We have covered a few significant pitfalls that can easily be avoided with careful consideration. The move to 32-bit processors involves a lot of managerial and engineering decisions, which need to be made promptly and correctly. Some of these facts have to be learned the hard way and preserved as lessons learnt.

Kavitha Sundaram is manager of research and development and head of firmware development at Premier Evolvics Pvt Ltd. in Coimbatore, India.
