Reliability & Embedded Networks

By H. Thomas Richter, August 01, 2000

Reliability requirements for embedded networks are more critical than with nonembedded networks. Thomas presents a new method for ensuring reliable communication between hosts in small networked environments.

Thomas is a systems programmer at IBM Böblingen (Germany). He can be contacted at [email protected].

There's no question that reliable communication is a fundamental requirement of all networks. Nor is the situation any different when it comes to networks with embedded processors. In fact, the reliability requirements for embedded networks, where a human operator isn't available to monitor the system or a life-threatening situation can occur if the network crashes, can be even more critical than those for nonembedded networks. In this article, I present a new method for ensuring reliable communication between hosts in small networked environments. I am currently using this method on a system based on an IBM PowerPC 403GCX and two National Semiconductor NS 83905 10-Mbps Ethernet chips running on OS Open, which is a real-time operating system from IBM. Applications use the BSD4.3.2 Reno TCP/IP implementation for communication. No application change whatsoever is needed to use the new features introduced in this article. The switch of the physical network is transparent to the TCP/IP stack built into the operating system and is handled by the Ethernet device driver. Since this is a software solution, all existing hardware can be reused. The complete source code -- enetlib.h, enetswap.h, and enetlib.c -- for the device driver is available electronically; see "Resource Center," page 5.

When implementing the techniques presented here, I had three general design goals:

No changes to the application code. Applications should run unchanged on this network infrastructure and remain unaware of faulty interfaces or networks.
No (or at least minimal) code changes to the TCP/IP implementation.
No disruption and no loss of data for existing TCP connections.

More specifically, two innovations made it possible for me to achieve these design goals and still maintain a high level of reliability:

A new device-driver structure, in which one device driver handles both interfaces as one single unit and hides one interface from the TCP/IP layer.
The development of the swap message as a new message type beside IP and ARP packets.

For instance, Figure 1 shows two separate Ethernet networks. (Of course, the network can connect more than two systems.) If a network hardware problem on one communication path (a broken cable, bad Ethernet controller, faulty hub, or the like) occurs, the other network path can be switched to very quickly, without loss of any data. In such instances, the software chooses an alternate path to the destination with minimal delay and disruption.

Hosts with several network interface devices are called "multihomed." Each network device operates independently and has its own IP address and network mask. Routing tables decide which interface to use to reach the destination. When one interface goes out of service, all hosts on the network attached to that interface become unreachable. Routing information can be dynamically generated and propagated throughout the network to tell each host that a host or network just went out of service. RIP or RIP-2 (see IP Routing Fundamentals, by Mark Sportack, Cisco Press, 1999) and the routed daemon (see TCP/IP Illustrated, Volume I, by Richard Stevens, Addison-Wesley, 1994) are well-known protocols and programs for this task. However, it takes time to notify all hosts of the new situation, and such overhead is unnecessary in a LAN environment. My approach eliminates this time lag and is designed specifically for the needs of embedded systems.

Ethernet Device-Driver Modifications

To handle both network interfaces, I modified the Ethernet device driver. Because the embedded systems have no local disk, the initial program is loaded via bootstrap and FTP, which are programmed into flash memory. Each embedded system receives its program and IP address from a control station, which acts as the boot server and is connected to both networks. The embedded system then initializes both Ethernet interfaces for send/receive. However, only one (Device 0, for example) is actually attached to the TCP/IP stack and used for communication. This means only one TCP/IP interface is up and running, and a single IP address and network mask serve both Ethernet devices.

Since all embedded systems in this kind of network are built the same way using core+asic chips, Device 0 always refers to the same network interface; in this case, Network 1. After loading the software and initializing the hardware components, the embedded systems now communicate with each other and the control station over Network 1, the primary network. Network 2 is referred to as the "backup network." As long as there are no network failures on Network 1, the device driver handles Ethernet traffic like any other network device driver.

To detect physical network problems, the device driver monitors the connection between each network interface and the hub. (I used Ethernet 10BaseT with an RJ45 connector, but almost any device offers hardware support for querying the state of the connection to the hub.) If the physical connection goes down, the device driver acts depending on which network has failed:

If the failing connection belongs to the backup network, no action is needed. The communication continues to use the primary network.
If the failing connection belongs to the primary network, immediate action is required. Communication continues on the backup network and the embedded system notifies the control station to record the error.

If any device driver detects a broken physical link from its primary network interface device to the hub, the backup network is made the new primary network. The device driver that detects the fault broadcasts a special message, called a "swap message," on the backup network. (The backup device was initialized at startup time and is fully operational.) The broadcast of the swap message notifies all other device drivers in the network to shut down the primary network and use the backup network from this point forward. A driver switches immediately after it receives a swap message. The swap message consists of:

A 6-byte Ethernet destination address. Since the message is sent via broadcast, all bits in the field are set to 1.
A 6-byte Ethernet source address. This is the MAC layer address of the sending Ethernet device. This address is unique for each device.
A 2-byte Ethernet type field with the value of 0x0805. Other valid values are 0x0800 for IP packets and 0x0806 for ARP packets.
A 1-byte command field with the value of 1 for SWAP_NETWORK.
A 1-byte field that denotes the Ethernet device number to use from this point on. Since each embedded system has only two built-in Ethernet devices, the value can only be 0 or 1.
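The field layout above can be sketched in C as follows. This is a minimal illustration, not the code from enetlib.c: the names swap_frame_build and swap_frame_parse are hypothetical, and padding the frame to the 60-byte Ethernet minimum is an assumption (many controllers pad short frames themselves).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN        6       /* length of a MAC address */
#define ETH_HDR_LEN     14      /* destination + source + type */
#define ETH_MIN_FRAME   60      /* minimum frame length without FCS (assumed padding) */
#define ETHERTYPE_SWAP  0x0805  /* type value given in the article */
#define SWAP_NETWORK    1       /* command value given in the article */

/* Build a swap message in buf (at least ETH_MIN_FRAME bytes long).
 * Returns the number of bytes to transmit. Hypothetical helper. */
size_t swap_frame_build(uint8_t *buf, const uint8_t src_mac[ETH_ALEN],
                        uint8_t new_dev)
{
    assert(new_dev <= 1);                       /* only Device 0 or 1 exist */
    memset(buf, 0, ETH_MIN_FRAME);              /* zero the padding */
    memset(buf, 0xFF, ETH_ALEN);                /* broadcast destination */
    memcpy(buf + ETH_ALEN, src_mac, ETH_ALEN);  /* sender's MAC address */
    buf[12] = (uint8_t)(ETHERTYPE_SWAP >> 8);   /* type, network byte order */
    buf[13] = (uint8_t)(ETHERTYPE_SWAP & 0xFF);
    buf[14] = SWAP_NETWORK;                     /* command field */
    buf[15] = new_dev;                          /* device to use from now on */
    return ETH_MIN_FRAME;
}

/* Return the device number carried by a valid swap message, or -1 if the
 * frame is not a well-formed swap message. Hypothetical helper. */
int swap_frame_parse(const uint8_t *buf, size_t len)
{
    if (len < ETH_HDR_LEN + 2)
        return -1;
    uint16_t type = (uint16_t)((buf[12] << 8) | buf[13]);
    if (type != ETHERTYPE_SWAP || buf[14] != SWAP_NETWORK || buf[15] > 1)
        return -1;
    return buf[15];
}
```

Writing the fields byte by byte, rather than overlaying a packed struct, sidesteps compiler padding and byte-order questions on the target CPU.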

The swap message is an OSI Layer 3 protocol. No provisions are made to retransmit or acknowledge swap messages. Retransmission of swap messages is not considered important because the project uses only a small LAN contained in a single segment. Acknowledgment of swap messages is difficult to implement and causes a lot of overhead because the number of hosts in a network may vary. If a device driver misses a swap message, its host keeps using the failed network and becomes unreachable.

If the swap message comes in on the primary network, it is quietly discarded. Why? Because this embedded system has already swapped networks. Conversely, any incoming message on the backup network that is not a swap message is ignored.
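The receive-side filtering rules can be summarized in a small classification function. This is a sketch under assumptions: the enum and the function name rx_classify are hypothetical, and only the three type values named in the article are handled.

```c
#include <assert.h>
#include <stdint.h>

/* What the driver does with a received frame (hypothetical names). */
enum rx_action { RX_DROP, RX_TO_IP, RX_TO_ARP, RX_DO_SWAP };

/* rx_dev:     device the frame arrived on (0 or 1)
 * active_dev: device currently attached to the TCP/IP stack
 * ethertype:  type field of the frame, in host byte order */
enum rx_action rx_classify(int rx_dev, int active_dev, uint16_t ethertype)
{
    assert(rx_dev == 0 || rx_dev == 1);
    if (rx_dev == active_dev) {
        switch (ethertype) {
        case 0x0800: return RX_TO_IP;   /* IP packet */
        case 0x0806: return RX_TO_ARP;  /* ARP packet */
        default:     return RX_DROP;    /* includes swap messages */
        }
    }
    /* Passive device: only swap messages are of interest. */
    return ethertype == 0x0805 ? RX_DO_SWAP : RX_DROP;
}
```

Because a swap message that arrives on the active device is dropped, repeated swap messages are harmless once a host has switched.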

The way the physical state of the connection is checked depends on the Ethernet device. Some devices generate an interrupt when the carrier signal drops. Other devices require a periodic check in very short intervals (every 100 milliseconds, for example) of corresponding hardware registers.
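For devices that need polling, the check amounts to edge detection on the link status. The following is a minimal sketch; it assumes the caller reads the hardware registers (which are device specific and not shown) and passes in the current state of both links. The names link_monitor and link_poll are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DEVS 2  /* each embedded system has two Ethernet devices */

/* Link state remembered from the previous poll (hypothetical type). */
struct link_monitor {
    bool was_up[NUM_DEVS];
};

/* Call once per polling interval with the link state just read from the
 * hardware. Returns a bit mask of devices whose link went from up to
 * down since the previous call (bit 0 = Device 0, bit 1 = Device 1). */
unsigned link_poll(struct link_monitor *m, const bool up_now[NUM_DEVS])
{
    assert(m != NULL && up_now != NULL);
    unsigned lost = 0;
    for (int dev = 0; dev < NUM_DEVS; dev++) {
        if (m->was_up[dev] && !up_now[dev])
            lost |= 1u << dev;          /* report the falling edge once */
        m->was_up[dev] = up_now[dev];
    }
    return lost;
}
```

Reporting only the transition, not the steady down state, means the driver reacts once per failure instead of on every poll.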

The control station can record the network switch and trigger repair actions to replace the faulty part.

While programs use IP addresses to send data to the destination, the Ethernet devices use unique MAC addresses. The Address Resolution Protocol (ARP) on the device-driver level is used to find out which machine owns a given IP address and what that machine's MAC address is. Once a mapping from IP address to MAC address has been determined, it is stored by the TCP/IP software. ARP is a fundamental part of the TCP/IP protocol suite and has been explained by both Richard Stevens and Gary Wright in TCP/IP Illustrated, Volume II (Addison-Wesley, 1995) and Douglas Comer in Internetworking with TCP/IP, Volume I (Prentice Hall, 1986).

After a device driver has either sent or received the swap message, its host uses the alternate (backup) Ethernet device, which has a different MAC address. Therefore, all accumulated IP-to-MAC address translations are now invalid, and the ARP table is completely flushed as soon as a swap message is sent or received on the backup network.

The procedure to swap physical networks is:

1. After boot and initialization, each embedded system uses Device 0 and Network 1. The embedded system's IP address and network mask, assigned during the boot phase, are bound to Device 0. Device 1 is initialized and ready to use. Device 0 ignores incoming swap messages and passes incoming packets to the TCP/IP stack or to ARP processing, depending on the packet type. Device 1 ignores all packets except for swap messages; it is the passive or quiet interface (see Figure 2).

2. After some time, the Ethernet device driver on embedded system PPC 1 detects a failing link between its active network device and the hub. It broadcasts a swap message on the passive Ethernet Device 1 and flushes the ARP table. Device 0 is now dead: The device driver ignores all types of packets coming in on this device and does not send any packets on it.

The Ethernet device driver on embedded system PPC 2 receives the swap message on Device 1. It also flushes the ARP tables, ignores all types of packets coming in on Device 0, and does not send any packet on Device 0; see Figure 3.

3. If a TCP connection was active during the network switch, the sending side (PPC 1, for example) waits for an acknowledgment (ACK) packet. If this packet does not arrive in time, the sender retransmits the data. The device driver now sends this packet over Device 1. Because the ARP table has been flushed, the MAC address of the destination is not available, so an ARP request packet is sent over Network 2 to learn the MAC address of the now-active Ethernet Device 1 on target system PPC 2.

The receiver acts in the same way. It has to acknowledge received data and sends ACK packets. The device driver has to send these packets over a new device, Ethernet Device 1. Because of the flushed ARP table, an ARP request is generated to learn the recipient's MAC address.
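The three steps can be condensed into a small per-host state model. This is an illustrative sketch, not the code from enetlib.c: struct host, host_link_down, and host_rx_swap are hypothetical names, and the ARP table is reduced to a count of cached entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-host driver state (hypothetical model). */
struct host {
    int active_dev;   /* device attached to TCP/IP: 0 or 1 */
    int arp_entries;  /* stand-in for the ARP table: cached mappings */
};

/* Make new_dev the active device and flush the ARP table.
 * Returns true if a switch actually took place. */
static bool host_switch(struct host *h, int new_dev)
{
    if (h->active_dev == new_dev)
        return false;
    h->active_dev = new_dev;
    h->arp_entries = 0;  /* all IP-to-MAC mappings are now invalid */
    return true;
}

/* Step 2, detecting side: the link of failed_dev went down.
 * Returns the device number to announce in a swap message broadcast on
 * the backup network, or -1 if only the backup link failed. */
int host_link_down(struct host *h, int failed_dev)
{
    assert(failed_dev == 0 || failed_dev == 1);
    if (failed_dev != h->active_dev)
        return -1;                   /* backup failed: no action needed */
    int new_dev = 1 - failed_dev;
    host_switch(h, new_dev);
    return new_dev;
}

/* Step 2, receiving side: a swap message naming new_dev arrived on rx_dev.
 * Returns true if this host switched networks. */
bool host_rx_swap(struct host *h, int rx_dev, int new_dev)
{
    if (rx_dev == h->active_dev)
        return false;                /* already swapped: discard quietly */
    return host_switch(h, new_dev);
}
```

In this model, PPC 1 would call host_link_down() when its Device 0 link drops and broadcast the returned device number; PPC 2 would pass the received value to host_rx_swap(). After both calls, each host has an empty ARP table, so the next TCP retransmission or ACK triggers the ARP request described in step 3.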

If a network switch occurs during an active TCP connection, the additional overhead is one or two ARP requests and response packets, plus the time needed to detect the bad link. When a link failure does not trigger an interrupt, the device driver has to query the link status periodically. The polling interval must be shorter than the TCP retransmission timeout. The interval currently used is 100 milliseconds.

Because the UDP and ICMP protocols are by nature unreliable, I make no provisions to prevent these packets from getting lost during a switch. It's up to the application to ensure reliability, just as it is with a traditional single-network attachment. Swap messages and network switching introduce no new problems.

Performance

To check the performance of this approach, I used the sock program (see TCP/IP Illustrated, Volume I, by Richard Stevens, Addison-Wesley, 1994) to measure the impact of a network switch on an active TCP connection. I set both the sender and receiver buffers to 16,384 bytes, using the command-line flags -R and -S. During the test, data is sent from sender to receiver and the receive side acknowledges the data. Table 1 lists the results in seconds and milliseconds.

Conclusion

Although the communication technique presented here has proven to be highly reliable in embedded networks, further work is still needed to overcome some limitations of the current implementation:

A more robust swap-message protocol. Currently, a missed swap message leads to an unreachable host. As a quick workaround, I send the swap message several times.
Use of both networks for increased bandwidth and throughput, with fallback to one network in case of failure. However, this requires changes in TCP/IP's ARP handling, as there would be several MAC addresses for one IP address. Since this violates one of the design goals (no changes to the TCP/IP stack), it was deferred.

DDJ