Detecting Endian Issues with Static Analysis Tools

Detecting Errors Related to Endianness: Syntax and Semantics

Short of exhaustive unit testing and code inspection, it would be helpful if compilers could tell programmers when endian issues are present. But a compiler that warned about every potential problem would produce so much noise that the warnings would almost certainly be ignored.

All detection methods require some knowledge of the endianness of the data being processed before any warning can be given. Because compilers cannot know the contents of variables at run time, their ability to detect endian-related errors is limited. Detecting endian issues during a code inspection is easier if naming conventions identify big-endian or little-endian data, but compilers are (usually) not privy to such coding standards.

This is where static analysis tools can help, by detecting issues that apply to a specific environment. Unlike compilers, which must conform to a language standard, static analysis tools are free to report on project-specific concerns such as potential endian errors.

As a simple example, it is possible to develop a static analysis rule to warn on this situation:

	union {
		short number;
		char  view[2];
	} my_number;

But if an appropriate coding standard were in place, the rule could be tailored to give no warning for this (assuming the tool knew that names matching the regular expression BigEndian.* are big-endian):

	union {
		short number;
		char  BigEndian_view[2];
	} my_number;
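To see why such a union deserves a warning in the first place, the following minimal sketch may help. It shows that the bytes seen through the view member depend on the host, whereas explicit shifts give a fixed big-endian layout that a BigEndian_-prefixed name could honestly describe. The helper names host_is_little_endian() and store_big_endian16() are hypothetical, introduced only for illustration, and the sketch assumes 8-bit bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the host stores the least
   significant byte first (little-endian), 0 otherwise. */
static int host_is_little_endian(void)
{
    union {
        uint16_t      number;
        unsigned char view[2];
    } probe;

    probe.number = 0x0102;
    /* On a little-endian host view[0] holds 0x02; on a
       big-endian host it holds 0x01. */
    return probe.view[0] == 0x02;
}

/* Hypothetical helper: writes value into BigEndian_view most
   significant byte first, whatever the host byte order is. */
static void store_big_endian16(uint16_t value,
                               unsigned char BigEndian_view[2])
{
    assert(BigEndian_view != NULL);
    BigEndian_view[0] = (unsigned char)(value >> 8);
    BigEndian_view[1] = (unsigned char)(value & 0xFF);
}
```

Code that fills the array through the shifting helper behaves the same on every host, which is the kind of guarantee a naming convention is meant to advertise to the analysis tool.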

Detecting Errors Related to Endianness: Protocol and Paths

The above example with a union is a case of detecting errors by examining syntax and semantics. There are other errors which can be detected when programming protocols are used to correctly handle endianness processing.

In these schemes, proper byte swapping is necessary, and protocols must be followed in order that numeric values are correctly interpreted.

A simple, but typical, example of a protocol: reading from and writing to a network. In this example, the length values must be passed into the function-style macros ntohl() and htonl() before they are used in the calls to data_alloc() and net_write(). Proper byte-swapping is done for input and output information:

	n = net_read (&sock, &netLen, 4);
	/* ... process errors ... */
	len = ntohl(netLen);             /* A */
	data_alloc (&packet, len);       /* B */
	len = strlen(ret_string) + 1;
	netLen = htonl(len);             /* C */
	if (net_write (context, &sock, &netLen, 4) != 4) /* D */
	/* ... process errors ... */
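The fragment above depends on functions that are not shown (net_read(), data_alloc(), net_write()). As a rough, self-contained illustration of the same protocol, the sketch below applies ntohl() and htonl() to a 4-byte length prefix held in a memory buffer instead of a socket. The helper names decode_length() and encode_length() are hypothetical, and the code assumes a POSIX system that provides <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* ntohl(), htonl() (POSIX) */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a 4-byte length prefix received
   in network byte order into a host-order value (step A). */
static uint32_t decode_length(const unsigned char wire[4])
{
    uint32_t netLen;

    assert(wire != NULL);
    memcpy(&netLen, wire, sizeof netLen);  /* raw bytes as received */
    return ntohl(netLen);                  /* network -> host */
}

/* Hypothetical helper: convert a host-order length into the
   4 bytes that would be written to the network (step C). */
static void encode_length(uint32_t len, unsigned char wire[4])
{
    uint32_t netLen = htonl(len);          /* host -> network */

    assert(wire != NULL);
    memcpy(wire, &netLen, sizeof netLen);
}
```

Because both directions go through the conversion routines, the bytes 00 00 01 02 decode to 258 on any host, and encoding 258 reproduces those same four bytes.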

A static analysis detector could be developed to keep track of specific functions whose parameters must first be processed by the byte-swapping routines. If the byte-swapping routine has not been called for a parameter (or the incorrect one has been called!), an error or warning message could be issued.
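The bookkeeping such a detector needs can be modeled in a few lines. The following toy sketch is not how any particular tool works: it simply attaches a byte-order state to each value, updates that state when a swap routine is applied, and checks the state at calls to tracked functions. The type byte_order, the rule table, and the functions apply_ntohl(), apply_htonl(), and check_call() are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* State a checker might attach to each tracked value. */
typedef enum { ORDER_UNKNOWN, ORDER_NETWORK, ORDER_HOST } byte_order;

/* Hypothetical rule: the order a function's length argument must have. */
typedef struct {
    const char *function;
    byte_order  required;
} param_rule;

static const param_rule rules[] = {
    { "data_alloc", ORDER_HOST    },  /* needs a host-order length */
    { "net_write",  ORDER_NETWORK },  /* needs a network-order length */
};

/* ntohl() expects network order and yields host order. Any other
   input state means the wrong routine was called, modeled here
   as ORDER_UNKNOWN. */
static byte_order apply_ntohl(byte_order in)
{
    return in == ORDER_NETWORK ? ORDER_HOST : ORDER_UNKNOWN;
}

/* htonl() expects host order and yields network order. */
static byte_order apply_htonl(byte_order in)
{
    return in == ORDER_HOST ? ORDER_NETWORK : ORDER_UNKNOWN;
}

/* Returns 1 if calling `function` with a value in state `actual`
   violates a rule, 0 otherwise. Untracked functions never warn. */
static int check_call(const char *function, byte_order actual)
{
    size_t i;

    assert(function != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (strcmp(rules[i].function, function) == 0)
            return actual != rules[i].required;
    }
    return 0;
}
```

Under this model, a value read from the network and passed straight to data_alloc() is flagged, while the same value passed through apply_ntohl() first is accepted. A real tool would also have to follow values along each execution path, which this sketch does not attempt.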

The types and complexity of the protocols that can be checked depend a great deal on the customization features of the particular static analysis tool, on its usability, and on the restrictions common to all static analysis tools.
