Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Embedded Systems

Detecting Out-of-Range References


JUN93: DETECTING OUT-OF-RANGE REFERENCES

Protecting yourself from memory-access violations

Chan is a member of the technical staff in the Am29000 Systems Engineering Embedded Processor Division for Advanced Micro Devices. He can be contacted at [email protected].


An out-of-range reference is an access attempt to a location outside the permitted memory range. Several conditions can cause this runtime error: an out-of-bound array index, uninitialized pointer variables, or dynamic storage overflows, for example. Out-of-range errors occur more frequently, as more program variables are bound at run time to facilitate efficient memory utilization. Growing usage of indirect procedure calls in object-oriented programming is another reason for this trend. Since there's no way to detect an invalid binding of a pointer variable at compile time, out-of-range references are becoming a major source of runtime errors in embedded-systems programming.

Out-of-range errors are not a problem in conventional computing systems. Typically, a protection mechanism verifies every memory reference. Out-of-range references are detected and reported, thereby preventing system crashes. With UNIX systems, for example, the familiar messages, "Bus error--core dumped" and "Segment violation--core dumped" are generated upon an invalid access attempt. A memory-management unit or boundary-checking logic is responsible for this access protection and is crucial for maintaining the conventional system's integrity.

Out-of-range References in Embedded Systems

Unlike conventional computing systems, most embedded systems are unprotected from access violations. Protection mechanisms have not been a critical requirement for embedded systems because embedded programs are thoroughly debugged and frozen before product release. These "error-free" programs eliminate the expensive production cost of extra protection hardware.

The basis of the embedded-systems design philosophy requires a thorough debugging strategy that eliminates run-time errors before system release. In the absence of built-in protection, embedded-systems developers use emulators and logic analyzers to handle out-of-range reference errors. This debugging process involves repeated attempts at setting the debug equipment triggers and rerunning the target application. These attempts are time consuming and often result in poor productivity. Out-of-range references are more frequent with dynamic storage utilization and certain kinds of object-oriented programming where indirect references are used extensively. Because of this time increase, the debug-equipment based technique is no longer efficient for supporting error-free programming strategy, especially on tightly scheduled projects. Equipment cost is also a problem. Although stand-alone debug equipment is essential in a development workbench, this equipment is often too expensive to justify acquisition for a moderately funded project. Additionally, the personality module for a newly introduced microprocessor may not be available.

When an access violation occurs in unprotected embedded systems, the outcome might appear in a variety of ways, often confusing the developer. Access-violation results range from no change in behavior to catastrophic system failure, depending on whether the violating address is within the physical-address boundary. The outcome also depends on the difference in the memory-interface mechanism. The access error may not impact the system at all, if the violating address points to a free memory location within the physical boundary. For systems with a memory interface based on a fixed number of wait cycles, the CPU accepts the instruction/data bus contents as long as the address is not rejected by the system. In this case, the system gets garbage on the bus, but the violating access may not cause an immediate crash.

However, the garbage may crash the system at a later stage, and this delayed crash will take longer to solve. In both crash situations, the lack of a built-in protection/detection mechanism will necessitate using debug equipment to determine where the system crashed and what kind of error condition caused the violating address. The address may be the direct result of either an uninitialized pointer or a sequence of previous out-of-range references. Out-of-range errors result in one of three typical outcomes: system crash, illegal opcode or unexpected control flow, or no observable event.

System crash may occur if an access request cannot be granted or the violating address cannot be decoded by the CPU. In the first situation, a memory-access request is on hold if the memory interface cannot decode the address. After issuing a request, the CPU waits for a Ready signal. If the memory interface does not respond to the invalid access request, then the CPU can no longer proceed, resulting in virtual deadlock. This type of error is the easiest to track because the violating address can be observed immediately using a logic analyzer. Ideally, the memory interface should signal the CPU for the address exception, thereby preventing the CPU from waiting indefinitely for the Ready signal.

In the second situation, the system will actually crash because either the CPU cannot decode an out-of-range address, or the invalid address corrupted a memory-mapped control register. For example, the Am29200 CPU partitions a whole address space into a set of sub-spaces for different memory classes such as ROM, RAM, and memory-mapped peripherals. A violating address will crash this type of CPU if the address does not belong to any of the memory classes.

Illegal opcode traps and unexpected control-flow problems occur when an indirect jump or a procedure call is taken with an uninitialized or corrupted instruction address. Illegal opcode traps result when the invalid address targets a data-memory location and the contents of the incoming instruction bus cannot be decoded as a valid instruction. Since most embedded CPUs are capable of trapping illegal opcodes, this error is easily detected. However, discovering the problem's origin may be time consuming if the invalid instruction address is not directly generated from a program variable.

Unexpected control-flow problems result from an invalid address pointing to an incorrect instruction memory location. After a jump or a procedure call, the system loses control because the program execution continues from the incorrect location. This problem may be more difficult to resolve, as the system behavior can only be analyzed by examining a long sequence of instruction traces.

The result of an out-of-range data reference is unnoticeable if the reference accidentally accesses a free memory location without corrupting other allocated memory objects. As long as the accessed memory location remains free, this particular out-of-range reference is completely hidden-even from a production test suite. Although this reference sounds harmless, the error makes the system unusable when the accidentally accessed location is no longer free. For example, a data-pointer variable happens to go beyond the boundary of heap space and grab a free location. In test cases not requiring the free memory location, this error is never detected. However, when the operation of the system claims the free memory location at run time, the system suddenly stops working. Because this error may necessitate a costly software revision after product release, non-observable errors may eventually be huge problems.

A Software-driven Detection Scheme

One way of coping with these problems is to develop a software-alternative scheme comparable to the conventional system-hardware protection mechanism that verifies address access. One approach to the software scheme is utilizing an instruction-trace facility, if available, and letting the trace-trap handler verify both the instruction and data address for every instruction before execution. The trace-trap handler would first compare the contents of program counter with the current code segment's boundary address. If the program counter is within the boundary, then the handler fetches and decodes the instruction to see if it involves any data access. If so, the handler then verifies the data address against the allocated data boundary. This straightforward scheme guarantees the detection of any out-of-range reference at a cost of extra instruction cycles for executing the instruction trace-trap handler. The cycles consumed by the handler significantly increase execution time, since the trace trap is performed for every instruction to be executed. This inherent overhead of the scheme is acceptable for testing most embedded applications except certain real-time applications with strict timing requirements. To keep the detection overhead as small as possible, it's preferable for the scheme to use CPUs consuming few extra cycles before entering the trap routine.

Listing One (page 128) shows an Am29200 RISC microcontroller-based, out-of-range, detection-trap routine. This specific implementation executes about 50 instructions to verify both the instruction, stack, and RAM data address. If the verified instruction is not a LOAD or STORE, the detection routine consumes only 25 instruction cycles. Since the Am29200's trap latency is only a few cycles, verifying every instruction and data access for a particular program takes less than 50 times of the total application execution time. This increased execution time in testing is generally acceptable for most embedded applications. For real-time applications requiring system response within a few cycles, this detection scheme may not be directly applicable. The scheme, however, may still be utilized with simulated interfaces to real-time devices to maintain the system's normal course of operation during the testing stage.

The instruction-trace trap-based scheme provides embedded systems with a detection mechanism comparable to the hardware counterpart, yet it doesn't add any production cost. The additional time allowed for extra instruction cycles in the verification stage is the only cost for developers. Software-driven detection is also inherently flexible, both in its setup and operation mode. It may choose to verify the instruction address only, or it can check every external reference, including stack, heap storage, and memory-mapped peripheral devices. The detection scheme may also be enabled for necessary access verification and then later disabled. The scheme can be enabled only for a portion of program under development, rather than attempting verification of the whole program. Additionally, the software-driven scheme allows on-the-fly update of the boundary address. This address may be constantly changing, depending on dynamic memory allocation and deallocation.

To effectively implement the software-driven detection scheme, the target CPU must support these functions: instruction-trace facility, a very short trap latency, and on-chip storage for the boundary address. These features are mostly required to shorten the detection overhead, thus reducing the access-verification time. Note that they not only benefit the detection scheme but also help in building debug monitors and improving CPU performance. The Am29200, for example, supports these features to allow fast context switching as well as implementing the software-driven detection scheme.

An on-chip instruction-trace facility is necessary because the tracing event activates the verification routine after executing every instruction. For CPUs without the tracing capability, a timer can be used to activate the verification routine at regular intervals. This timer-based scheme is only able to verify the particular instruction executed at the time of the interrupt. However, with reduced timer-interrupt intervals, this alternative approach may still provide a good detection coverage on out-of-range references.

The trap latency is important for the detection scheme since the delaying period may greatly impact the overhead. Depending on the target CPU architecture, the latency between an instruction-trace event and the start of the trace-trap handler may range from a few cycles to hundreds of cycles. The cycles are required to read the trap vector, jump to the handler, and save the current context in memory for most CPUs. Because saving context uses many memory cycles and increases the overhead of the scheme, it should be avoided if the CPU architecture allows. For example, the Am29200 doesn't need context saving because the critical registers composing the context are frozen upon a trap. The frozen context will not be disturbed by the detection routine shown in Listing One. By not saving context, the overhead of the scheme is greatly reduced. In Listing One, an average of less than 50 instructions are executed to verify both the instruction and data addresses.

If the boundary address is stored in memory, the detection routine must spend many memory cycles to access the boundary information. Since the routine accesses at least four boundary addresses, the speed of the boundary address storage greatly impacts the time spent by the detection routine. The boundary address is better kept in registers to avoid memory access. The Am29200 provides register storage for keeping boundary address, thereby reducing the time spent by the detection routine. In Listing One, gr125/gr127 (general-purpose registers 125 and 127) are allocated for stack boundary checking, gr86 for heap pointer, and gr114/ gr115 for instruction boundary checking. The detection routine makes only one memory access to fetch instructions from memory, to see if it is a LOAD or STORE instruction. The boundary address registers also help the boundary updating process, because they don't use extra memory cycles to update the boundary.

Implementation Issues

In addition to the CPU requirements, the following implementation-related issues need to be carefully addressed in order to effectively utilize the software-driven detection scheme.

Before starting the verification process, the boundary address must be correctly initialized and updated to reflect any boundary changes. To forward the boundary information, a set of procedure calls is the most convenient way to support the functionality. Some boundary information may be automatically initialized and updated. In an Am29200-based environment, the stack and heap boundaries are managed by a C runtime kernel that adjusts tops of regions as programs execute. The boundaries of the instruction and static data regions of the Am29200, however, need to be initialized.

Fragmented memory in a given region may require multiple verifications against multiple sets of boundaries to ensure that an access address is not out of range. Because of this requirement, the target object code should be linked in single, unfragmented regions.

To support multitasking environments, the verification process needs to manage a table that stores the boundary information for all the active tasks. When a task switching occurs, the operating system must inform the verification process of the new task's identification. The verification process then updates the boundary register. Depending on the system architecture, task switching may automatically update boundary registers accessed by the verification process. No management of boundary information is necessary for this case.

Since an out-of-range reference error is normally unrecoverable, the only choice is to abort the execution and report the exception to the developer for troubleshooting. The error information may include the out-of-range address, the address of the violating instruction, and possibly a dump of the register contents. The simplest way to report an out-of-range error is to provide the application program with an error-handling routine.

In Listing One, the violating address and instruction, together with the memory region type, are transferred to the error-handling routine Report_Out_Range. The routine then reports the error information. This routine can either be written in assembly or C, but note that an adequate C runtime stack environment must be provided when the error is reported by a C routine. If supported, a more convenient way of error reporting is to use the SIGNAL facility to take advantage of the standard signals such as SIGBUS or SIGSEGV.



_DETECTING OUT-OF-RANGE REFERENCES_
by Chan Y. Lee


[LISTING ONE]
<a name="016a_0008">

; TraceTrap - detects out-of-range references
; This instruction trace trap handler verifies both the instruction and data
; address to detect an out-of-range reference. It first checks the contents of
; 'pc0', which keeps the address of the instruction to be executed when this
; trap is returned. If 'pc0' is good, it then fetches instruction from memory
; and decodes it to see if the instruction makes any memory reference. If so,
; the handler then checks the data address. When it finds an access is
; out-of-range, it invokes a user procedure provided to properly report error.

     .global   TraceTrap
TraceTrap:
     ; Check for out-of-range instruction address. The 'pc0' contains address
     ; of the instruction to be fetched. Instruction address is verified
     ; against both ROM and RAM instruction boundaries. ROM boundary checking
     ; can be omitted if code under verification doesn't reside in ROM.

     mfsr treg1, pc0        ; obtain instruction address from program counter 0
     srl  treg2, treg1, 28  ; check address qualifier first
     cpeq treg3, treg2, 0   ; see if it is an instruction ROM access

     jmpt treg3, CheckInstROM
     cpeq treg3, treg2, 4   ; see if it is an instruction RAM access
     jmpf treg3, InstOutRange ; error if neither ROM nor RAM qualified
CheckInstRAM:
     cpge treg2, treg1, RAM_pc_low
     jmpf treg2, InstOutRange
     cple treg2, treg1, RAM_pc_high
     jmpf treg2, InstOutRange
     nop
     jmp  CheckRegStack
CheckInstROM:
     cpge treg2, treg1, ROM_pc_low
     jmpf treg2, InstOutRange
     cple treg2, treg1, ROM_pc_high
     jmpf treg2, InstOutRange
CheckRegStack:
     ; check for out-of-range register stack pointer(gr1).
     cpge treg2, gr1, gr126             ; gr126 is the Register Allocate Bound
     jmpf treg2, StackOutRange
     cple treg2, gr1, gr127             ; gr127 is the Register Free Bound
     jmpf treg2, StackOutRange

     ; Verification of instruction counter and reg stack pointer is done.
     ; Now fetch the instruction and decode it to see if it is a LOAD or STORE.
     ; If so, check data address contained in register operand of instruction.
     load 0, 0, treg2, treg1            ; fetch the instruction from memory
     srl  treg3, treg2, 24
     cpeq treg3, treg3, 0x17            ; look for LOAD
     jmpt treg3, CheckDataAddress
     srl  treg3, treg2, 24
     cpeq treg3, treg3, 0x1E            ; look for STORE
     jmpf treg3, TrapExit
CheckDataAddress:
     sll  treg2, treg2, 24    ; to reset all of the bits except register number
     jmpt treg2, LocalAddressReg  ; if MSB is set, a local register has address
     srl  treg2, treg2, 24        ; now treg2 has the reg number
     const     treg3, GlobalRegTable
     jmp  GetDataAddress
     consth    treg3, GlobalRegTable
LocalAddressReg:
     andn treg2, treg2, 0x80       ; reset the MSB of local register number
     const     treg3, LocalRegTable
     consth    treg3, LocalRegTable
GetDataAddress:
     srl  treg2, treg2, 3          ; table offset from reg number
     add  treg2, treg2, treg3      ; jump location to get address
     const     lr0,   DataAddressObtained
     jmpi treg2
     consth    lr0,   DataAddressObtained
DataAddressObtained:
     ; 'treg2' now has the data address for either STORE or LOAD instruction.
     ; Verify the data address against the following regions:
     ;    1. Data RAM                        (0x4XXXXXXX)
     ;         This includes the stack, heap and static data.
     ;    2. Peripheral Control Registers    (0x800000XX)
     ;         Access to the on-chip Peripheral Control Register.
     ;    3. Peripheral Interface Adaptor5   (0x950000XX)
     ;         Access to the PIA5 region ranging from 0x95000000 to 0x950000FF.
     srl  treg3, treg2, 28              ; to get the address qualifier
     cpeq treg3, treg3, 0x4             ; data RAM address?
     jmpt treg3, CheckDataRAM
     srl  treg3, treg2, 28
     cpeq treg3, treg3, 0x8             ; peripheral control register?
     jmpt treg3, CheckPCR
     srl  treg3, treg2, 24
     cpeq treg3, treg3, 0x95            ; peripheral adaptor region 5?
     jmpt treg3, CheckPIA5
     nop
     jmp  DataOutRange     ; report data address error if the qualifier doesn't
                           ; match with any Am29200 region currently defined
CheckDataRAM:
     cpgt treg3, treg2, RAM_pc_high     ; check base of RAM data
     jmpf treg3, DataOutRange
     cple treg3, treg2, gr86            ; check top of RAM data(heap ptr:gr86)
     jmpf treg3, DataOutRange
     nop
     jmp  TrapExit                      ; Data RAM address is good
CheckPCR:
     const     treg3, 0x800000FC
     consth    treg3, 0x800000FC
     cpgt treg3, treg2, treg3           ; valid range: 0x80000000 to 0x800000FC
     jmpt treg3, DataOutRange
     nop
     jmp  TrapExit                      ; Peripheral Control Register is good
CheckPIA5:
     const     treg3, 0x950000FF
     consth    treg3, 0x950000FF
     cpgt treg3, treg2, treg3           ; valid range: 0x95000000 to 0x950000FF
     jmpt treg3, DataOutRange
; TrapExit - Address verification is completed. Return to the application.
TrapExit:
     ; Clear TP(Trace Pending) bit from OPS(Old Processor Status) register
     mfsr treg3, ops
     const     treg2, TP
     andn treg3, treg3, treg2
     mtsr ops, treg3
     iret











Copyright © 1993, Dr. Dobb's Journal


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.