Signalling Integer Overflows in Java

Source Code Accompanies This Article. Download It Now.


Perspectives

Thanks to the ASM library, COJAC was not hard to implement. It nevertheless opens some interesting perspectives:

  • It is straightforward to implement a new classloader that performs on-the-fly instrumentation.
  • Java is not the only programming language that produces code for the JVM. COJAC should therefore also be suitable for other languages that have a bytecode compiler, such as Python (Jython) or Ruby (JRuby).
  • COJAC could be packaged as an Eclipse plug-in that automatically instruments entire projects. Similarly, the functionality could be integrated as a Java compiler option (perhaps something to consider for a future release of Java?).
  • Floating-point operations do not wrap around on overflow (they produce an infinity instead), but they have pitfalls of their own; sometimes it would be useful to discover that a number did not change after the addition of a nonzero term or the multiplication by a factor other than one. It would be easy to extend COJAC to report such situations.
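
The floating-point situation in the last item can be tested with plain comparisons. The following is a minimal sketch of such a test, not COJAC code: the class name AbsorptionCheck and its two methods are hypothetical, and Double.isFinite assumes Java 8 or later.

```java
/** Hypothetical helper; the names are illustrative and not part of COJAC. */
final class AbsorptionCheck {

    /** True when adding the nonzero term b leaves the finite value a unchanged. */
    static boolean addAbsorbed(double a, double b) {
        return b != 0.0
                && Double.isFinite(a)
                && Double.isFinite(b)
                && a + b == a;
    }

    /** True when multiplying the nonzero, finite value a by f (f != 1) leaves it unchanged. */
    static boolean mulAbsorbed(double a, double f) {
        return f != 1.0
                && a != 0.0          // 0 * f == 0 is exact, not a rounding effect
                && Double.isFinite(a)
                && Double.isFinite(f)
                && a * f == a;
    }

    public static void main(String[] args) {
        // Adjacent doubles near 1e17 are 16 apart, so adding 1.0 is rounded away.
        System.out.println(addAbsorbed(1e17, 1.0));
        // The smallest subnormal cannot grow by only 10 percent.
        System.out.println(mulAbsorbed(Double.MIN_VALUE, 1.1));
        // An ordinary addition is not flagged.
        System.out.println(addAbsorbed(1.0, 1.0));
    }
}
```

An instrumenter could call a test of this kind right after each floating-point add or multiply instruction, in the same way as it calls the integer overflow checks.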

COJAC is designed to be simple and robust, and this causes some inherent limitations:

  • Compile-time evaluated expressions are not detected. The compiler evaluates expressions such as (3*Integer.MAX_VALUE), and only the result is stored in the bytecode.
  • The instrumentation may confuse the debugger and other tools that rely on the correspondence between source code and bytecode (e.g. a profiler).
  • Because we didn't want to add a new class to the instrumented code, the default logging reports the same location again every time an overflow occurs there. It is straightforward to define a better logging callback that filters already-seen locations, as in Example 4.
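
The first limitation is easy to observe. In the sketch below (the class and method names are ours, chosen for illustration), the constant expression is folded by the compiler, so no multiplication instruction is left in the bytecode to instrument. Math.multiplyExact, available in Java 8 and later, merely stands in for a runtime check here; it is not what COJAC inserts.

```java
/** Illustrative only; the class and method names are hypothetical. */
final class ConstantFoldingDemo {

    // A compile-time constant expression: javac stores only the wrapped
    // result in the class file, so there is no multiplication to check.
    static final int FOLDED = 3 * Integer.MAX_VALUE;

    /**
     * The same product with a non-constant operand. A multiplication is
     * really executed here, so a runtime check can observe the overflow.
     */
    static boolean overflowsAtRuntime(int factor) {
        try {
            Math.multiplyExact(factor, Integer.MAX_VALUE);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("folded constant: " + FOLDED);
        System.out.println("runtime overflow detected: " + overflowsAtRuntime(3));
    }
}
```

The folded value is Integer.MAX_VALUE - 2, a perfectly plausible-looking positive number, which is part of what makes this kind of overflow hard to notice.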

Bytecode instrumentation is not the only way of attacking the overflow problem. One idea to reduce the slowdown would be a probabilistic technique that decides at runtime whether a given operation is checked or not. An alternative would be to write nonportable C code that consults the CPU overflow flag, and to bind this code with JNI. Another interesting approach is static analysis: we could, for instance, extend JML tools (like ESC/Java) to report some of the potential overflows at compile time, as is already done for bad array indices. It can also be argued that reporting overflows is a poor goal and that we should aim at eliminating the overflow risk completely, for example by automatically converting int/long to BigInteger objects; this is not trivial because it would require deep transformations of the code (e.g. converting arrays to ArrayLists).
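
To make the probabilistic idea concrete, here is a minimal sketch under our own assumptions: the class SampledOverflowCheck, its checkRate parameter, and the detected counter are hypothetical and not part of COJAC. Each addition is checked only with the given probability, trading detection coverage for speed.

```java
import java.util.Random;

/** Hypothetical sketch of sampled overflow checking; not COJAC code. */
final class SampledOverflowCheck {
    private final Random rng;
    private final double checkRate;  // probability that one operation is checked
    int detected;                    // number of overflows actually observed

    SampledOverflowCheck(double checkRate, long seed) {
        this.checkRate = checkRate;
        this.rng = new Random(seed);
    }

    /** Adds with the usual wraparound; the overflow test runs only when sampled. */
    int add(int a, int b) {
        int r = a + b;
        if (rng.nextDouble() < checkRate) {
            // Overflow occurred iff the result's sign differs from the sign
            // of both operands.
            if (((a ^ r) & (b ^ r)) < 0) {
                detected++;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        SampledOverflowCheck checker = new SampledOverflowCheck(0.1, 1L);
        for (int i = 0; i < 1000; i++) {
            checker.add(Integer.MAX_VALUE, 1);  // overflows every time
        }
        System.out.println("overflows seen out of 1000: " + checker.detected);
    }
}
```

Whether this really pays off depends on the cost of the random draw compared with the cost of the check itself; for a test as cheap as the one above, the sampling may well cost more than it saves.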

Finally, a totally different approach would be to rely on virtualization: Is it difficult to adapt QEMU or Xen, for example, so that the virtual machine signals to the host OS any overflow occurring in any running process of the guest OS?

Conclusion

Arithmetic overflow is an annoying feature of most programming languages: On rare occasions, programmers exploit it as a computation property; but most of the time, it is simply a nasty source of bugs. Any attempt to help discover bugs and deliver robust code is worth trying. That is why Java developers should keep an eye on recent developments. To name, arbitrarily, just a few free new tools: Eclipse TPTP, JML verifiers (www.cs.ucf.edu/~leavens/JML), delta debugging support (www.st.cs.uni-sb.de/eclipse), the ODB "omniscient debugger" (www.lambdacs.com/debugger), or COJAC.

What About C/C++?

Java and C/C++ differ in their models of arithmetic expression evaluation, but the problem of arithmetic overflow is essentially the same. Some C/C++ compilers offer an option to generate code that checks integer operations and reports overflows. For instance, GCC comes with the -ftrapv flag, but it has historically been implemented in a way that does not cover every situation: for instance, it doesn't signal overflows on shorts, and it is known to miss some overflows on int multiplications. Maybe this is why it doesn't seem to be popular. We found another attempt to provide instrumentation in GCC as a separate module using GEM, but little information is available about the current state of this project (www.ecsl.cs.sunysb.edu/iop/index.html).

