
C++/CLI: Stack-Based Objects and Tracking References


March 2005

Rex Jaeschke is an independent consultant, author, and seminar leader. He serves as editor of the Standards for C++/CLI, CLI, and C#. Rex can be reached at rex@RexJaeschke.com.


In a previous installment, I wrote, "In this release of C++/CLI, objects of ref class type can reside only on the managed heap or on the stack." However, thus far, all examples have used heap allocation only. And while this is the only approach available to programmers using "Managed Extensions to C++," C#, J#, and VB.NET, native C++ programmers are used to having stack-based instances of objects as well.

Consider the previous definition of the ref class Point and the following automatic variable definitions:

Point p1, p2(3,4);

Based on our knowledge of native C++, p1 and p2 appear to be stack-based instances of the ref class Point, and for all intents and purposes, they are. p1 is initialized using the default constructor while p2 uses the constructor taking an x- and y-coordinate pair. As implemented, Point is self-contained (that is, it doesn't contain any pointers or handles); however, being an instance of a ref class, it is still under the control of the CLI runtime. So garbage collection occurs as necessary. (Because of this, you cannot define static or global-scope instances of a ref class.)

You cannot apply sizeof to an expression designating an instance of a ref class. (Remember, sizeof is computed at compile time, yet the size of a Point object isn't known until runtime.) On the other hand, you can apply sizeof to a handle since its size is known at compile time.

You cannot define stack-based instances of a CLI array.

Tracking References

Native C++ allows an alias to an object via the & punctuator. For example, for any native class N, you can write:

N n1;
N& n2 = n1; 

A reference must be initialized when it is defined, and for its whole life, it is locked into referring to the same object; that is, its value cannot change. Just as you can find uses for references to native-typed objects, so too can you find uses for references to instances of ref classes. However, you cannot use the same syntax.
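As a minimal sketch of these rules in native C++ (no C++/CLI extensions are involved), the following can be tried with any standard compiler. The member value and both function names are assumptions made for illustration; they do not come from the earlier installments.

```cpp
// Native C++ only: a reference is an alias that is fixed at initialization.
struct N {
    int value;
};

// Returns the value seen through n1 after writing through its alias n2.
constexpr int write_through_alias() {
    N n1{1};
    N& n2 = n1;       // n2 is now another name for n1
    n2.value = 42;    // modifies n1 itself
    return n1.value;
}

// Assigning to a reference copies into the referent; it does not rebind.
constexpr bool assignment_does_not_rebind() {
    N n1{1};
    N other{7};
    N& n2 = n1;
    n2 = other;        // copies other's value into n1
    other.value = 99;  // n1 keeps the copied value, so n2 never referred to other
    return n1.value == 7 && &n2 == &n1;
}
```

Both functions can be called from main or checked at compile time with static_assert; the second one returns true because n2 still designates n1 after the assignment.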

Instances of ref classes can move during program execution, so they require tracking. As such, native pointers and references are not sufficient for dealing with them. (Specifically, you can't apply the address-of operator & to an instance of a ref class.) So C++/CLI provides handles and tracking references as their respective equivalents. For example, you can define a tracking reference p3 that tracks the object p2 this way:

Point% p3 = p2;

A tracking reference must have automatic storage duration.

Even though native objects don't move, a tracking reference can also refer to one; in the case of n2, % can be used instead of &.

Consider the following:

Point^ hp = gcnew Point(2,5);
Point% p4 = *hp;
Point% p5 = *gcnew Point(2,5);

Here, hp is a handle to a Point, and p4 is an alias for that Point. Even though a handle is not a pointer, we can dereference a handle with the unary * operator. (During the standardization of C++/CLI, there was a discussion of whether a unary ^ operator should be introduced and used here instead. One was not, and the ability to dereference either a handle or a pointer using * can be valuable when writing templates.) Of course, p4 remains an alias to that same Point even if hp takes on a new value. While there is a handle or tracking reference to an object, that object cannot be garbage collected.

In the case of p5, you simply dereference the handle returned by gcnew.

Although handles to almost every ref class type can be dereferenced, handles to two types cannot; those types are System::String and array<T>.
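The way p4 stays bound to its Point after hp changes has a direct native analog, which may be easier to experiment with. The sketch below uses a native pointer in the role of the handle and a native reference in the role of the tracking reference; the type Cell and the function name are hypothetical, and automatic objects stand in for the heap-allocated Points.

```cpp
// Native analog only: pointer ~ handle, reference ~ tracking reference.
struct Cell {
    int value;
};

constexpr bool alias_survives_pointer_reassignment() {
    Cell first{1};
    Cell second{2};
    Cell* pc = &first;   // plays the role of the handle hp
    Cell& alias = *pc;   // plays the role of p4: bound to the object, not to pc
    pc = &second;        // the pointer now designates a different object
    alias.value = 10;    // still writes to first
    return first.value == 10 && second.value == 2 && pc->value == 2;
}
```

The alias is established from the object that *pc designates at the time of initialization, so later changes to pc have no effect on it.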

The "Give Me a Handle" Operator

Consider the case in which we want to write the value of p1 to the standard output. The obvious thing to write is:

Console::WriteLine("p1 is {0}", p1);

However, that won't compile, as WriteLine doesn't have an overload that takes a Point per se. You learned in an earlier column that expressions of any value type (such as int, long, or double) can automatically be converted to Object^ by a process called "boxing." And although p1 looks like an instance of a value type, it isn't; it's an instance of a ref class. (I'll look at value type classes in a future installment.) What you need instead is this:

Console::WriteLine("p1 is {0}", %p1);

By using the unary % operator, you create a handle to the object p1. Since every ref class is (ultimately) derived from System::Object, and WriteLine has an overload that takes Object^ as the type of its second argument, the Point^ resulting from %p1 is converted to Object^, and the value of p1 is displayed. No boxing occurs. This operator cannot be applied to an instance of a native class.

GC-Lvalues

The C++ Standard defines and uses the term lvalue. The C++/CLI Standard adds the term gc-lvalue, which is "an expression that refers to an Object on the CLI heap, or to a value member contained within such an Object." A handle points to a gc-lvalue, applying the unary * operator to a handle yields a gc-lvalue, a tracking reference is a gc-lvalue, and %h, where h is a handle, yields a gc-lvalue. (As there is a standard conversion from lvalue to gc-lvalue, a tracking reference can bind to any gc-lvalue or lvalue.)

Copy Constructor

In the following example, p6 is constructed with the given coordinates, while p7 is initialized to be a copy of p6. This requires that Point have a copy constructor; however, the compiler does not give a ref class one of these by default. If one is needed, you must supply it.

Point p6(3,4), p7 = p6;

Here, then, is the copy constructor:

Point(Point% p)
{
   X = p.X;
   Y = p.Y;
}

A copy constructor for a native class N is typically declared as follows:

N(const N& n);

However, for a ref class, % replaces &, and const doesn't fit well into the CLI world.

Assignment Operator

The expression statement:

p7 = p6;

requires an assignment operator, but again, none is supplied automatically. Here is such an operator:

Point% operator=(Point% p)
{
  X = p.X;
  Y = p.Y;
  return *this;
}

The reason there is no default copy constructor or assignment operator is that every ref class (except System::Object itself) ultimately derives from System::Object, and that class has neither a copy constructor nor an assignment operator. Basically, default versions of each would invoke their base-class counterparts, but none are defined!

Equality Operator

By defining a copy constructor and an assignment operator for Point, you can deal with instances of that type as values; you can initialize them, pass them to functions, and return them from functions. However, there is one more operation you might like to have: comparison for equality. That operator is quite straightforward:

static bool operator==(Point% p1, Point% p2)
{
   if (p1.GetType() == p2.GetType())
   {
      return (p1.X == p2.X) && (p1.Y == p2.Y);
   }
   return false;
}

Since a tracking reference cannot take on the value nullptr, you don't have to check for that. And because p1 and p2 are aliases for two Points, you use the dot operator to call GetType and the X and Y property getters.
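For comparison, here is a sketch of the same three operations on a native class, using const N& where the ref class versions use Point%. NativePoint and value_semantics_hold are hypothetical names, and the members are written out by hand only to mirror the ref class; a native compiler would generate an equivalent copy constructor and assignment operator on its own.

```cpp
// Native C++ counterpart of the copy constructor, assignment operator,
// and equality operator shown above, for comparison only.
class NativePoint {
    int x_;
    int y_;

public:
    constexpr NativePoint() : x_(0), y_(0) {}
    constexpr NativePoint(int x, int y) : x_(x), y_(y) {}

    // Copy constructor: const N& here, where the ref class version takes Point%.
    constexpr NativePoint(const NativePoint& p) : x_(p.x_), y_(p.y_) {}

    // Assignment operator: returns *this by reference so assignments can chain.
    constexpr NativePoint& operator=(const NativePoint& p) {
        x_ = p.x_;
        y_ = p.y_;
        return *this;
    }

    // Equality compares the coordinates of the two objects.
    friend constexpr bool operator==(const NativePoint& a, const NativePoint& b) {
        return a.x_ == b.x_ && a.y_ == b.y_;
    }
};

// Exercises all three operations and reports whether they behave as values.
constexpr bool value_semantics_hold() {
    NativePoint p6(3, 4);
    NativePoint p7 = p6;                // copy constructor
    NativePoint p8;                     // default constructor: (0,0)
    const bool differed = !(p8 == p6);  // (0,0) is not (3,4)
    p8 = p6;                            // assignment operator
    return differed && p7 == p6 && p8 == p6;
}
```

The ref class Point needs all of this written explicitly, whereas the native class needs only operator== to be supplied by hand.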

Can We Please Everyone?

In one previous installment, I stated, "For a ref class, equality is implemented via a function called Equals rather than by overloading operator==." In another, I presented an overload for operator== that took handles, and pointed out the problems of using it. Let's revisit these topics.

When designing and implementing a ref class in C++/CLI, the fundamental question to ask is, "Will the users of this type be programming in C++/CLI or some other language (such as C#, J#, or VB.NET), or both?"

C++ programmers are used to manipulating class instances as values, so they'll expect a class to have a copy constructor and an assignment operator. And for some classes, they'll expect equality and inequality operators as well. On the other hand, C#, J#, and VB.NET programmers can only manipulate class instances through handles, so they'll be expecting cloning and Equals functions. (I'll discuss object cloning in a separate column.) They won't know or care much about copy constructors, or assignment and equality operators.

The Equals function of a ref class can be called from any language, even though a C++ programmer would likely prefer operator==. A ref class that does not override Equals will almost certainly produce unexpected results if Equals is called on an instance of that class, because the version inherited from System::Object compares object identity rather than member values.

A ref class having an operator== function that takes two tracking references will suffice for C++/CLI programmers. While an operator== function that takes two handles could also be supplied, it is less likely to be needed or used by either group of programmers.

Simply stated, you could implement a ref class for one or the other of these two audiences. Now while the CLI applications world could, perhaps, be neatly divided into C++/CLI and "other language" camps, things aren't always that simple. For example, although System::String is a ref class, it provides operator== and operator!= functions that take two handles, yet these compare the values of the strings, not their handles. Basically, value semantics are being used in a ref class, which, in general, is counterintuitive, but for a string class, could be justifiable.

It is clear that "one size does not fit all." To provide the most appropriate interface to users of a ref class, you need to think about their expectations, and that depends on the language they are using. In any event, C++/CLI programmers using ref classes created in other languages will have to live without copy constructors and assignment operators in those classes.

Miscellaneous Issues

Here are a number of short, but useful, topics:

  • You know that const is not a good fit in the CLI world, and C++/CLI does not permit const- (or volatile-) qualified member functions in ref classes. You also know that the keyword this can be used in any instance constructor and member function of such a class. But what exactly is the type of this in these cases? For a native type N, it's N* const. However, for a ref class R, it's simply R^. And although that handle is not const-qualified, its value cannot be changed.

  • The implementation of Point::ToString uses:

       return String::Concat("(", X, ",", Y, ")");

    An alternative to this is:

       return String::Format("({0},{1})", X, Y);

    As its name suggests, the Format function lets you format the text (using leading spaces or zeros, thousands separators, and so on), not simply concatenate strings.

  • You can see if the compiler supports C++/CLI's extensions by testing if the predefined macro __cplusplus_cli is defined. If it is, it should have the value 200406L.
  • The CLI library contains a type called System::Decimal, which can represent values with at least 28 significant digits. This type is ideal for financial calculations that require a large number of significant digits and no round off. Unlike the binary floating-point types, Decimal can represent decimal fractional numbers exactly. When represented in a binary floating-point type, such numbers often become infinite fractions, making their representations more prone to rounding errors.

    Decimal numbers have a property called "scale," which represents the number of digits to the right of the decimal point. For example, 2.340 has a scale of 3, where trailing zeros are significant. When two decimals are added or subtracted, the scale of the result is the larger of the two scales. For example, 1.0 + 2.000 is 3.000, while 5.0 - 2.00 is 3.00. When two decimals are multiplied, the scale of the result is the sum of the two scales. For example, 1.0 * 2.000 is 2.0000. When two decimals are divided, the scale of the result is the scale of the first less the scale of the second. For example, 4.00000 / 2.000 is 2.00. However, a scale cannot be less than that needed to preserve the correct result. For example, 3.000 / 2.000, 3.00 / 2.000, 3.0 / 2.000, and 3 / 2 are all 1.5.

    Here's an example of Decimal's use:

       Decimal x = Decimal::Parse("23.00");
       Decimal y = Decimal::Parse("2.000");
       Decimal result = x * y + Decimal::Parse("2.5");
       Console::WriteLine(result);
    
    

    The output produced is 48.50000. Note that C++/CLI does not have a literal of type Decimal, hence the use of the Parse function.

  • Consider the case in which a class written in a language other than C++/CLI provides a public member whose name is a C++/CLI keyword. You can access that member via an intrinsic function having the form __identifier(x), where x is an identifier, a keyword, or a string literal. (However, the string literal form is reserved for implementations.) For example, to call a static function named delete in a class X, and taking no parameters, you'd use X::__identifier(delete)().
  • A literal field is a named compile-time constant defined in a class; as such, it must contain an initializer having a constant value. Although a literal field is used like a static data member, it cannot be declared static. The compiler replaces each use of a literal field with that field's value. A literal field can have any scalar type; however, the only constant values that can be used to initialize a handle are string literals and nullptr.

      public ref class Constants   // class name assumed for illustration
      {
      public:
          literal double PI = 3.1415926;
          literal int MinValue = -10, MaxValue = 10;
          literal int Range = MaxValue - MinValue + 1;
          enum Direction {North, South, East, West};
          literal Direction Home = North;
          literal System::String^ Title = "Annual Report";
      };
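Returning to the Decimal item above, the scale rules for addition, subtraction, and multiplication can be checked with a small model in plain C++. This is only a sketch of the rules as stated in the text, not of how System::Decimal is implemented; the type Dec and the functions rescale, add, sub, and mul are hypothetical names, and division is left out because its scale depends on the digits needed to preserve the result.

```cpp
// Toy model of the Decimal scale rules: a value is an integer with the
// decimal point removed, plus the number of digits that followed the point.
struct Dec {
    long long unscaled;  // e.g. 2340 for 2.340
    int scale;           // e.g. 3 for 2.340
};

// Re-express d at a larger scale by appending trailing zeros.
constexpr Dec rescale(Dec d, int scale) {
    while (d.scale < scale) {
        d.unscaled *= 10;
        ++d.scale;
    }
    return d;
}

constexpr int larger_scale(Dec a, Dec b) {
    return a.scale > b.scale ? a.scale : b.scale;
}

// Addition and subtraction: the result takes the larger of the two scales.
constexpr Dec add(Dec a, Dec b) {
    const int s = larger_scale(a, b);
    return Dec{rescale(a, s).unscaled + rescale(b, s).unscaled, s};
}

constexpr Dec sub(Dec a, Dec b) {
    const int s = larger_scale(a, b);
    return Dec{rescale(a, s).unscaled - rescale(b, s).unscaled, s};
}

// Multiplication: the result's scale is the sum of the two scales.
constexpr Dec mul(Dec a, Dec b) {
    return Dec{a.unscaled * b.unscaled, a.scale + b.scale};
}
```

With this model, 23.00 * 2.000 + 2.5 is written as add(mul(Dec{2300, 2}, Dec{2000, 3}), Dec{25, 1}), which yields an unscaled value of 4850000 at scale 5, that is, 48.50000, in agreement with the output quoted for the Decimal example.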
    
    

Reader Exercises

Here are some things you might want to do:

  1. Using ildasm, see if, in the Microsoft implementation, a stack-based Point really is allocated on the stack. Compare the code generated with that for a corresponding handle initialized using gcnew.
  2. Using ildasm, look at the metadata signature for Point's copy constructor, operator==, and operator=. You'll see some strange-looking stuff involving modreq and the type IsImplicitlyDereferenced. To understand this a bit more, refer to the draft C++/CLI Standard [1] clause "CLI Libraries." Also, write a ref class R that has two overloads for a function F, defined as void F(R^){} and void F(R%){}. Since these are legitimate overloads, the compiler needs some way to distinguish between them behind the scenes. And that's where optional and required modifiers (modopts and modreqs, respectively) come in.
  3. Given the declarations Point^ hp; and Point p;, make sure you understand the meaning of the following expression statements:

       hp = %p;
       p = *hp;
       hp->X = 2;
       (*hp).X = 2;
       p.X = 2;
       (%p)->X = 2;
    
    

  4. How can you pass a handle to a function and have that function change the location to which that handle points? (Hint: Think pointer to handle.)
  5. Look at String::Format's documentation.
  6. Read the documentation for System::Decimal. In particular, look at the functions with names having the prefix op_. These map to operator functions. For example, op_Addition is called when two Decimal expressions are added using the + symbol.
  7. Both Decimal(23.00) and Decimal::Parse("23.00") result in Decimal objects; however, why are the resulting values different? What's the difference between Decimal(23.00) and Decimal(23)?

References

[1] http://www.plumhall.com/ecma/index.html.

