


The .NET Cost: Who Pays?

July 01, 2002



In the first two articles of this series ("Polyglot Programming" and "Achieving Interoperability," Beyond Objects, May and June 2002), I discussed the unprecedented level of language interoperability that .NET provides to developers, provided that compiler writers have done their job. However, this interoperability has a cost; you'll have to observe a number of constraints:

  • First, you must, obviously, use a language for which a compiler is available for .NET, so the compiler will be able to generate intermediate language (IL) code that follows the .NET object model's rules.

  • Your code may also need to be verifiable to pass the security rules imposed by the .NET security model. This isn't an absolute requirement, and for some languages—especially untyped languages—it will never be met. For others, verifiability may be a compiler option: Upon request, the compiler will generate verifiable code, perhaps by imposing some further requirements on your source texts, and possibly at some performance cost. You should strive for verifiability if you want your code to be used on sites that may not automatically trust you; for example, if it's to be downloaded off the Internet.

  • Finally, if your code must interface with software elements written in other .NET languages, it must comply with the Common Language Specification (CLS).

The Role of the CLS
The CLS is a set of 41 rules designed to ensure harmonious cooperation among .NET languages. It's part of the Common Language Infrastructure (CLI), the specification of the basic .NET architecture standardized by the ECMA standards body (you can find a copy at www.ecma.ch). The rules play two complementary roles:

  • They restrict what you can write; more precisely, what your language's compiler can do with your classes if they're visible to the outside world. Such rules are in addition to the requirements imposed by the .NET object model. For example, rule 40 specifies that any thrown exception must be an object whose type is a descendant of the library class System.Exception, whereas IL would actually let you throw exceptions of other types. If you want your language to be CLS compliant, the compiler must ensure this rule in exported classes.

  • The rules require that you accept foreign code that satisfies appropriate conditions. For example, the just-cited rule 40 also requires that any language provide an exception mechanism able to handle objects of type System.Exception.

In the first of these roles, the CLS specification is negative, as it states what you may not offer to other languages. Its second role is positive: The rules state what you must accept from other languages. There's no real contradiction, because the specification's two roles are complementary: To enable various language models to communicate, the CLS must define the maximum of what a language may export to the rest of the world, and the minimum that it must be able to import from the rest of the world. Compliance is in fact defined at three levels, not just two, because you may "import" a class either by being its client or by inheriting from it (see "Abiding by CLS").

Abiding by CLS
Compliance with the Common Language Specification is defined on three levels.

  1. Framework CLS compliance: Use only those constructs permitted by the CLS rules.
  2. Consumer CLS compliance: Permit use of any framework-compliant class.
  3. Extender CLS compliance: Permit inheritance from any framework-compliant class.

—Bertrand Meyer

A software product is framework compliant if the compiler-generated code that it exposes to the rest of the world observes the CLS rules. For example, it should never trigger any exceptions other than those of type System.Exception or a descendant. A language implementation is consumer compliant if it enables its classes to "consume" (be a client of) framework-compliant classes. It is extender compliant if it enables its classes to "extend" framework-compliant classes; that is to say, inherit from them with all the associated privileges: adding new features and redefining (overriding) inherited features. Extender compliance is pretty taxing, since it means that you must find a way to support or at least emulate all framework-compliant language mechanisms, including some that might be quite far from your language's core concepts.

Some of the "negative" rules actually facilitate the task of consumers and extenders. For example, rule 10 dictates that while overriding (redefining) an inherited routine, you may not change its export status, even though the basic object model allows it. This protects extender languages, since it means that they aren't required to provide a mechanism for changing the export status of inherited routines.

CLS Compliance in Practice
Some of the CLS rules can appear scary at first if you rely on a language model that varies significantly from the Java-C#-VB.NET family. To keep them in perspective, remember the following caveats:

  • CLS compliance matters only for software elements that you wish to export to modules written in other languages, or import from them. So as long as you're talking to your own friends on your own team, you can indulge in whatever pleasures you've enjoyed in the past. It's only when other teams join the game that you must start thinking about maintaining proper appearance.

  • Even then, it's still only about appearance. What you really do is between you and your conscience; CLS compliance matters only for what you reveal to the rest of the world. As long as the view you present is CLS compliant, it's no one's business that it might serve as a cover to non-CLS politically correct games. Don't ask, don't tell.

Let's explore these two points further, if only to reduce the risk of a heart attack when you encounter some of the actual rules. The first is important because of the practical nature of multilanguage applications. It's not realistic, in an application containing C++, C#, Eiffel and Cobol elements, to expect them all to talk to elements written in one or more of the other languages. A project is multilanguage because it consists of a number of subprojects, each written in a particular language; in practice, each subproject will usually include a few bridge modules that talk to other languages. CLS compliance affects only these bridge modules; typically a small subset of the software. Everywhere else, each subproject can behave as if it were a single language.

The other main source of multilanguage combination, cited at the beginning of this article, is the use of libraries from another language. The visible classes of such libraries should be framework compliant; classes that use them will have to be consumer compliant and, for the usually small subset that needs to inherit from library classes, extender compliant.

The complementary observation concerns appearance. Even in a bridge module, you can often pursue CLS-deviant practices as long as you present them, for outside consumption, in CLS-compliant clothing. Here's an example: Rule 16 specifies that CLS arrays start their indexing at zero. Counting from zero is part of the sad legacy of C. (How many fingers on your right hand? Count with me: 0, 1, 2, 3, 4!) What if your language indexes arrays from 1, as in Fortran, or lets you specify arbitrary bounds, as in Eiffel? Well, all that really counts is to pretend to the rest of the world that your array is a good CLS citizen. So if you have a Fortran array indexed from 1 to 100, or an Eiffel array from 1901 to 2000 (perhaps to keep information associated with years of the twentieth century), your CLS-aware compiler will export it to the rest of the world as if it were a solid 0-to-99 citizen.
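The re-indexing trick amounts to subtracting the lower bound. Here is a minimal Python sketch of the idea, assuming a hypothetical BoundedArray class (the name and its methods are invented for illustration and belong to no .NET or Eiffel library). It keeps the native bounds for its own clients and hands out a zero-based view for export:

```python
class BoundedArray:
    """Array with arbitrary integer bounds, stored zero-based internally.

    Hypothetical illustration only; the class and method names are assumptions.
    """

    def __init__(self, lower, upper, fill=None):
        if upper < lower:
            raise ValueError("upper bound must not be below lower bound")
        self.lower = lower
        self.upper = upper
        self._items = [fill] * (upper - lower + 1)

    def _offset(self, index):
        # Translate a native index into the zero-based storage position.
        if not self.lower <= index <= self.upper:
            raise IndexError(f"index {index} outside {self.lower}..{self.upper}")
        return index - self.lower

    def get(self, index):
        return self._items[self._offset(index)]

    def put(self, index, value):
        self._items[self._offset(index)] = value

    def exported_view(self):
        # What other languages would see: a plain zero-based sequence.
        return list(self._items)


years = BoundedArray(1901, 2000, fill=0)
years.put(1901, "first")
years.put(2000, "last")
view = years.exported_view()  # positions 0 through 99
```

Inside the module, clients keep writing years.get(1901); only the exported view is zero-based. A real compiler would do this translation in generated code rather than in a wrapper class.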

So don't despair when you face a seemingly unacceptable rule; remember that it's only for bridge classes, and only for show. It may look like you have to adopt the state religion, but in practice you may get away with a few genuflections in the right public places.

You and Me: Enforcing the CLS
I've been freely relying on the term "you," as in "you must use only exceptions of a type conforming to System.Exception." But who is "you": the application programmer? The language designer? The compiler?

You can be any or all of these:

  • To guarantee extender compliance, the language designer must ensure that the language has mechanisms to emulate all the constructs supported by the CLS.

  • To enable the production of framework-compliant code, the compiler writer must provide the corresponding compilation option.

  • To produce CLS-compliant subsystems or libraries, the application programmer must stay away from non-CLS constructs when writing the bridge classes intended to interact with other languages.

The last two points go together. A programming language will, almost inevitably, offer mechanisms that aren't CLS compliant. Even in C#, whose semantics is closest to the .NET object model and the CLS, you can easily write non-CLS-compliant code; for example, by using non-CLS-compliant types such as native unsigned int. If, as a programmer, you want to be sure that certain classes are CLS compliant, you'll ask the compiler to generate CLS-compliant code through a compiler option. For a class flagged with this option, the compiler might:

  • Reject it if the class uses language constructs that the compiler can't map to CLS-compliant .NET constructs.

  • Accept the class, but for certain constructs generate different code (for example, less efficient) than that which it generates when CLS compliance isn't required.

Instead of a compiler option, the current C# compiler produces warnings on request. The Eiffel compiler provides a compiler option that doesn't reject any construct, but may, in some cases, generate different code.
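To make the warning approach concrete, here is a rough Python sketch of the kind of check a compiler might run on a class flagged for compliance. The tuple representation of members and the function name cls_warnings are assumptions made for this example, not any compiler's actual interface. The type names listed are the CLR's unsigned integer and signed byte types, which fall outside the CLS:

```python
# CLR primitive types that are not CLS compliant.
NON_CLS_TYPES = {
    "System.SByte",
    "System.UInt16",
    "System.UInt32",
    "System.UInt64",
    "System.UIntPtr",  # the "native unsigned int"
}


def cls_warnings(members):
    """Return warnings for exported members whose signatures use non-CLS types.

    `members` is a list of (name, is_exported, [type names]) tuples; this toy
    representation is an assumption made for the example.
    """
    warnings = []
    for name, is_exported, type_names in members:
        if not is_exported:
            continue  # only what is shown to the outside world matters
        for type_name in type_names:
            if type_name in NON_CLS_TYPES:
                warnings.append(f"{name}: {type_name} is not CLS compliant")
    return warnings


members = [
    ("count", True, ["System.UInt32"]),     # exported, unsigned: flagged
    ("scratch", False, ["System.UInt64"]),  # private: ignored
    ("total", True, ["System.Int32"]),      # exported, compliant: fine
]
result = cls_warnings(members)
```

Note that the private member goes unreported even though it uses an unsigned type, which is the "only for show" principle at work.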

If you've requested the generation of CLS-compliant code, it will be marked as such, at the class or assembly level. This is achieved through a custom attribute, System.CLSCompliantAttribute, whose constructor takes a boolean argument set to True, signifying compliance, as per rule 2: "Members of non-CLS-compliant [classes] shall not be marked CLS compliant." The attribute can be set for a class or for an entire assembly; then individual classes may override the value set for the assembly.

If, in a consumer or extender language, you write a class that's a client of a foreign class or inherits from it, the compiler for your language will check that the System.CLSCompliantAttribute is set to True for the foreign class.
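The marking and the check can be pictured with a small Python sketch. It follows the description given here (a class-level value, when present, overrides the assembly-level one) and is not a model of the actual .NET attribute machinery; both function names are hypothetical:

```python
def is_marked_compliant(assembly_flag, class_flag=None):
    """Resolve the effective compliance marking for a class.

    assembly_flag: value set for the whole assembly (True or False).
    class_flag: value set on the class itself, or None if the class
    carries no marking of its own.
    """
    if class_flag is not None:
        return class_flag  # the class-level value overrides the assembly's
    return assembly_flag


def may_use_foreign_class(assembly_flag, class_flag=None):
    # The check a consumer or extender compiler would make before letting
    # a class be a client of, or inherit from, the foreign class.
    return is_marked_compliant(assembly_flag, class_flag)


inherits_marking = may_use_foreign_class(True)         # assembly marked, class silent
opted_out = may_use_foreign_class(True, False)         # class overrides to False
unmarked = may_use_foreign_class(False)                # nothing claims compliance
```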

Life With the CLS
Rule 5 states that all the names used in a class must be "distinct," except where the names are identical and "resolved via overloading." Overloading, the most masochistic device ever introduced, means that you can give the same name to several methods as long as they differ by at least one argument type. This is a rare example of a facility that has no known advantage, and many documented problems (it's confusing, and conflicts with object-oriented mechanisms such as polymorphism and redefinition). Nevertheless, due to the influence of C++ and Java, it's in the .NET object model and, worse, in the CLS.

Should you worry about overloading if your language doesn't have it, and you want to be CLS compliant? The answer seems to be no, as you can just ignore this mechanism, like a computer's machine instruction that a compiler never uses for its generated code. But this covers only framework compliance. To be consumer compliant, and especially extender compliant, you do have to care. Extender compliance means that a class in your language must be able to inherit from a class that includes overloaded methods; for example, a C# class with the methods write (string s), write (int n) and write (float r, string format).

In the original, they are each called write, but in your own class, this would cause an error, since you can't have overloading.

At first, this case sounds like a show-stopper, until you realize that although you must be able to inherit overloaded methods, nothing forces you to inherit them under their original names. That would be an impossible requirement to enforce anyway, because various languages have different naming conventions: Visual Basic, as we noted in the first article, allows hyphens in identifiers, which most other languages reject; C lets you start an identifier with an underscore, but this is not universal. So what overloading means in practice is that compilers for consumer and extender languages have to provide a mechanism for "demangling" (name ambiguity resolution), to make sure that names that were mangled in the original look different to their users. For example, the above methods could be renamed using a simple demangling algorithm based on the types, into write_string, write_int and write_float_string.

The Eiffel team has proposed such an algorithm as an informal standard, to avoid every non-overloaded language reinventing its own conventions. The only remaining negative consequence of overloading is that consumers and extenders of overloaded languages will need some extra documentation, since the original documentation includes overloaded names that are not directly usable.
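A type-based renaming of this kind takes only a few lines. The Python sketch below is one plausible version, written for illustration; it is not the Eiffel team's proposal, and the function name demangle and the pair representation of methods are assumptions. It appends the argument type names to any method name that occurs more than once:

```python
from collections import Counter


def demangle(methods):
    """Give each overloaded method a distinct name built from its argument types.

    `methods` is a list of (name, [argument type names]) pairs. Names that
    occur only once are left unchanged.
    """
    counts = Counter(name for name, _ in methods)
    renamed = []
    for name, arg_types in methods:
        if counts[name] > 1 and arg_types:
            # Overloaded: suffix the name with its argument types.
            renamed.append("_".join([name] + arg_types))
        else:
            renamed.append(name)
    return renamed


overloads = [
    ("write", ["string"]),
    ("write", ["int"]),
    ("write", ["float", "string"]),
    ("flush", []),
]
names = demangle(overloads)
# names == ["write_string", "write_int", "write_float_string", "flush"]
```

A production scheme would also have to handle clashes between a generated name and an existing one, and type names that aren't legal identifier fragments in the target language; this sketch ignores both.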

The support for overloading in the CLS is a design flaw. Even if it had any conceptual justification, overloading would still be a language concept, and one that concerns not the deeper semantic properties of a language, but the external appearance of software texts—a mere facility for the program writer. It has no place in a general OO model; even less in a scheme like the CLS, whose very purpose is to enable many languages to collaborate.

Overloading languages should never have been permitted to pollute the common conceptual setup with a marginal mechanism that complicates everyone's job. Instead, overloading should have been explicitly removed from the CLS, putting the onus on the overloading languages to provide a demangling algorithm to clean up any mangled names.

At least there's a fairly easy way out, as in the case of arrays. Some rules are harder to deal with for certain languages. Rule 19, which prohibits interfaces from including static methods, complicates the emulation of multiple inheritance in a CLS-compliant way; it's all the more regrettable that the rule limits the usefulness of interfaces and goes against the principles of object technology.

Other problems arise with rules 21 and 22, which impose creation policies coming from C++ that are hard to justify through rational arguments: the need for constructors of a class to call constructors of parents and the impossibility of using a constructor to reinitialize an object. (Since it's often necessary to reinitialize objects, in practice this rule creates the need to duplicate methods and their code, always a major obstacle to good software practices.)

Here, too, languages with different models can find ways to cheat, but these two rules cause more nuisance to their users. They cause the most damage by following the Java tradition of trying to impose a single language model on everyone. This danger is ever-lurking in the .NET object model, and should be fought with the utmost energy, since it threatens the whole purpose of the technology. Fortunately, most of the CLS rules are reasonable and don't cause any major trouble for other .NET languages.

Toward a Language Renaissance
The CLS completes the careful and innovative design of multilanguage support under .NET. Only people who haven't looked carefully enough can push the "common denominator" view (the assertion that it's all a single language, anyway). That's plain wrong.

.NET provides the ability to map languages into a common model and hence obtain interoperability, while preserving the originality and independence of each language. This exciting architecture holds the potential of a programming language renaissance, enabling languages to compete on merit, not political prejudice, and the field to blossom as never before.

 

