History and Goals of Modula-2


The language was defined in 1978 (10 years after Pascal) and implemented by L. Geissmann, S.E. Knudsen, and C. Jacobi on the PDP-11. Its small store of available memory (28,000 16-bit words or 56K bytes) caused many obstacles and was the reason for the first compiler's five-pass structure. In the summer of 1979, the compiler was completed, and at the same time, the first Lilith prototype became operational. The operating system Medos-2 had been constructed by S.E. Knudsen concurrently, and together with the compiler it was transported from the PDP-11 to Lilith within three weeks. This was in itself an encouraging feat in software engineering and proof of Modula-2's usefulness as a system-implementation language. After a year's in-house use, I published the report on Modula-2 and we released the Modula-2 compiler to interested parties (March 1980). The defining report is included in the tutorial book Programming in Modula-2 (N. Wirth, New York: Springer-Verlag, 1982), which I prepared up to the camera-ready stage with the aid of Lilith and a document formatting system that I programmed, of course, in Modula-2.

As mentioned already, Modula-2 grew out of Pascal and incorporates a few major and a fair number of minor improvements. The single most outstanding added facility is the module structure. Basically, this facility allows you to partition programs into units with relatively well-defined interfaces. More specifically, it allows you to control the visibility of declared objects and to hide them from places where they had better remain unknown. Since it plays such an important role, we shall now look briefly into the history of its development.

Although the principle of information hiding was much discussed in the early 1970s, it is perhaps Modula's merit to place it consistently into the framework of a clearly defined language. To the best of my knowledge, it was David Parnas who first coined the expression "information hiding." Both Tony Hoare and Per Brinch Hansen gave it form by connecting it to the class facility of the programming language Simula. This is essentially, in terms of Pascal, a record type; the set of instantiations of this type forms what Simula terminology calls a class. In contrast to Pascal records, the Simula class allows you to associate procedures with the data represented by the field identifiers. Hoare and Brinch Hansen then postulated that while the names of the procedures would normally be visible, those of the data would remain hidden except within the associated procedures. This feature was implemented in Brinch Hansen's Concurrent Pascal and embodied the principle of information hiding, which in the meantime (1975) had been promoted to that of data abstraction. Listing 1 exemplifies the issue; it uses a liberal Pascal notation ("N" is a constant).

TYPE queue =
   CLASS n, in, out: INTEGER;  (*n = no. of filled slots; initially 0*)
      buffer: ARRAY [0..N-1] OF REAL;

   PROCEDURE put(x: REAL);
   BEGIN IF n = N THEN HALT (*full*);
      buffer[in] := x; in := (in + 1) MOD N; n := n + 1
   END;

   PROCEDURE get(VAR x: REAL);
   BEGIN IF n = 0 THEN HALT (*empty*);
      x := buffer[out]; out := (out + 1) MOD N; n := n - 1
   END
   END (*queue*)

Listing 1: Sample queue routine in Pascal

In a program using this class (type), one might declare variables:

q0, q1: queue

and thereafter access them by statements like:

q0.put(13.7); q1.get(v)

but statements like "q0.n := 237" or "i := q0.out", which evidently interfere with the presented implementation of the abstraction of a queue, would be disallowed. More importantly, they would be prevented by the compiler, which does not "see" the field identifiers "n," "in," and "out" that are hidden inside the class (record) declaration.

Unfortunately, these proposals intertwined several independent concepts like visibility, instantiation, indirection of access, concurrency, and mutual exclusion. Both authors had actually postulated these kinds of classes to embody areas of mutual exclusion in multiprocessing systems, and the facility became more widely known as a monitor (in which the HALT statements are replaced by synchronization operations). In the development of the experimental multiprogramming language Modula-1, I strove for clarity of concept and was convinced that a substantial disentangling of the various intertwined concepts was mandatory. Together with H. Sandmayr, I found a possible solution in the structure then called module, which would concern the aspect of visibility only. This, I believe, was the major breakthrough, because in all other languages the visibility issue had always remained intimately connected with that of existence. In particular, it was now possible to declare sets of static, global objects that were visible from selected parts of the program only. This is typically desirable to encapsulate certain permanent parts of a system (like device drivers, storage allocators, window handlers, etc.). I enhanced this module facility with so-called import and export lists that allow the explicit control of visibility across the "module wall" of each individual object. Furthermore, true to Algol and Pascal tradition, modules can be nested. An export now signifies the extension of visibility to the outside, and import signifies its extension to the inside (see Figure 1).

Figure 1: Crucial to the structure of Modula-2 is the concept of the module, which may be nested inside other modules. Within each module, the visibility of objects to other modules may be controlled via the IMPORT and EXPORT lists.
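As an illustration of the module wall, the queue of Listing 1 might be recast as a nested (local) module along the following lines. This is only a sketch under stated assumptions: the names QueueDemo and Queue, the capacity N = 8, and the use of the library modules InOut and RealInOut (common in compilers that follow Programming in Modula-2, but not universal) are choices made for the example and are not taken from the article.

```modula2
MODULE QueueDemo;
  (* Illustrative sketch only; module names and N are assumptions. *)
  FROM InOut IMPORT WriteLn;          (* library names vary by compiler *)
  FROM RealInOut IMPORT WriteReal;

  CONST N = 8;                        (* capacity, chosen arbitrarily *)

  MODULE Queue;
    IMPORT N;                         (* extends visibility to the inside *)
    EXPORT put, get;                  (* extends visibility to the outside *)

    VAR n, in, out: CARDINAL;         (* n = no. of filled slots; hidden *)
        buffer: ARRAY [0..N-1] OF REAL;

    PROCEDURE put(x: REAL);
    BEGIN
      IF n = N THEN HALT END;         (* full *)
      buffer[in] := x; in := (in + 1) MOD N; n := n + 1
    END put;

    PROCEDURE get(VAR x: REAL);
    BEGIN
      IF n = 0 THEN HALT END;         (* empty *)
      x := buffer[out]; out := (out + 1) MOD N; n := n - 1
    END get;

  BEGIN                               (* initialization of the hidden state *)
    n := 0; in := 0; out := 0
  END Queue;

  VAR v: REAL;

BEGIN
  put(13.7); put(2.5);
  get(v);                             (* first in, first out: v = 13.7 *)
  WriteReal(v, 12); WriteLn
  (* "n := 237" would be rejected here: n is not exported from Queue *)
END QueueDemo.
```

Unlike the class of Listing 1, this module is not a type and cannot be instantiated: there is exactly one queue, and its variables exist for the whole lifetime of the enclosing program. Only their visibility is restricted, which is the separation of concerns the text describes.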

Separation of Specification and Implementation

During my aforementioned stay at PARC in 1976, I became acquainted with the language Mesa, a Pascal offspring specifically designed to meet the needs of large system development. Mesa also incorporated an information-hiding feature. It also allowed you to encapsulate program parts into modules but lacked the ability to control the visibility of individual objects, and it entangled the facility once again with another feature, namely that of separate compilation, a facility of implementation. Its noteworthy contribution was the separation of the declarations of the exported objects from those of the ones to remain hidden. The former is called the definition part; the latter is the implementation part, which contains all those details that are relevant to the realization of the exported mechanisms, but not to their functional definition.

The combination of Mesa's module facility with split definition and implementation parts, and the (nestable) Modula-1 modules with controllable import and export resulted in Modula-2.

The facility of separate compilation posed some nontrivial problems. The key idea is that the compilation of a definition part results in a (compiled) symbol table (represented as a file). The file contains all information relevant to importers (clients) of that module. If, at a later time, another module imports objects from, say, modules M1 and M2, then the compiler accesses the previously generated symbol files of M1 and M2. Thus, the rules of type consistency are observed across module boundaries as well. This makes separate compilation genuinely helpful and safe, in contrast to independent compilation as known from assemblers and FORTRAN compilers, which is a misleading pitfall when used with high-level, data-typed languages.
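The split into definition and implementation parts can be sketched with the same queue. The block below shows three separate compilation units one after another (they would live in three files, and the file naming is compiler-specific); the module names Queue and Client and the capacity N = 16 are assumptions made for the example.

```modula2
(* Unit 1: the definition part. Compiling it yields the symbol file
   that clients are checked against. *)
DEFINITION MODULE Queue;
  (* Later editions of Programming in Modula-2 export every identifier
     declared here; early compilers required an explicit
     EXPORT QUALIFIED put, get; at this point. *)
  PROCEDURE put(x: REAL);
  PROCEDURE get(VAR x: REAL);
END Queue.

(* Unit 2: the implementation part. It may change freely without
   forcing clients to be recompiled. *)
IMPLEMENTATION MODULE Queue;
  CONST N = 16;                       (* capacity; not visible to clients *)
  VAR n, in, out: CARDINAL;           (* n = no. of filled slots *)
      buffer: ARRAY [0..N-1] OF REAL;

  PROCEDURE put(x: REAL);
  BEGIN
    IF n = N THEN HALT END;           (* full *)
    buffer[in] := x; in := (in + 1) MOD N; n := n + 1
  END put;

  PROCEDURE get(VAR x: REAL);
  BEGIN
    IF n = 0 THEN HALT END;           (* empty *)
    x := buffer[out]; out := (out + 1) MOD N; n := n - 1
  END get;

BEGIN
  n := 0; in := 0; out := 0
END Queue.

(* Unit 3: a client. The compiler checks these calls against the
   symbol file of Queue, so a call such as put(TRUE) is a type error. *)
MODULE Client;
  FROM Queue IMPORT put, get;
  VAR v: REAL;
BEGIN
  put(13.7);
  get(v)
END Client.
```

The client sees only the two procedure headings. The buffer, its size, and the counters belong to the implementation part alone.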

It is noteworthy that Ada incorporates this form of module in almost identical form, although under the name package. We can attest that this is one of the more essential features of any system language because we have made extensive use of it for the last five years. Regrettably, Ada designers have failed to restrict separate compilability to global modules.

Using Modules Effectively

During the last five years, we have seen that postulating and providing a new facility is one thing, and learning to make good use of it is another. The more intricate and sophisticated a facility is, the smaller is the chance that it will be used wisely. In fact, finding the appropriate structure for the data and program is the key to successful programming. With the module we have added another level of granularity in program structuring. The difficulties of finding a good partitioning (I carefully avoid the word "optimal") are cumulated at this level, because often the modules are the units that are constructed by different programmers. Their contracts, in fact, are the definition parts of their modules. The definition parts establish the interfaces, which constitute the first task in a system's design process. Lucky are those who hit a good solution at the outset, for any change affects all participants. If a definition module A has been changed in any way, then all modules that import A must be adapted (and at least be recompiled). This is not the case, however, if only the implementation part of A has been modified. Hence, a fair degree of decoupling is established. Extreme examples are the primary utility modules of an operating system, because they're used by virtually every program. The system may well be modified without hampering the users. However, the slightest change in a definition module will require the recompilation of all clients.

The first rule to be observed when you deal with modules is that the interfaces must be considered before implementations are attempted. The terser they are, the smaller the chance for mistakes and the need for changes. Interfaces should, by their very definition, be "thin."

A second observation is that a module usually hides a set of data and provides a set of operators to manipulate this data. By forcing the client to access this data via the offered procedures, the module's designer may guarantee that certain consistency conditions are always observed, i.e., always remain invariant. In the queue model shown in Listing 1, it is guaranteed that the counter n truly reflects the number of elements contained in the buffer and that their order of coming out of the queue is the same as that of going in.

As a consequence, a module is typically chosen as the collection of routines that operate on a set of data, which can be seen by the client as an abstraction defined by the accessible set of procedures.

Often, a module is also chosen as the collection of procedures that constitute a level of abstraction of data that is residing elsewhere. For example, a module containing a set of input and output routines such as:

ReadInteger(f,x) and WriteReal(f,x)

will allow you to think in terms of the abstract concept of a sequence of integers and real numbers, and to ignore the details of its implementation in terms of bits, bytes, buffers, files, disk sectors, etc.

Consequently, such a module is chosen in order to establish a new level of abstraction. The success of such an abstraction crucially depends on its rigorous definition and your willingness to genuinely ignore its implementation. Please don't misunderstand! I do not say to remain ignorant of its implementations, but rather only use an implementation's properties that are defined in terms of the abstraction. To cite a well-known example: if you think in terms of integers, it does not make sense to ask for the value of an integer's last bit, even if you know that it is represented as a sequence of bits. Instead you should ask whether the integer is odd.
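The integer example can be written down directly, since ODD is a standard function of Modula-2. The sketch below assumes the library module InOut for output; the module name Parity and the value 7 are arbitrary.

```modula2
MODULE Parity;
  (* Illustrative sketch; InOut is assumed to be available. *)
  FROM InOut IMPORT WriteString, WriteLn;

  VAR i: INTEGER;

BEGIN
  i := 7;
  (* Stay on the level of the abstraction: ask whether the integer is odd. *)
  IF ODD(i) THEN
    WriteString("odd")
  ELSE
    WriteString("even")
  END;
  WriteLn
  (* A test such as "0 IN BITSET(i)" would instead rely on how this
     particular machine lays out an INTEGER in a word, and on how it
     numbers the bits. *)
END Parity.
```

The first form holds on any implementation; the form mentioned in the comment is tied to one representation.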

Computers, Languages, and Commercialism

It is precisely the ability to think in terms of proper abstractions that is the hallmark of a competent programmer. Even more, he or she is expected to be able to jump from one level to another without mixing them up. A structured language is enormously helpful in this endeavor, but it does not do it for you. It is like with a horse: you may guide it to the water, but it has to do the drinking itself. I am afraid that this simple truth is in stark contrast to the numerous lulling advertisements being published in such abundance. They cleverly reinforce themselves with slogans like "Switching to Pascal solves all your (programming) problems" and "Our Computer speaks Pascal," and, in fact, represent nothing more than an extremely aggressive sales campaign.

Sooner or later, people will, through harsh experiences, realize that they have become victims of slogans and fads, and that owning the best of tools is worthless unless that tool is thoroughly understood. I am afraid that the modern trend of overselling can become counterproductive. I have seen progressive teachers proudly offering their students the chance to learn structured Pascal, and I quickly realized that the students had no inkling of what structure meant. And I have seen professional programmers proudly present Pascal programs abounding with neatly indented structures, comments (for documentation, of course), and lots of procedures and sophisticated data types. Upon closer inspection, however, the baroque nomenclatures and structures revealed themselves as dead weight. Sometimes, redesigning these programs led to drastic, even tenfold, reduction in their size and complexity. I sadly realized that a high-level programming language could not only be used to design beautiful programs with much less effort, but also to hide incompetence underneath an impressive coating of glamour and frills. The analogy to literature became all too evident. We must do our best to avoid the misuse of modern programming languages for the selling of lousy contents through enticing packaging. Style may be essential to achieve a good design, but ultimately it is the design, and not the style, that counts.

Let me emphasize the point: neither owning a computer nor programming in a modern language will itself solve any problems, not even yours. But it may be instrumental. Predominantly, I have noticed, more effort is spent on obtaining those instrumental tools than on mastering them. And this is a grave mistake. Perhaps the most effective precaution against it is this rule: Know what the tool is to achieve and what you are going to use it for before you acquire it. This holds for languages as well as computers: the more sophisticated the tool is, the more effort you will need for its mastery, the bigger will be the chance for its misuse, but, presumably, the higher the ultimate reward. I hope that this reward is not only measured in terms of problems solved and dollars earned, but also in the learners' satisfaction of having gained understanding, ability, and genuine insight.
