Dr. Dobb's is part of the Informa Tech Division of Informa PLC


From C to C++: Interviews With Dennis Ritchie and Bjarne Stroustrup


DDJ: What was the rationale behind the decision to leave out the read, write, open, close, and create functions?

DR: Those functions are viewed as being quite specific to the Unix system. Other operating systems might have great difficulty in supplying things that work the same way those do. The idea of the original pre-ANSI standard I/O library was to make it possible to implement those I/O routines in a variety of operating systems. It took us a couple of tries to reach that particular interface. The machines we had here were the PDP-11 running Unix, a Honeywell 6000 running GECOS, and some IBM 360s running various IBM systems. We wanted to have standard I/O routines that could be used in all the operating systems, even those that didn't have anything like Unix's read and write. The committee felt that it was better to let the IEEE and other Unix standardization groups handle that. They specifically avoided putting things in the C library that were Unix-specific unless they had meaning in other systems.
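As a rough illustration of the portable interface described here, the following sketch copies one stream to another using only standard I/O calls (getc, putc, ferror), with no reference to Unix's read or write. The function name copy_stream and its return convention are assumptions made for this example and do not come from the interview.

```c
#include <assert.h>
#include <stdio.h>

/* Copy everything from `in` to `out` using only standard I/O calls.
   Returns the number of bytes copied, or -1 on a read or write error.
   copy_stream is a hypothetical name chosen for this illustration. */
long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;          /* write error */
        count++;
    }
    /* getc returns EOF both at end of file and on error; tell them apart. */
    return ferror(in) ? -1 : count;
}
```

Because the routine touches nothing outside the standard library, it should compile unchanged on any system that supplies a conforming stdio, whether or not that system has Unix-style file descriptors.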

They did another thing that people don't quite understand. They explicitly laid out the name space that a standard compiler is allowed to usurp or claim. In particular, the guarantee is that there is a finite list of names that the compiler and the compiler system take up. These are simple names, beginning with underscore, and are listed in the back of the standard, something like keywords. You are allowed to use any name that isn't on this list. In an ANSI-conforming world, you are allowed to define your own routine called read or write and even run it on a Unix system. It's guaranteed that this will be your routine and that your use of the name does not conflict with any I/O that the library itself does on your behalf. The Unix library authors will be constrained to have an internal name for read that you can't see so that if you bring a C implementation from a big IBM machine or from an MS-DOS machine and you happen to use the name read for your own routine, it will still compile and run on a Unix system even though there's a system call named read. They have circumscribed the so-called name space pollution by saying that the system takes these names and no others.

DDJ: How can they be sure that an implementor won't need names other than those on the list?

DR: There are rules for how the internal people can generate names, namely these underscore conventions. The end user is not allowed to use underscore names because any of these might be used internally. There is a problem, though. They made a list of things and said they'll do this and no more, and that helps. But there is still a problem for the writer of a library who wants to sell or distribute it. You're in a bind because you don't know all the underscore names that all the implementations are going to use. If you have your own internal names, you can't be sure that they're not going to conflict somewhere. If they have underscores, they might conflict with the underlying implementation. If they don't, then they might conflict with things that your end users are going to use. The C committee did not solve the problem that other languages have tackled explicitly. There are other ways of controlling the name space problem. They made a convention that helps, but it certainly didn't solve the real problem. It solved it enough to improve the situation. The basic problem with an uncontrolled name space is that if you write a program, and it just uses some name that you made up, it may be actually difficult to find out that this is not the same name as some random routine that's used internally by your system library. Unfortunately, we here at Bell Labs are in a bad position to notice this and do something about it because in our group we've simultaneously developed the compiler and the library and the Unix system, and so people here tend to know the names.
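One common way for a library writer to reduce the risk described above can be sketched as follows. It is not a technique proposed in the interview, only an illustration under stated assumptions: helpers that never leave a source file are declared static so their names have internal linkage, and public names carry a project prefix instead of a leading underscore. The prefix ddj_ and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Internal helper: `static` gives the name internal linkage, so it is
   invisible outside this file and cannot clash with a user's names or
   with the implementation's. */
static size_t count_chars(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* Public entry point: no leading underscore (those names belong to the
   implementation), and a hypothetical "ddj_" prefix to stay clear of
   names the library's users are likely to pick. */
size_t ddj_text_length(const char *s)
{
    assert(s != NULL);
    return count_chars(s);
}
```

This only narrows the problem. A prefix is a convention, not a guarantee, and internal names shared across several source files cannot be static, so they still need a prefix and can still collide.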

So, to summarize X3J11, the two largest things the committee did were function prototypes and the standardization of the library. It was more work than anybody expected, but I'm perfectly happy with what they did. The only problem was it took twice as long as they thought.

DDJ: Do you see a potential for other standard extensions to C, beyond those added by X3J11?

DR: One of the major excuses they give for not doing something is that there's no practice, no prior art. So obviously people will try to create prior art for the things they'd like to have happen. One such group is the Numerical C Extensions Group.

There are people who have strong views about what should happen. The idea is to get together with this group and agree that these are the things we need to have, so let's make some rules so that when people try things out, we'll all be trying it the same way, and we'll have a coherent story to tell, if there are going to be these extensions.

Most of them have to do with IEEE arithmetic issues, exceptions and such, for example. There's a core of things that are more general, and one that interests me is variable arrays with adjustable sizes. One of the things C does successfully is deal with single-dimension arrays that can be variable in size, but it doesn't deal with multi-dimensioned arrays that are variable at all. This is an important lack for numeric types, because it makes it hard to write library routines that manipulate arrays. Multiplying two arrays is a bit painful in C if the arrays are variable in size. You can do it but you have to program it in detail and the interface doesn't look pleasant. That's an obvious need, and I volunteered to look at how it might be done.
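To show what programming it in detail can look like, here is a minimal sketch of multiplying two matrices whose sizes are known only at run time. It assumes each matrix is stored as a flat, row-major array of doubles and that the caller passes the dimensions explicitly; the name mat_mul and the parameter order are chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply an n-by-m matrix a by an m-by-p matrix b, storing the
   n-by-p result in c. Each matrix is a flat, row-major array, so
   element (i, j) of a matrix with `cols` columns is at [i * cols + j].
   mat_mul is a hypothetical name chosen for this illustration. */
void mat_mul(const double *a, const double *b, double *c,
             size_t n, size_t m, size_t p)
{
    size_t i, j, k;

    assert(a != NULL && b != NULL && c != NULL);
    for (i = 0; i < n; i++) {
        for (j = 0; j < p; j++) {
            double sum = 0.0;

            for (k = 0; k < m; k++)
                sum += a[i * m + k] * b[k * p + j];
            c[i * p + j] = sum;
        }
    }
}
```

The index arithmetic that a fixed-size declaration such as double a[2][3] would let the compiler handle has to be written out by hand, and nothing in the interface ties the pointers to the dimensions passed alongside them.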

The NCEG will probably try to become official. They will affiliate themselves either with X3J11 or as an IEEE standardization organization. This would give them more clout. Also, many of the companies involved worry about legal issues. Companies who are members of informal groups deciding standards worry about anti-trust, whereas if they are members of official, blessed standards organizations, then they can contribute. They worry about being accused of going off into a corner and doing things behind other people's backs. It's better to do it in the open. This may be just some lawyer's nightmare. NCEG will probably become a subcommittee of X3J11.

DDJ: One non-ANSI extension to C is C++, a superset language that surrounds C with disciplines and paradigms that go beyond its original intent as a procedural language. Can you comment on how appropriate that is and how successful it has been?

DR: Let me confess at the start that I know less about C++ than I probably should. C is a very low-level language on a variety of fronts. The kinds of operations that it performs are quite basic. The control over names and visibility is basic. The defects or limitations of C in this area are most evident when you get into a large project where you need strong standards, rules, and mechanisms outside the language. Language developments such as C++ are trying to supply some of the structure within the rules of the language for controlled visibility of name space and are trying to encourage various kinds of modularization. This is good, I suppose.

C was designed in an environment where modularity was encouraged not so much by the language but by the kinds of programs we wrote. In the Unix system, the tradition is for small utilities that work together as tools, and the interfaces between them were set by the conventions and rules of the operating system, i.e., pipelines and so forth. The complexity of the pieces was kept low by custom. Commands tend to be simple. In the world today, there's a certain amount of admiration for that point of view. Certainly the appreciation for that style is part of the reason for the growth of Unix. People now are undertaking the building of much bigger systems, and things that we handled by convention ten or fifteen years ago must be handled by more explicit means. C++ is one such attempt.

Bjarne decided to design a compatible superset of C and to translate the C++ language into C code. That approach is not without its problems. First, having decided that C++ is going to be largely compatible with C, he's under pressure every time he departs from that. Whether it's because of some accident, because ANSI changed something, or because he feels that there's something he has to differ in, people are going to complain and get confused. Second, he is constrained by the choice to make a C++ to C translator possible; that is, he is constrained, as C was, by the existing tools of the various systems. The whole separate compilation business in C++ is made a lot harder by the desire to make it work with existing tools. If he could have simply designed a language and implemented it, then a lot of the anguish would have been avoided.

DDJ: Rumor is that within Bell Labs, C++ is now called C, and C is called "old C." Any truth in that?

DR: I've asked Bjarne not to say "old C," and, as far as I know, he has complied with that request.

DDJ: Colleges and universities have started offering courses in C. Some C tutors have observed that many instructors either don't understand C well enough or they don't understand teaching well enough to insulate the novice student from the kinds of things you can do in C, things that the student cannot grasp. In light of that, and as compared to Pascal, how do you view C as a potential teaching language?

DR: Obviously, C was never designed to be a teaching language. It was designed as a tool to express the kind of programs that we were trying to write at the time. And it's fairly low level in that concepts like pointers have a prominent role. I would not argue that C is a particularly good language for teaching programming. Pascal was explicitly designed for that.

Pascal's main fault is that you cannot use Pascal as originally designed to express all the things you need to, certainly not in a systems environment, and not for general applications either, because of explicit constraints that are built into the language. C was, from the very start, designed to do all the things that we found necessary in order to express ourselves, and little design thought was given to preventing people from using its powerful features.

Nevertheless, it's possible to teach C in a way that's reasonably safe if you start with parts of the language that are similar to other procedural languages. Then you can teach C's more unusual aspects -- pointers, for example -- as cliches or set ways of expressing array manipulations and so forth. Later you can gradually widen out into the more general things possible with pointer manipulations.

I have not had the experience that the tutors have had. Part of the difficulty with being in a position like this is that you have very little opportunity to see what the novice really feels. But perhaps the reason there are not better instructors is that things have grown fast, and there might be people teaching C who only recently took the introductory course on the language themselves.

DDJ: Would you attempt a prediction for the future of the C language?

DR: I think the period of C's largest growth is over, although it will be increasingly used and it probably will not change very fast. The new language developments based on C will be on successors such as C++ or perhaps some things we haven't heard of. In terms of what C tried to do, I think it succeeded fairly well. The goals were reasonably modest. There's still plenty of work to be done finding languages that have the touch of reality that C has, work where you handle real problems in real environments as opposed to dealing with elegant creations that can't be used. Sometimes things can't be used just because the compilers don't exist on the machines people have. Sometimes it's because there are simply flaws in the design, not from the language point of view, but from the point of view of what the language ends up doing in the real world. And in that respect, C seems to have worn fairly well.

