A Natural Home for Open Source

A description of the first Open Source / Open Science conference, held at Brookhaven National Laboratory this autumn, which brought figures from the open source movement face-to-face with computational scientists and engineers.
Greg Wilson


October 01, 1999
URL: http://www.drdobbs.com/a-natural-home-for-open-source/184411210


A Report on "Open Source / Open Science '99"


The first Open Source / Open Science conference was held at Brookhaven National Laboratory (BNL) on Long Island on October 2. Its principal purpose was to bring together open source developers and scientific application programmers who are using, or could benefit from, open source software.

In many ways, science is the original open source project. From Galileo's day onward, science has separated authorship from ownership, giving scientists credit for new theories or experimental results without according them ownership. To borrow a term popularized by Eric Raymond, science is a "gift economy", whose members' principal reward is their peers' recognition of their contributions. Open source, therefore, ought to come naturally to scientists.

Peer review is a second reason for scientists to move to open source development. Reproducibility is the basis of all experimental science: an experimental result is considered valid only if a disinterested party could reproduce it based on the published description. In many cases, the only adequate description of a computational simulation is the program that was run. Sharing source code therefore allows scientists to review, as well as extend, one another's work.

A third, equally pragmatic factor that operates in favor of open source is the small size of most scientific communities. While Linux has millions of potential users, no more than a handful of people will ever want to use a package that simulates calcium deposition on colloidal substrates. As shown by the divisions between tcsh and bash, or KDE and GNOME, the open source model by itself does not prevent redundant effort. However, the absence of zero-sum commercial pressures does make it easier for interested parties to find common ground, or at least to leverage one another's good ideas.

The need for scientists to pool their efforts is particularly clear in the field of high-performance computing. A decade ago, more than a dozen vendors were selling parallel supercomputers of various descriptions. Today, all of those companies have either folded or been taken over. As Fred Johnson (on secondment from NIST to the Department of Energy's Office of Science) observed, supercomputing is moving back to the "roll your own" model of the 1960s. The key difference today is that Beowulf-style machines can take advantage of the price and performance of commercial off-the-shelf (COTS) components, and of the robustness and tweakability of Linux, MPI, g++, and other tools.
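
For readers who haven't seen it, message passing on such a cluster can look surprisingly plain. The sketch below (mine, not from any of the talks) is the canonical MPI starting point: each process discovers its rank and reports in. With an implementation such as MPICH or LAM installed, it would typically be compiled with mpicc and launched across the cluster's nodes with mpirun.

    /* Canonical MPI starting point: every process on the cluster
       reports its rank.  Assumes an MPI implementation such as
       MPICH or LAM; compile with mpicc, run with, say,
       "mpirun -np 4 ./a.out". */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                /* join the computation */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I?  */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* how many of us?      */

        printf("Process %d of %d checking in\n", rank, size);

        MPI_Finalize();                        /* leave cleanly */
        return 0;
    }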

Supercomputer projects of this kind were the subject of talks by Yuefan Deng of SUNY, and Tom Throwe of BNL, while two others --- Malcolm Capel's on a crystallography project at BNL, and Bill Rooney's on a medical imaging project --- described applications based wholly or in part on open source software. In discussion afterward, Rooney pointed out that one thing holding open source software back is the lack of accountability. If a hospital purchases a CAT scanner from a commercial company, for example, the hospital does not take on a liability cost due to possible bugs in the scanner's software, since claims arising from such bugs will be borne by the software's authors. If the same scanner uses open source software, however, the hospital could be the one left holding the bag. One solution to this problem might be the emergence of a re-insurance market for open source software, i.e., companies that inspect or test open source software, then offer insurance against faults it may contain. Such companies could equally well insure against server downtime, lost or mis-processed data, and so on. By doing so, they would not only make open source software more palatable for enterprise applications, but also put pressure on vendors of closed source software to quantify the costs of their unreliability.

Of course, when the advocates of open source start talking about putting pressure on vendors, the vendor they have in mind is Microsoft. As at every other open source gathering I've attended, there was a lot of gratuitous Microsoft-bashing. (One Linux devotee at the conference told me that he didn't think DDJ should print articles about Windows programming. I pointed out that this is what 85% of programmers write code for, but I don't think I changed his mind.) I don't enjoy looking at the Blue Screen of Death any more than the next person, but I think open source zealots are doing themselves, scientists, and the general public a disservice by defining themselves in terms of who they're not, and by turning their noses up at technologies such as COM (the only widely used component model in the world) simply because they were born in Redmond.

Open source advocacy made up the bulk of Bruce Perens' talk, titled "What is Open Source?" He was followed by Dan Gezelter, a chemist from Notre Dame, who gave what I considered the most interesting talk of the day. Gezelter discussed how the notions of peer review and publication credit could be applied to open source software. In particular, he argued that scientists should cite the software that they use, just as they cite other scientists' research, and that scientists should be given the same credit for such citations as they are given for citations of the articles they publish. Among other things, citation would encourage scientists to review software in the same way that they review papers, which would undoubtedly help improve its quality. (As one participant observed, a bug in Linux makes your computer crash. A bug in your eigenvalue routine, on the other hand, might only make your results wrong in the eighth decimal place, which is harder to spot.)
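
That observation is easy to make concrete. The toy program below (my own contrivance, nothing presented at the conference) sums the same series in two different orders. Both answers look plausible; the disagreement shows up only far down in the decimals, which is exactly the kind of wrongness that slips past casual inspection.

    /* A contrived example of numerical error that is hard to spot:
       summing the first million terms of the harmonic series in two
       different orders.  Mathematically the answers are identical;
       in floating point they differ in the low-order digits. */
    #include <stdio.h>

    #define N 1000000

    int main(void)
    {
        double forward = 0.0, backward = 0.0;
        int i;

        for (i = 1; i <= N; i++)      /* largest terms first */
            forward += 1.0 / i;
        for (i = N; i >= 1; i--)      /* smallest terms first: more accurate */
            backward += 1.0 / i;

        printf("forward:    %.15f\n", forward);
        printf("backward:   %.15f\n", backward);
        printf("difference: %g\n", forward - backward);
        return 0;
    }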

Two other talks that I found almost as interesting were Kent Koeninger's discussion of how SGI decided to make its XFS file system open source, and Bill Horn's discussion of OpenDX, IBM's newly opened scientific data-visualization package. Koeninger described XFS as SGI's crown jewels, and talked about the difficulty of persuading people inside SGI that it actually made sense to put one of its key technologies out in the open, where business competitors could inspect it. While OpenDX is a niche product by comparison, Horn's talk was yet more evidence that the corporate world is beginning to take the open source model seriously, as was Jon Leech's talk on OpenGL and GLX. (Slides from Jon Leech's talk are available at http://reality.sgi.com/opengl/talks/osos99/.) Finally, Mark Galassi described the process that has produced the GNU Scientific Library (GSL), an open source numerical library that duplicates much of the functionality of the "Numerical Recipes" codes, without any of the licensing hassles.
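
To give a flavor of what calling GSL looks like, here is a minimal sketch based on the library's documented special-function interface (with the caveat that GSL was still pre-1.0 at the time, and its details in flux):

    /* Minimal GSL sketch: evaluate a regular Bessel function, the kind
       of special-function call scientists often reach for "Numerical
       Recipes" to get.  With current GSL, link with -lgsl -lgslcblas -lm. */
    #include <stdio.h>
    #include <gsl/gsl_sf_bessel.h>

    int main(void)
    {
        double x = 5.0;
        double y = gsl_sf_bessel_J0(x);   /* J0(x) */

        printf("J0(%g) = %.18g\n", x, y);
        return 0;
    }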

The conference closed with a panel discussion moderated by Jon "Maddog" Hall. Many of the questions from the audience were about intellectual property rights, funding, and other day-to-day concerns of working scientists that open source development might affect.

I enjoyed the conference overall, but felt it would have benefited from a tighter focus. There are plenty of venues in which to discuss how to cool multi-CPU Linux clusters, for example, but many fewer for talks such as Gezelter's or Koeninger's. I was also disappointed that the twin issues of training and quality received so little attention. A major reason for the success of Linux, Apache, and similar projects is that the excitement of working out in the open has engaged the attention of some of the best programmers around. While the bazaar model has proved effective in the hands of such uber-nerds, it is more likely to result in chaos when practiced by programmers who have spent the last few years thinking about quantum chemistry, rather than about the pros and cons of multiple inheritance. Those gripes aside, I look forward to the next conference in the series, and to seeing how open source development fares in its most natural environment.


These op/eds do not necessarily reflect the opinions of the author's employer or of Dr. Dobb's Journal. If you have comments, questions, or would like to contribute your own opinions, please contact us at [email protected].
