A description of the first Open Source / Open Science conference held at Brookhaven National Laboratory this autumn, which brought figures from the open source movement face-to-face with computational scientists and engineers. You can also visit the Open Source / Open Science '99 site.
Greg Wilson
October 01, 1999
URL:http://www.drdobbs.com/a-natural-home-for-open-source/184411210
The first Open Source / Open Science conference was held at Brookhaven
National Laboratory (BNL) on Long Island on October 2. Its principal
purpose was to bring together open source developers and scientific
application programmers who are using, or could benefit from, open
source software.
In many ways, science is the original open source project. From
Galileo's day onward, science has separated authorship from ownership,
giving scientists credit for new theories or experimental results,
without according them ownership. Using a term popularized by Eric
Raymond, science is a "gift economy", whose members' principal reward
is their peers' recognition of their contributions. Open source,
therefore, ought to come naturally to scientists.
Peer review is a second reason for scientists to move to open source
development. Reproducibility is the basis of all experimental
science: an experimental result is considered valid only if a
disinterested party could reproduce it based on the published
description. In many cases, the only adequate description of a
computational simulation is the program that was run. Sharing source
code therefore allows scientists to review, as well as extend, one
another's work.
A third, equally pragmatic factor that operates in favor of open
source is the small size of most scientific communities. While Linux
has millions of potential users, no more than a handful of people will
ever want to use a package that simulates calcium deposition on
colloidal substrates. As shown by the divisions between tcsh and
bash, or KDE and GNOME, the open source model by itself does not
prevent redundant effort. However, the absence of zero-sum commercial
pressures does make it easier for interested parties to find common
ground, or at least to leverage one another's good ideas.
The need for scientists to pool their efforts is particularly clear in
the field of high-performance computing. A decade ago, more than a
dozen vendors were selling parallel supercomputers of various
descriptions. Today, all of those companies have either folded or
been taken over. As Fred Johnson (on secondment from NIST to the
Department of Energy's Office of Science) observed, supercomputing is
moving back to the "roll your own" model of the 1960s. The key
difference today is that Beowulf-style machines can take advantage of
the price and performance of commercial off-the-shelf (COTS)
components, and of the robustness and tweakability of Linux, MPI, g++,
and other tools.
Supercomputer projects of this kind were the subject of talks by
Yuefan Deng of SUNY, and Tom Throwe of BNL, while two others ---
Malcolm Capel's on a crystallography project at BNL, and Bill Rooney's
on a medical imaging project --- described applications based wholly
or in part on open source software. In discussion afterward, Rooney
pointed out that one thing holding open source software back is
the lack of accountability. If a hospital purchases a CAT scanner
from a commercial company, for example, the hospital does not take on
a liability cost due to possible bugs in the scanner's software, since
claims arising from such bugs will be borne by the software's authors.
If the same scanner uses open source software, however, the hospital
could be the one left holding the bag. One solution to this problem
might be the emergence of a re-insurance market for open source
software, i.e., companies that inspect or test open source software,
then offer insurance against faults it may contain. Such companies
could equally well insure against server downtime, lost or
mis-processed data, and so on. By doing so, they would not only make
open source software more palatable for enterprise applications, but
also put pressure on vendors of closed source software to quantify the
costs of their unreliability.
Of course, when the advocates of open source start talking about
putting pressure on vendors, the vendor they have in mind is
Microsoft. As at every other open source gathering I've attended,
there was a lot of gratuitous Microsoft-bashing. (One Linux devotee
at the conference told me that he didn't think DDJ should print
articles about Windows programming. I pointed out that this is what
85% of programmers write code for, but I don't think I changed his
mind.) I don't enjoy looking at the Blue Screen of Death any more
than the next person, but I think open source zealots are doing
themselves, scientists, and the general public a disservice by
defining themselves in terms of who they're not, and by turning their
noses up at technologies such as COM (the only widely-used component
model in the world) simply because they were born in Redmond.
Open source advocacy made up the bulk of Bruce Perens' talk, titled
"What is Open Source?" He was followed by Dan Gezelter, a chemist
from Notre Dame, who gave what I considered the most interesting talk
of the day. Gezelter discussed how the notions of peer review and
publication credit could be applied to open source software. In
particular, he argued that scientists should cite the software that
they use, just as they cite other scientists' research, and that
scientists should be given the same credit for such citations as they
are given for citations of the articles they publish. Among other
things, citation would encourage scientists to review software in the
same way that they review papers, which would undoubtedly help improve
its quality. (As one participant observed, a bug in Linux makes your
computer crash. A bug in your eigenvalue routine, on the other hand,
might only make your results wrong in the eighth decimal place, which
is harder to spot.)
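That observation is easy to demonstrate. As an illustrative sketch (my example, not one shown at the conference), the one-pass "textbook" variance formula is mathematically equivalent to the two-pass version, yet suffers catastrophic cancellation when the values are large relative to their spread --- the kind of quiet numerical bug that never crashes anything:

```python
def variance_naive(xs):
    # One-pass "textbook" formula: E[x^2] - E[x]^2.
    # Correct algebraically, but subtracts two huge nearly-equal
    # numbers, so almost all significant digits cancel away.
    n = len(xs)
    s = sum(xs)
    sq = sum(x * x for x in xs)
    return (sq - s * s / n) / n

def variance_twopass(xs):
    # Two-pass formula: subtract the mean first, then square.
    # The deviations are small, so no cancellation occurs.
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

# Three samples with a large mean; the true variance is 2/3.
data = [1e8 + x for x in (1.0, 2.0, 3.0)]
print(variance_naive(data))    # visibly wrong
print(variance_twopass(data))  # ~0.6667, i.e. the true value 2/3
```

With IEEE-754 doubles, the naive version loses essentially all of its precision on this input, while the two-pass version is exact --- yet both are "the same" formula on paper.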
Two other talks that I found almost as interesting were Kent
Koeninger's discussion of how SGI decided to make its XFS file system
open source, and Bill Horn's discussion of OpenDX, IBM's newly opened
scientific data-visualization package. Koeninger described XFS as
SGI's crown jewels, and talked about the difficulty of persuading
people inside SGI that it actually made sense to put one of its key
technologies out in the open, where business competitors could inspect
it. While OpenDX is a niche product by comparison, Horn's talk was
yet more evidence that the corporate world is beginning to take the
open source model seriously, as was Jon Leech's talk on OpenGL and
GLX. (Slides from Jon Leech's talk are available at http://reality.sgi.com/opengl/talks/osos99/.)
Finally, Mark Galassi described the process that has produced
the GNU Scientific Library (GSL), an open source numerical library
that duplicates much of the functionality of the "Numerical Recipes"
codes, without any of the licensing hassles.
The conference closed with a panel discussion moderated by Jon
"Maddog" Hall. Many of the questions from the audience were about
intellectual property rights, funding, and other day-to-day concerns
of working scientists that open source development might affect.
I enjoyed the conference overall, but felt it would have benefited
from a tighter focus. There are plenty of venues in which to discuss
how to cool multi-CPU Linux clusters, for example, but many fewer for
talks such as Gezelter's or Koeninger's. I was also disappointed that
the twin issues of training and quality received so little attention.
A major reason for the success of Linux, Apache, and similar projects
is that the excitement of working out in the open has engaged the
attention of some of the best programmers around. While the bazaar
model has proved effective in the hands of such uber-nerds, it is more
likely to result in chaos when practiced by programmers who have spent
the last few years thinking about quantum chemistry, rather than about
the pros and cons of multiple inheritance. Those gripes aside, I
look forward to the next conference in the series, and to seeing how
open source development fares in its most natural environment.
These op/eds do not necessarily reflect the opinions of the author's
employer or of Dr. Dobb's Journal. If you have comments, questions,
or would like to contribute your own opinions, please contact us at
[email protected].
A Report on "Open Source / Open Science '99"