From The Chronicle
dated October 29, 1999.
By MICHAEL JENSEN
If you want elbow room at a party, just start discussing
the cultural significance of computer software. Even the most
convivial kitchen suddenly empties when you start extolling
the value of "open source" tools in the non-profit sector.
But wait, come back. This isn't a party, this is The
Chronicle, and it's worth hearing me out. I'm not just
indulging in "geekspeak," but am talking about something that
could have a dramatic impact on our culture.
There have been numerous upheavals and technical
transformations over the 15 years that I have been involved
with the computer technology of publishing, yet the
transformations in the last few years have been by far the
most dramatic. Every age seems revolutionary, I know, but I am
convinced that the technical choices we make over the next
three years, as individuals and as institutions, will have
repercussions for decades.
We need to decide what kind of relationship academe should
have with the tools that underpin its knowledge bases -- that
of a huge corporate customer that goes to private industry for
software, or of a supporter and underwriter of open and free
software tools that serve our needs.
The Internet has enabled broadly dispersed software developers
to collaborate on creating such free and accessible software
to meet specific needs. The most famous example is Linux. That
operating system was collaboratively developed around a kernel
originally written by Linus Torvalds, who was seeking to solve
some of the problems he was having with the operating system
Unix. Equally significant is Apache, a Web server that, with
Linux underpinning it, serves as the foundation of more than
half of all World-Wide Web sites.
The term "open source" means that the source code -- the
programs, readable by human beings, that are compiled to make
executable programs -- can be viewed, modified, and then
recompiled for one's own purposes. An open-source program like
Linux has had thousands of people check over (and often
modify) its source code to improve efficiency, stability, or
security. All changes are confirmed, either by the consent of
the group of users or by a key individual like Torvalds,
before becoming a permanent part of the source code. The maxim
"With enough eyes, every bug is shallow" is sometimes called
"Linus's Law": It means that, with enough programming egos
involved, every line of open-source code ends up being tight.
Proprietary software, by contrast, cannot be modified by
users, but can be changed only by its owner. The source code
remains a closely guarded secret, and it is to the owner's
advantage to make modifications only occasionally -- and to
market the changes with an "update fee."
I am not against proprietary software per se, but I do try to
use stable, open-source tools whenever possible. In the long
run, I believe, doing so best serves the things I hold dear --
education, knowledge, and the public's ability to gain access
to both.
Several issues should be considered. First, we should remember
that we in academe exist in the non-profit sector. The
president of the University of Arizona, Peter W. Likins,
recently noted the difference at a meeting at the Online
Computer Library Center in Ohio: "A for-profit's mission is to
create as much value for its stockholders as possible, within
the constraints of society. The non-profits' mission is to
create as much value for society as possible, within the
constraints of its money."
I've found that neat distinction useful. If the goal of most
of our ".org" and ".edu" organizations in academe and academic
publishing is to create value for society, then, as members of
the educational enterprise, we must take full advantage of our
strengths: our commitment to shared knowledge, our mission to
facilitate understanding, and our insistence on the correct
(rather than the most popular or prettiest) solution.
Second, we need to understand what writing computer programs
-- and then perfecting them -- entails. Most people think of
computer programs as magic black boxes (or is that black-magic
boxes?), which are too complicated for the uninitiated to
comprehend. That is partly because, for many years,
computer-programming languages were abstruse and almost
intentionally arcane (many still are). But that is beginning
to change. Computer programs aren't really black boxes;
they're engines that use the fuel of content to generate
power.
Programs allow plastic solutions to be crafted for concrete
problems: They are malleable, and can be refined and applied
to new problems, each time dramatically decreasing how long it
takes to find a new solution.
A flawed, but still useful, rule of thumb is that it takes a
programmer a year to write a completely new program, a month
to adapt it to a new task, a week to change it a third time,
and a day the fourth -- as long as the basic code can be
reused each time. A word-indexing program that I write in Perl
(an open-source programming language) to solve one problem can
be tailored to other kinds of textual indexing problems, and
each modification can be created ever more quickly.
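
The kind of reuse described above can be sketched in a few lines.
This is an illustration only: it is written in Python rather than
Perl, and the function names (build_index, words,
capitalized_names) and the sample sentences are assumptions made
for this sketch, not code from the National Academy Press. The
core indexing routine is written once; each new indexing task
supplies only a small tokenizer.

```python
import re
from collections import defaultdict


def build_index(lines, tokenize):
    """Map each token to the sorted line numbers on which it occurs.

    `tokenize` is any function that turns one line of text into a
    list of tokens; swapping it out adapts the index to a new task.
    """
    index = defaultdict(set)
    for line_number, line in enumerate(lines, start=1):
        for token in tokenize(line):
            index[token].add(line_number)
    return {token: sorted(numbers) for token, numbers in index.items()}


def words(line):
    """First task (hypothetical): index every word, ignoring case."""
    return re.findall(r"[a-z']+", line.lower())


def capitalized_names(line):
    """Second task (hypothetical): index capitalized words only,
    as rough candidates for proper names."""
    return re.findall(r"\b[A-Z][a-z]+\b", line)


# Sample text invented for this sketch.
text = [
    "Linux was built around a kernel written by Torvalds.",
    "Apache runs on Linux.",
]

word_index = build_index(text, words)
name_index = build_index(text, capitalized_names)
```

The second index required only one new function of two lines;
everything in build_index was reused unchanged, which is the
shrinking cost that the rule of thumb describes.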
The implications of that rule of thumb are enormous,
particularly when it comes to open-source tools.
There are a handful of tools that I use every day, with which
I've become quite proficient. I can now do quite intricate
manipulations of huge amounts of text very quickly, almost on
a whim, for two reasons: I know the capabilities and limits of
the programs intimately, and I have a large collection of
previously written programs from which I can snatch bits of
code.
That last point is key, because it's a microcosm of the larger
process that I hope to encourage with this article. With
open-source programs, not only do programmers have their own
craftwork to reuse, but they also have the craft of others.
Here at the National Academy Press (http://www.nap.edu), we are
developing tools that will be made open source when they are
finalized, and we use open-source programs whenever
possible. Our programs for our Web-based shopping cart, for
example, use open-source Perl code to perform mundane tasks
such as communicating with a data base. That allows our
programmer to spend his time developing other, less-common
code -- for instance, to conditionally display a book-cover
image within an order.
Because it's open source, we expect other publishers to view
the program, and we hope others will add innovations. Our 1.0
version is intended for our kind of scientific books and data
base, but the 2.0 version might be more easily tailored to
other data bases, and perhaps the 3.0 version will contain
tools for collecting and organizing diverse articles or
chapters for print-on-demand collections. If some other
publisher feels a greater urgency and writes any part of the
program before we get to it, we'd be delighted.
That kind of sharing builds a sense of community. It can mean
having hundreds, even thousands of craftspeople designing an
ideal set of tools: Think of it as information sculpture.
It can also save big bucks. Alan Kay, the pioneering systems
designer (once at Xerox PARC and Apple, and now at Disney), is
in the process of unveiling a truly astonishing framework for
developing Web-ready multimedia. It's called Squeak, and it's
an open-source project based on an earlier open-source project
developed at Xerox PARC. Kay and a small team of developers at
Disney Imagineering have honed the core of the tool, while a
looser group of developers worldwide -- "people I have never
physically met," says Kay -- have created the means to run it
on different operating systems (a process called "writing a
port").
Kay estimates that using the open-source approach has saved
the Squeak project between $5- and $7-million; it has also
produced a far better program, one that will be used by far more
people, than a closed-source effort would have. Programmers
with the chops to write a port of a system like Squeak are
exceedingly scarce, and the fact that they willingly undertook
to help Kay is a testament to the open-source movement. Such
people would never have volunteered their time and skill as a
gift to a for-profit enterprise. But Squeak is free,
open-source software that is likely to improve the world.
That's something a real programmer can sink his teeth into.
Finally, we should remember that research and scholarship are
fundamentally open-source enterprises. The research on which
scholarship builds is always cited; the methodology of any
study is explained as a repeatable framework; and theoretical
presumptions are made clear as part of any published argument.
Those principles generally lead authors to be careful, and
help other scholars to test an author's conclusions.
In the realm of software, only open-source tools make their
underpinnings readable. In many ways, only open-source
software fits philosophically with the fundamentals of
scholarship.
Universities and research institutions are heavy users of
software. Proprietary systems have their place -- I don't mean
to say we should never again use a proprietary program like
Microsoft Word -- but such programs should be chosen
intentionally, rather than accepted as inevitable.
If our graduates are predominantly trained in open-source
tools, the world's open-source library will grow and improve.
If every grant from the National Science Foundation presumes
that the resulting programs will be open source (unless a case
is made against doing so), better resources will be developed.
As our university programmers develop open-source solutions to
common problems (such as developing the underpinnings for a
data base of sound clips, or a self-teaching spell-checker, or
a content-mining software agent), other people at other
institutions can see how it was done, be saved the expense of
reinventing the wheel, perhaps improve the code, and help to
create at least a slightly improved world.
Encouraging a move away from keeping knowledge secret can
only do our society good. By supporting the open-source
culture, we can make sure that our nudges to the momentum
of our digital culture are aimed in the right direction.
Michael Jensen is director of publishing technologies at the
National Academy Press.