Darwin, Linux, and Radiation

10/16/2000

In the aftermath of LinuxWorld, the open source conference that took place in San
Jose, Calif., in August, we’re now being treated to press releases announcing Linux
as Almost Ready for the Desktop.

It is not.

Even if Linux were to achieve double-digit penetration among the world’s PC users, it
would be little more than an also-ran desktop OS. For Linux, the real action is
elsewhere. If you want to understand why Linux is the most important operating system in the world, ignore the posturing about Linux on the desktop, and pay attention to the fact that IBM has just ported Linux to a wristwatch, because that is the kind of news that illustrates Linux’s real strengths.

At first glance, Linux on a wristwatch seems little more than a gimmick–cellphone
displays and keypads seem luxurious by comparison, and a wristwatch that requires you to type “date” at the prompt doesn’t seem like much of an upgrade. The real import of the Linux wristwatch is ecological, though, rather than practical, because it
illustrates Linux’s unparalleled ability to take advantage of something called
“adaptive radiation.”

Let’s radiate

Adaptive radiation is a biological term that describes the way organisms evolve to
take advantage of new environments. The most famous example is Darwin’s finches. A single species of finch blew off of the west coast of South America and landed on the Galapagos Islands, and as these birds took advantage of the new ecological niches offered by the islands, they evolved into several separate but closely related species.

Adaptive radiation requires new environments not already crowded with competitors and organisms adaptable enough to take advantage of those environments. So it is with Linux–after a decade of computers acting as either clients or servers, new classes of devices are now being invented almost weekly–phones, consoles, PDAs–and only Linux is adaptable enough to work on most of them.

In addition to servers and the occasional desktop, Linux is being modified for use in
game machines (Indrema), Internet appliances (iOpener, IAN), handhelds (Yopy, iPAQ), mainframes (S/390), supercomputers (Los Lobos, a Beowulf cluster), phones (Japan Embedded Linux Consortium), digital VCRs (TiVo), and, of course, wristwatches. Although Linux faces fierce competition in each of these categories, no single competitor covers every one. Furthermore, given that each successful porting effort increases Linux’s overall plasticity, the gap between Linux’s diversity and that of its competitors will almost inevitably increase.

Where ‘good’ beats ‘best’

In a multidevice world, the kernel matters more than the interface. Many commentators (including Microsoft) have suggested that Linux will challenge Microsoft’s desktop monopoly, and among this camp it is an article of faith that one of the things holding Linux back is its lack of a single standardized interface. This is not merely wrong, it’s backward–the fact that Linux refuses to constrain the types of interfaces that are wrapped around the kernel is precisely what makes Linux so valuable to the individuals and companies adapting it for new uses. (The corollary is also true–Microsoft’s attempt to simply repackage the Windows interface for PDAs rendered early versions of WinCE unusable.)

Another lesson is that being merely good enough has better characteristics for adaptive
radiation, and therefore for long-term survival, than being Best of Breed.

Linux is not optimized for any particular use, and it is improved in many small
increments rather than large redesigns. Therefore, the chances that Linux will become a better high-availability server OS than Solaris, say, in the next few years are tiny. Although not ideal, Linux is quite a good server, whereas Solaris is unusable for game consoles, digital VCRs, or wristwatches. This will keep Linux out of the best of breed competition, because it is never perfectly tailored to any particular environment, but it also means that Linux avoids the best of breed trap. For any given purpose, best of breed products are either ideal or useless. Linux’s ability to adapt to an astonishing array of applications means that its chances of running on any new class of device are superior to those of any best of breed product.

The real action

The immediate benefits of Linux’s adaptive radiation ability are obvious to the Linux
community. Since nothing succeeds like success, every new porting effort increases both the engineering talent pool and the available code base. The potential long-term benefit, though, is even greater. If a Linux kernel makes interoperation easier, each new Linux device can potentially accelerate a network effect, driving Linux adoption still faster.

This is not to say that Linux will someday take over everything, or even a large subset
of everything. There will always be a place for “Best of Breed” software, and Linux’s
use of open protocols means its advantage is always in ease of use, never in locking out the competition. Nevertheless, only Linux is in a position to become ubiquitous across most kinds of devices. Pay no attention to the desktop sideshow–in the operating system world, the real action in the next couple of years is in adaptive radiation.

Time to Open Source the Human Genome

9/8/1999

The news this week that a researcher has bred a genetically smarter mouse is another precursor to a new era of genetic manipulation. The experiment, performed by Dr. Joe Tsien of Princeton, stimulated a gene called NR2B, producing mice with greatly heightened intelligence and memory. The story has generated a great deal of interest, especially as human brains may have a similar genetic mechanism, but there is a behind-the-scenes story which better illuminates the likely shape of 21st century medicine.

Almost from the moment that the experiment was publicized, a dispute broke out about ownership of NR2B itself, which is already patented by another lab. Like an information age land-grab, whole sections of genetic information are being locked behind pharmaceutical patents as fast as they are being identified. If the current system of private ownership of genes continues, the majority of human genes could be owned in less than two years. The only thing that could save us from this fate would be an open-source movement in genetics — a movement to open the DNA source code for the human genome as a vast public trust.

Fighting over mouse brains may seem picayune, but the larger issue is ownership of the genetic recipe for life. The case of the smarter mouse, patent pending, is not an
isolated one: among the human genes already patented are those linked to blindness (Axys Pharmaceuticals), epilepsy (Progenitor), and Alzheimer’s (Glaxo Wellcome). Rights to the gene which controls arthritis will be worth millions, breast cancer, billions, and
the company that patents the genes that control weight loss can write its own ticket. As the genome code becomes an essential part of industries from bio-computing and cybernetic interfaces to cosmetics and nutrition, the social and economic changes it instigates are going to make the effects of the Internet seem inconsequential. Unfortunately for us, though, the Internet’s intellectual property is mostly in the public domain, while the genome is largely — and increasingly — private.

It didn’t have to be this way. Patents exist to encourage investment by guaranteeing a
discoverer time to recoup their investment before facing any competition, but patent
law has become increasingly permissive in what constitutes a patentable discovery. There are obviously patents to be had in methods of sequencing genes and for methods of using those sequences to cure disease or enhance capability. But to allow a gene itself to be patented makes a mockery of “prior art,” the term which covers unpatented but widely dispersed discoveries. It is prior art which keeps anyone from patenting fire, or the wheel, and in the case of genetic information, life itself is the prior art. It is a travesty of patent law that someone can have the gene for Alzheimer’s in every cell of their body, but that the patent for that gene is owned by Glaxo Wellcome.

The real action, of course, is not in mouse brains but in the human genome. Two teams, one public and one private, are working feverishly to sequence all of the 100,000 or so genes which lie within the 23 pairs of human chromosomes. The public consortium aims to release the sequence into the public domain, while the private group aims to patent much of the genome, especially the valuable list of mutations that cause genetic disease. The competition between these two groups has vastly accelerated the pace of the work — moving it from scheduled completion in 2005 to next year — but the irony is that this accelerated timetable won’t give the public time to grasp the enormous changes the project portends. By this time next year, the fate of the source code for life on earth — open or closed — will be largely settled, long before most people have even begun to understand what is at stake.

View Source… Lessons from the Web’s Massively Parallel Development

First published April 1998.

How did the Web grow so quickly?

In the 36 months from January of 1993 to December of 1995, HTML went from being an unknown protocol to being the pre-eminent tool for designing electronic interfaces, decisively displacing almost all challengers and upstaging online services, CD-ROMs, and a dozen expensive and abortive experiments with interactive TV, and it did this while having no coordinated center, no central R&D effort, and no discernible financial incentive for the majority of its initial participants.

Ragtag though it was, the early Web, with its endless compendia of ‘Cool Bands’, badly scanned photos of pets, and ubiquitous “Favorite Links” lists pointing to more of same, was a world-building exercise which provided an unequalled hothouse lab for structuring information and designing interfaces. By the time the corporate players and media companies considered getting involved, many of the basic interface issues – the idea of Home pages, buttons and toolbars, the standards for placement of things like logos, navigational elements, help menus and the like – had already been worked out within a vast body of general practice generated by the pioneers.

More than any other factor, this ability to allow ordinary users to build their own Web sites, without requiring that they be software developers or even particularly savvy software users, caused the Web’s rapid rise. Web design, and in particular Web site design with its emphasis on architecture, interactivity, and structuring data in an implicit order or set of orders, is not graphic design but rather constitutes a special kind of low-level engineering. Prior to the invention of HTML, all software packages which implemented any sort of hypertext required the user to learn a programming language of some sort (Director, Hypercard). HTML, in contrast, allows users to create links without requiring an ability to write code.
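
To make that difference concrete, here is a hedged sketch of the markup an early user might have typed to create working hypertext; the filenames and the example.com URL are invented for illustration, and nothing more is required, with no programming language involved:

    <p>
      Here are my <a href="pets.html">badly scanned photos of pets</a>
      and my <a href="http://example.com/links.html">Favorite Links</a>.
    </p>

That single <a href> tag is the entire ‘program’ a link requires.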

INTERFACE ENGINEERING

Call HTML’s practitioners interface designers, information architects or, in keeping with the idea of world-building, even citizen-engineers: these are people who can think creatively about information as media, without necessarily being comfortable looking down the business end of a compiler. This becomes Principle 0 for this essay:

#0. Web site design is a task related to, but not the same as, software engineering.

Any project which offers a limitless variety of ways in which information can be presented and structured can happen both faster and better if there is a way for designers and engineers to collaborate without requiring either group to fundamentally alter their way of looking at the world. While it can’t hurt to have a designer see source code or a developer open a copy of Photoshop once in a while, environments which let each group concentrate on their strengths grow more rapidly than environments which burden designers with engineering concerns or force their engineers to design an interface whenever they want to alter an engine. The Web exploded in the space of a few years in part because it effected this separation of functions so decisively.

As opposed to software which seeks a tight integration between creation, file format, and display (e.g. Excel, Lotus Notes), HTML specifies nothing about the tools needed for its creation, validation, storage, delivery, interpretation, or display. By minimizing the part of the interface that is the browser to a handful of commands for navigating the Web, while maximizing the part of the interface that is in the browser (the HTML files themselves), the earliest experiments with the Web took most of the interface out of the browser code and put it in the hands of any user who could learn HTML’s almost babyish syntax.
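
As a rough illustration of where the interface now lives, consider a complete page of the kind any early user could have written (the title, filenames, and links here are invented for this sketch). Everything the reader sees and clicks ships inside this one file, while the browser contributes only a few generic commands such as Back and Reload:

    <html>
      <head><title>The Daily Example</title></head>
      <body>
        <h1>The Daily Example</h1>
        <!-- the 'toolbar' is just more markup, carried with the content -->
        <p>
          <a href="news.html">News</a> |
          <a href="archive.html">Archive</a> |
          <a href="about.html">About</a>
        </p>
        <p>Today's stories go here.</p>
      </body>
    </html>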

With Web design separated from the browser that was displaying it, countless alternative ways of structuring a site could be experimented with – sites for online newspapers, banks, stores, magazines, interfaces for commerce, document delivery, display, propaganda, art – all without requiring that the designers arrange their efforts with the browser engineers or even with one another.

This separation of interface and engineering puts site design in a fundamentally new relationship to the software that displays it:

#1. An interface can be integrated with the information it is displaying, instead of with the display software itself.

This transfer of the interface from ‘something that resides in the software and is applied to the data’ to ‘something that resides in the data and is applied to the software’ is the single most important innovation of the early Web.

REDUCING REQUIRED COORDINATION

Site designers with training in visual or conceptual aspects of organizing and presenting information can design for a Web browser without working with, or even needing to talk to, the people who designed the browser itself. Newsweek has to coordinate with America Online in order to make its content available through AOL, but it does not have to coordinate with Netscape (or the end-user’s ISP) to make its content available through Netscape.

This move away from centralized coordination and towards lateral development is good for the Web, and some basic principles about the way this is achieved can be derived from looking at the HTML/Browser split as one instance of a general class of "good" tools. The basic rule of good tools is:

#2. Good tools are transparent.

Web design is a conversation of sorts between designers, with the Web sites themselves standing in for proposal and response. Every page launched carries an attached message from its designer which reads "I think this is a pretty good way to design a Web page", and every designer reacting to that will be able to respond with their own work. This is true of all design efforts – cars, magazines, coffee pots – but on the Web, this conversation is a riot, swift and wide-ranging.

THE CENTRALITY OF OPEN HTML SOURCE TO THE WEB’S SUCCESS

The single factor most responsible for this riot of experimentation is transparency – the ability of any user to see, in the source code, the choices made by any other designer. Once someone has worked out some design challenge, anyone else should be able to adopt it, modify it, and make that modified version available, and so on.

Consider how effortless it would have been for Tim Berners-Lee or Marc Andreessen to treat the browser’s “View Source…” as a kind of debugging option which could have been disabled in any public release of their respective browsers, and imagine how much such a ‘hidden source’ choice would have hampered this feedback loop between designers. Instead, with this unprecedented transparency of the HTML itself, we got an enormous increase in the speed of design development. When faced with a Web page whose layout or technique seems particularly worth emulating or even copying outright, the question “How did they do that?” can be answered in seconds.

“X DOES Y”, NOT “X IS FOR Y”

This ability to coordinate laterally, for designers to look at one another’s work and to experiment with it without asking permission from a central committee, is critical to creating this speed of innovation. Once the tools for creating Web pages are in a designer’s hands, there is no further certification, standardization or permission required. Put another way:

#3. Good tools tell you what they do, not what they’re for.

Good, general-purpose tools specify a series of causes and effects, nothing more. This is another part of what allowed the Web to grow so quickly. When a piece of software specifies a series of causes and effects without specifying semantic values (gravity makes things fall to the ground, but gravity is not for keeping apples stuck to the earth’s surface, or for anything else for that matter), it maximizes the pace of innovation, because it minimizes the degree to which an effect has to be planned in advance for it to be useful.

The best example of this was the introduction of tables, first supported in Netscape 1.1. Tables were originally imagined to be just that – a tool for presenting tabular data. Their subsequent adoption by the user community as the basic method for page layout did not have to be explicit in the design of either HTML or the browser, because once its use was discovered and embraced, it no longer mattered what tables were originally for, since they specified causes and effects that made them perfectly suitable in their new surroundings.
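
A hedged sketch of that repurposing, with invented content and widths, shows the same cause-and-effect element doing both jobs: first the tabular data it was designed for, then an invisible two-column page grid:

    <!-- tables as intended: rows and columns of data -->
    <table border="1">
      <tr><th>Browser</th><th>Tables supported</th></tr>
      <tr><td>Netscape 1.1</td><td>yes</td></tr>
    </table>

    <!-- tables as adopted: an invisible layout grid -->
    <table border="0" width="600">
      <tr>
        <td width="150">navigation links</td>
        <td width="450">main article text</td>
      </tr>
    </table>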

A corollary to rule #3 is:

#3b. Good tools allow users to do stupid things.

A good tool, a tool which maximizes the possibilities for unexpected innovation from unknown quarters, has to allow the creation of everything from brilliant innovation through workmanlike normalcy all the way through hideous dreck. Tools which try to prevent users from making mistakes enter into a tar pit, because this requires that in addition to cause and effect, a tool has to be burdened with a second, heuristic sense of ‘right’ and ‘wrong’. In the short run, average quality can be raised if a tool intervenes to prevent legal but inefficient uses, but in the long haul, that strategy ultimately hampers development by not letting users learn from their mistakes.

THE INDEPENDENT RATE OF DEVELOPMENT

The browser/HTML combination as described not only increases the number of people who can work on different functions by reducing the amount of cross-training and coordination needed between designers and engineers (or between any two designers), it also changes the speed at which things happen by letting designers develop different parts of the system at independent rates.

By separating Web site design so completely from browser engineering, the rate of development of Web sites became many times faster than the rate of browser-engineering or protocol design.

These extremes are instructive when considered in light of the traditional software release schedule, where alterations to engineering and interface appear together in relatively infrequent versions. In the heat of structuring a site, a Web design team may have several people altering and re-examining an interface every few minutes. At the opposite extreme, in the three years from 1993 to 1995, the HTTP protocol was not changed once, and in fact, even in 1998, the majority of Web transactions still use HTTP 1.0. In between these extremes come rates of change from more to less rapid: changes in site structure (the relation between file storage and pointers), new versions of Web browsers, and new specifications for HTML itself. Designers could create, display, alter, beg, borrow, and steal interfaces as fast as they liked, without any further input from any other source.

CONCLUSION

The Web grew as quickly as it did because the independent rate of site design, freed from the dictates of browser engineering, was much faster than even its inventors had predicted. Tools that allowed designers to do anything that rendered properly, that allowed for lateral conversations through the transparency of the HTML source, and that removed the need for either compiling the results or seeking permission, certification or registration from anyone else led to the largest example of parallel development seen to date, and the results have been world-changing.

Furthermore, while there were certainly aspects of that revolution which will not be easily repeated, there are several current areas of inquiry – multi-player games (e.g. Half Life, Unreal), shared 3D worlds (VRML, Chrome), new tagset proposals (XML, SMIL), new interfaces (PilotOS, Linux) – which will benefit from examination in light of the remarkable success of the Web. Any project with an interface likely to be of interest to end users (create your own avatar, create your own desktop) can happen both faster and better if these principles are applied.

In Praise of Evolvable Systems

(First appeared in the ACM’s net_worker, 1996)

Why something as poorly designed as the Web became The Next Big Thing, and what that means for the future.

If it were April Fool’s Day, the Net’s only official holiday, and you wanted to design a ‘Novelty Protocol’ to slip by the Internet Engineering Task Force as a joke, it might look something like the Web:
  • The server would use neither a persistent connection nor a store-and-forward model, thus giving it all the worst features of both telnet and e-mail.
  • The server’s primary method of extensibility would require spawning external processes, thus ensuring both security risks and unpredictable load.
  • The server would have no built-in mechanism for gracefully apportioning resources, refusing or delaying heavy traffic, or load-balancing. It would, however, be relatively easy to crash.
  • Multiple files traveling together from one server to one client would each incur the entire overhead of a new session call.
  • The hypertext model would ignore all serious theoretical work on hypertext to date. In particular, all hypertext links would be one-directional, thus making it impossible to move or delete a piece of data without ensuring that some unknown number of pointers around the world would silently fail.
  • The tag set would be absurdly polluted and user-extensible with no central coordination and no consistency in implementation. As a bonus, many elements would perform conflicting functions as logical and visual layout elements.

HTTP and HTML are the Whoopee Cushion and Joy Buzzer of Internet protocols, only comprehensible as elaborate practical jokes. For anyone who has tried to accomplish anything serious on the Web, it’s pretty obvious that of the various implementations of a worldwide hypertext protocol, we have the worst one possible.

Except, of course, for all the others.

MAMMALS VS. DINOSAURS

The problem with that list of deficiencies is that it is also a list of necessities — the Web has flourished in a way that no other networking protocol has except e-mail, not despite many of these qualities but because of them. The very weaknesses that make the Web so infuriating to serious practitioners also make it possible in the first place. In fact, had the Web been a strong and well-designed entity from its inception, it would have gone nowhere. As it enters its adolescence, showing both flashes of maturity and infuriating unreliability, it is worth recalling what the network was like before the Web.

In the early ’90s, Internet population was doubling annually, and the most serious work on new protocols was being done to solve the biggest problem of the day, the growth of available information resources at a rate that outstripped anyone’s ability to catalog or index them. The two big meta-indexing efforts of the time were Gopher, the anonymous ftp index; and the heavy-hitter, Thinking Machines’ Wide Area Information Server (WAIS). Each of these protocols was strong — carefully thought-out, painstakingly implemented, self-consistent and centrally designed. Each had the backing of serious academic research, and each was rapidly gaining adherents.

The electronic world in other quarters was filled with similar visions of strong, well-designed protocols — CD-ROMs, interactive TV, online services. Like Gopher and WAIS, each of these had the backing of significant industry players, including computer manufacturers, media powerhouses and outside investors, as well as a growing user base that seemed to presage a future of different protocols for different functions, particularly when it came to multimedia.

These various protocols and services shared two important characteristics: Each was pursuing a design that was internally cohesive, and each operated in a kind of hermetically sealed environment where it interacted not at all with its neighbors. These characteristics are really flip sides of the same coin — the strong internal cohesion of their design contributed directly to their lack of interoperability. CompuServe and AOL, two of the top online services, couldn’t even share resources with one another, much less somehow interoperate with interactive TV or CD-ROMs.

THE STRENGTH OF WEAKNESS AND EVOLVABILITY

In other words, every contender for becoming an “industry standard” for handling information was too strong and too well-designed to succeed outside its own narrow confines. So how did the Web manage to damage and, in some cases, destroy those contenders for the title of The Next Big Thing? Weakness, coupled with an ability to improve exponentially.

The Web, in its earliest conception, was nothing more than a series of pointers. It grew not out of a desire to be an electronic encyclopedia so much as an electronic Post-it note. The idea of keeping pointers to ftp sites, Gopher indices, Veronica search engines and so forth all in one place doesn’t seem so remarkable now, but in fact it was the one thing missing from the growing welter of different protocols, each of which was too strong to interoperate well with the others.

Considered in this light, the Web’s poorer engineering qualities seem not merely desirable but essential. Despite all strong theoretical models of hypertext requiring bi-directional links, in any heterogeneous system links have to be one-directional, because bi-directional links would require massive coordination in a way that would limit its scope. Despite the obvious advantages of persistent connections in terms of state-tracking and lowering overhead, a server designed to connect to various types of network resources can’t require persistent connections, because that would limit the protocols that could be pointed to by the Web. The server must accommodate external processes or it would limit its extensibility to whatever the designers of the server could put into any given release, and so on.

Furthermore, the Web’s almost babyish SGML syntax, so far from any serious computational framework (Where are the conditionals? Why is the Document Type Definition so inconsistent? Why is the browsers’ enforcement of conformity so lax?), made it possible for anyone wanting a Web page to write one. The effects of this ease of implementation, as opposed to the difficulties of launching a Gopher index or making a CD-ROM, are twofold: a huge increase in truly pointless and stupid content soaking up bandwidth; and, as a direct result, a rush to find ways to compete with all the noise through the creation of interesting work. The quality of the best work on the Web today has not happened in spite of the mass of garbage out there, but in part because of it.

In the space of a few years, the Web took over indexing from Gopher, rendered CompuServe irrelevant, undermined CD-ROMs, and now seems poised to take on the features of interactive TV, not because of its initial excellence but because of its consistent evolvability. It’s easy for central planning to outperform weak but evolvable systems in the short run, but in the long run evolution always has the edge. The Web, jujitsu-like, initially took on the power of other network protocols by simply acting as pointers to them, and then slowly subsumed their functions.

Despite the Web’s ability to usurp the advantages of existing services, this is a story of inevitability, not of perfection. Yahoo and Lycos have taken over from Gopher and WAIS as our meta-indices, but the search engines themselves, as has been widely noted, are pretty lousy ways to find things. The problem that Gopher and WAIS set out to solve has not only not been solved by the Web, it has been made worse. Furthermore, this kind of problem is intractable because of the nature of evolvable systems.

THREE RULES FOR EVOLVABLE SYSTEMS

Evolvable systems — those that proceed not under the sole direction of one centralized design authority but by being adapted and extended in a thousand small ways in a thousand places at once — have three main characteristics that are germane to their eventual victories over strong, centrally designed protocols.

  • Only solutions that produce partial results when partially implemented can succeed. The network is littered with ideas that would have worked had everybody adopted them. Evolvable systems begin partially working right away and then grow, rather than needing to be perfected and frozen. Think VMS vs. Unix, cc:Mail vs. RFC-822, Token Ring vs. Ethernet.
  • What is, is wrong. Because evolvable systems have always been adapted to earlier conditions and are always being further adapted to present conditions, they are always behind the times. No evolving protocol is ever perfectly in sync with the challenges it faces.
  • Finally, Orgel’s Rule, named for the evolutionary biologist Leslie Orgel — “Evolution is cleverer than you are”. As with the list of the Web’s obvious deficiencies above, it is easy to point out what is wrong with any evolvable system at any point in its life. No one seeing Lotus Notes and the NCSA server side-by-side in 1994 could doubt that Lotus had the superior technology; ditto ActiveX vs. Java or Marimba vs. HTTP. However, the ability to understand what is missing at any given moment does not mean that one person or a small central group can design a better system in the long haul.

Centrally designed protocols start out strong and improve logarithmically. Evolvable protocols start out weak and improve exponentially. It’s dinosaurs vs. mammals, and the mammals win every time. The Web is not the perfect hypertext protocol, just the best one that’s also currently practical. Infrastructure built on evolvable protocols will always be partially incomplete, partially wrong and ultimately better designed than its competition.

LESSONS FOR THE FUTURE

And the Web is just a dress rehearsal. In the next five years, three enormous media — telephone, television and movies — are migrating to digital formats: Voice Over IP, High-Definition TV and Digital Video Disc, respectively. As with the Internet of the early ’90s, there is little coordination between these efforts, and a great deal of effort on the part of some of the companies involved to intentionally build in incompatibilities to maintain a cartel-like ability to avoid competition, such as DVD’s mutually incompatible standards for different continents.

And, like the early ’90s, there isn’t going to be any strong meta-protocol that pushes Voice Over IP, HDTV and DVD together. Instead, there will almost certainly be some weak ‘glue’ or ‘scaffold’ protocol, perhaps SMIL (Synchronized Multimedia Integration Language) or another XML variant, to allow anyone to put multimedia elements together and synch them up without asking anyone else’s permission. Think of a Web page with South Park in one window and a chat session in another, or The Horse Whisperer running on top with a simultaneous translation into Serbo-Croatian underneath, or clickable pictures of merchandise integrated with a salesperson using a Voice Over IP connection, ready to offer explanations or take orders.
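
What might such a ‘glue’ document look like? The sketch below is only a rough approximation of the SMIL idea, two independent media streams laid out and played side by side; the element names follow the general shape of SMIL rather than any exact specification, and the URLs are invented:

    <smil>
      <head>
        <layout>
          <region id="show" width="320" height="240"/>
          <region id="chat" width="320" height="240"/>
        </layout>
      </head>
      <body>
        <!-- 'par' plays its children at the same time, each in its own region -->
        <par>
          <video src="rtsp://example.com/episode" region="show"/>
          <text src="http://example.com/chat-session" region="chat"/>
        </par>
      </body>
    </smil>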

In those cases, the creator of such a page hasn’t really done anything ‘new’, as all the contents of those pages exist as separate protocols. As with the early Web, the ‘glue’ protocol subsumes the other protocols and produces a kind of weak integration, but weak integration is better than no integration at all, and it is far easier to move from weak integration to strong integration than from none to some. In five years, DVD, HDTV, Voice Over IP, and Java will all be able to interoperate because of some new set of protocols which, like HTTP and HTML, is going to be weak, relatively uncoordinated, imperfectly implemented and, in the end, invincible.