Broadcast Institutions, Community Values

This essay is an extension of a speech I gave at the BBC about the prospects for online community building by broadcast media. First published September 9, 2002 on the ‘Networks, Economics, and Culture’ mailing list.

There is a long history of businesses trying to harness the power of online communities for commercial ends. Most of these attempts have failed, for the obvious reasons. There are few products or services people care about in a way that would make them want to join a community, and when people are moved to speak out about a commercial offering, it is usually to complain.

Media organizations, however, would seem to be immune to these difficulties, because online media and online communities have the same output: words and images. Even here, though, there are significant obstacles to hosting community, obstacles peculiar to the nature of media. Much of the discipline a broadcast organization must internalize to do its job well is not merely irrelevant to community building, but actively harmful.

If you were a broadcast media outlet thinking about community building, here are five things you would think about:

1. Audiences are built. Communities grow.
2. Communities face a tradeoff between size and focus.
3. Participation matters more than quality.
4. You may own the software, but the community owns itself.
5. The community will want to build. Help it, or at least let it.

#1. Audiences are built. Communities grow.

Audiences are connected through broadcast. Everyone in the MSNBC audience sees MSNBC content broadcast outwards from the center. You can’t build a community this way, because the things that make a community worthwhile are provided by the members for one another, and cannot be replaced by things the hosting organization can offer. Communities are connected through what Greg Elin calls intercast — the communications that pass among and between interconnected members of a community.

Broadcast connections can be created by a central organization, but intercast connections are created by the members for one another. Communities grow, rather than being built. New members of an audience are simply added to the existing pool, but new members of a community must be integrated. Matt Jones uses the word “loam” to describe the kind of environment conducive to community formation. One of the most important things you can do to attract community is to give it a fertile environment in which to grow, and one of the most damaging things you can do is to try to force it to grow at a rapid pace or in a preset direction.

#2. Communities face a tradeoff between size and focus.

Communities are held together through intercast communications, but, to restate Metcalfe’s Law, the complexity of intercast grows faster than group size. This means that in an intercast world, uniformly dense interconnectedness becomes first hard and then impossible to support as a group grows large. The typical response for a growing community is to sub-divide, in either “soft” ways (overlapping social clusters) or “hard” ways (a church that splits into two congregations.)

Small groups can be highly focused on some particular issue or identity, but such groups can’t simply be inflated like a balloon, because a large group is a different kind of thing than a small one. Online groups that grow from small to large tend to lose their focus, as topic drift or factionalization appears.

Most broadcast organizations assume that reaching a large group is an unqualified good, so they push for size at any cost, and eventually bump into the attendant tradeoffs: you can have a large community, but not a highly focused one; you can have a focused community, but not a large one; or you can reach a large number of people focused on a particular issue, but it won’t be a community.

With these options, broadcast organizations will (often unconsciously) opt for the last one, simply building an audience and calling it a community, as in “The community of our readers.” Though this may make for good press release material, calling your audience a community doesn’t actually make it one.

#3. Participation matters more than quality.

The order of things in broadcast is “filter, then publish.” The order in communities is “publish, then filter.” If you go to a dinner party, you don’t submit your potential comments to the hosts, so that they can tell you which ones are good enough to air before the group, but this is how broadcast works every day. Writers submit their stories in advance, to be edited or rejected before the public ever sees them. Participants in a community, by contrast, say what they have to say, and the good is sorted from the mediocre after the fact.

Media people often criticize the content on the internet for being unedited, because everywhere one looks, there is low quality — bad writing, ugly images, poor design. What they fail to understand is that the internet is strongly edited, but the editorial judgment is applied at the edges, not the center, and it is applied after the fact, not in advance. Google edits web pages by aggregating user judgment about them, Slashdot edits posts by letting readers rate them, and of course users edit all the time, by choosing what (and who) to read.

Anyone who has ever subscribed to a high-volume mailing list knows there are people who are always worth reading, and people who are usually worth ignoring. This is a way of raising the quality of what gets read, without needing to control what gets written. Media outlets that try to set minimum standards of quality in community writing often end up squeezing the life out of the discussion, because they are so accustomed to filtering before publishing that they can’t imagine that filtering after the fact can be effective.

#4. You may own the software, but the community owns itself.

The relationship between the owner of community software and the community itself is like the relationship between a landlord and his or her tenants. The landlord owns the building, and the tenants take on certain responsibilities by living there. However, the landlord does not own the tenants themselves, nor their relations to one another. If you told tenants of yours that you expected to sit in on their dinner table conversation, they would revolt, and, as many organizations have found, the same reaction occurs in online communities.

Community is made possible by software, but the value is created by its participants. If you think of yourself as owning a community when you merely own the infrastructure, you will be astonished at the vitriol you will face if you try to force that community into or out of certain behaviors.

#5. The community will want to build. Help it, or at least let it.

Healthy communities modify their environment. One of the surprises in the design of software that supports community is that successful innovations are often quite shallow. We have had the necessary technology to build weblogs since 1994, but weblogs themselves didn’t take off until 5 years later, not because the deep technology wasn’t there, but because the shallow technology wasn’t. Weblogs are primarily innovations in interface, and, as importantly, innovations in the attitudes of the users.

Because communal innovation often hinges as much on agreements among users as protocols among machines, communities can alter their environments without altering the underlying technology. If you spend any time looking at LiveJournal (one of the best overall examples of good community engineering) you will see periodic epidemics of the “Which type of pasta are you”-style quizzes. (“You’re fusilli — short and twisted.”) The quizzes are not hosted on LiveJournal servers, but they have become part of the LiveJournal community.

If LiveJournal had decided to create a complete, closed experience, they could have easily blocked those quizzes. However, they didn’t mistake owning the database for owning the users (see #4 above), so they let the users import capabilities from elsewhere. The result is that the community’s connection to LiveJournal is strengthened, not weakened, because over time the environment becomes fitted to the community that uses it, even though nothing in the software itself changes.

Hard Work

If you want to host a community online, don’t kid yourself into believing that giving reporters weblogs and calling the reader comments “community” is the same as the real thing. Weblogs operate on a spectrum from media outlet (e.g. InstaPundit) to communal conversation (e.g. LiveJournal), but most weblogs are much more broadcast than intercast. Likewise, most comments are write-only replies to the original post in the manner of Letters to the Editor, rather than real conversations among the users. This doesn’t mean that broadcast weblogs or user comments are bad; they just don’t add up to a community.

Real community is a self-creating thing, with some magic spark, easy to recognize after the fact but impossible to produce on demand, that draws people together. Once those people have formed a community, however, they will act in the interests of the community, even if those aren’t your interests. You need to be prepared for this.

The hallmark of a successful community is that it achieves some sort of homeostasis, the ability to maintain an internal equilibrium in the face of external perturbations. One surprise is that if a community forms on a site you host, they may well treat you, the owner of the site, as an external perturbation. Another surprise is that they will treat growth as a perturbation as well, and they will spontaneously erect barriers to that growth if they feel threatened by it. They will flame and troll and otherwise make it difficult for potential new members to join, and they will invent in-jokes and jargon that makes the conversation unintelligible to outsiders, as a way of raising the bar for membership.

This does not mean that hosting community is never worthwhile — the communal aspects of sites like Slashdot and Kuro5hin are a critical source of their value. It just means that it is hard work, and will require different skills and attitudes than those necessary to run a good broadcast site. Many of the assumptions you make about the size, composition, and behavior of audiences in broadcast mode are actually damaging to community growth. To create an environment conducive to real community, you will have to operate more like a gardener than an architect.

Half the World

Version 1.03 | September 3, 2002

A good deal has been written about the digital divide, the technological gap that exists between the developed and developing world. If you wanted a striking illustration of the problem, you could turn to Thabo Mbeki’s speech at the Information Society and Development Conference in 1996, where he told the delegates “Half of humanity has not yet made a phone call.”

Or, if you prefer, Kofi Annan’s 2000 speech to the Australian Press Club, where he said “Half the world’s population has never made or received a phone call.” Or Thomas Homer-Dixon’s version, from a speech in 2001: “…half the people on the planet have never made a phone call.” Greg LeVert of MCI said it at a telecom conference in 1994; Richard Klugman of PaineWebber said it in The Economist in 1995; Jeffrey Sachs and Al Gore both said it in 1998; Reed Hundt and Steve Case both said it in 2000; Michael Moore and Newt Gingrich both said it in 2001, as did Carly Fiorina and Melinda Gates; and not content with merely half, Tatang Razak of Indonesia told the UN’s Committee on Information “After all, most of the people in the world have never made a phone call…”, in a speech from April of this year.

The phrase “Half the world has never made a phone call” or some variation thereof has become an urban legend, a widely believed but unsubstantiated story about the nature of the world. It has appeared countless times over the last decade, in essentially the same form and always without attribution. Where did that phrase come from? How did it take on such a life of its own? And, most importantly, why has it gotten so much airtime in the debate over the digital divide when it is so obviously wrong?

You Can’t Keep A Good Factoid Down 

The Phrase sounds so serious and important that only the boorish would question its accuracy. There is a kind of magical resonance in advancing arguments on behalf of half the world’s people, and it allows the speaker to chide the listener for harboring any lingering techno-optimism (“You think there’s a revolution going on? Half the world has never made a phone call!”) The Phrase establishes telecommunications access as a Big Problem and, by extension, validates the speaker as a Thinker-About-Big-Problems.

But saying “Half the world has never made a phone call” makes no more sense than saying “My car goes from 0 to 60” or “It rained 15 inches.” Without including the element of time, you cannot talk about rate, and it is rate that matters in dynamic systems. Half the world had never made a phone call on what date? And what has the rate of telecom growth been since that date? Because it is that calculation and only that calculation which could tell us anything important about the digital divide.

Static Statements About Dynamic Systems 

Virginia Postrel, in her book The Future and Its Enemies (ISBN 0684862697), suggests that the old distinctions of right and left are now less important than a distinction between stasists and dynamists. Stasists are people who believe that the world either is or should be a controlled, predictable place. Dynamists, by contrast, see the world as a set of dynamic processes. This distinction is the key to The Phrase. Anyone who uses it is affirming the stasist point of view, even if unconsciously, because they are treating telecommunications infrastructure as if it were frozen in time.

To think about rate, we need three things: original time, elapsed time, and speed of change. The figure first appeared in print in late 1994, when the Toronto Sun quoted it as part of Greg LeVert’s speech at TeleCon ’94. (Mr. LeVert was no stasist — though he seems to have accidentally bequeathed us The Phrase, no one now remembers that he introduced it as a way of dramatizing the magnitude of the coming change, and went on to predict a billion new phones by 2000.) Therefore we can restore the question of rate by asking what has happened to the number of telephones in the world between the beginning of 1995 and now.

Restoring Rate

The ITU estimates that there were approximately 689 million land lines at the beginning of 1995, and a little over 1 billion by the end of 2000, the last year for which they have figures available. This is an average annual growth rate of just over 7%, and a cumulative improvement in that period of over 50%, meaning that the first two-thirds of the world’s phone lines were run between 1876 and 1994, and the remaining third were run between 1995 and 2000. Put another way, half again as many land lines were run in the last 6 years of the 20th century as were run in the whole previous history of telephony. So much for stasis. 
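
To see where the 7% figure comes from, treat the period as six elapsed years and solve for the compound annual growth rate. A minimal sketch in Python, using round numbers close to the ITU figures above; the end-of-2000 total is a rounding of “a little over 1 billion”, not an exact ITU figure:

    # Back-of-the-envelope compound annual growth rate (CAGR) for land lines.
    start = 689_000_000    # land lines at the beginning of 1995 (ITU estimate)
    end = 1_050_000_000    # "a little over 1 billion" at the end of 2000 (rounded assumption)
    years = 6              # elapsed years

    cagr = (end / start) ** (1 / years) - 1
    cumulative = end / start - 1

    print(f"average annual growth: {cagr:.1%}")    # roughly 7%
    print(f"cumulative growth: {cumulative:.0%}")  # roughly 50%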

Of course not all of this growth touches the problem at hand — a new phone line in a teenager’s room may increase several sorts of telecom statistics, but the number of people making their first phone call isn’t one of them. Since The Phrase concerns the digital divide, we should concentrate on telecom growth in the less developed world. 

From the beginning of 1995 to the end of 2000, 8 countries achieved compound average growth rates of 25% or more for land lines per 100 people (against a world average of 7%), meaning they at least tripled the number of land lines over the whole period. They were, in order of rate, Sudan (which improved six-fold), Albania, China, Sri Lanka, Viet Nam, Ghana, Nepal, and Cambodia — not exactly the G8. China alone went from 41 million land lines to 179 million in those 6 years. And there were 35 additional countries, including India, Indonesia, and Brazil, with annual growth of between 10 and 20% from 1995 to 2000, meaning they at least doubled the number of land lines in that period.

And mobile telephony makes the change in land lines look tectonic. In 1995, there were roughly 91 million cellular subscribers. By 2000, the number had risen to 946 million, a ten-fold increase. Twenty-seven countries had growth rates of over 100% annually, meaning that, at a minimum, they doubled and doubled again, 6 times, achieving better than sixty-fold cumulative growth (not 60%, but a factor of 60), and 22 of those had better than hundred-fold growth. Senegal went from around 100 subscribers (not 100 thousand subscribers, 100 subscribers) to 390 thousand. Egypt went from 7 thousand to almost 3 million. Romania went from 9 thousand to almost 4 million. An additional 44 nations with no measurable wireless penetration in 1995 had acquired wireless subscribers by 2001.
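
Converting these annual rates into cumulative multiples is simple compounding over the six elapsed years; the rates below are the ones cited above, and only the arithmetic is added here:

    # Compound an annual growth rate over the six elapsed years, 1995-2000.
    def cumulative_multiple(annual_rate, years=6):
        return (1 + annual_rate) ** years

    print(cumulative_multiple(0.07))  # world average ~7%/yr -> ~1.5x, "half again as many"
    print(cumulative_multiple(0.25))  # 25%/yr               -> ~3.8x, "at least tripled"
    print(cumulative_multiple(1.00))  # 100%/yr              -> 64x, "doubled and doubled again, 6 times"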

Because wireless infrastructure does not require painstaking building-by-building connectivity, nor is it as hampered by state-owned monopolies, it offers a way for a country to increase its telephone penetration extremely quickly. By the end of 2000, there were 25 countries where cell phone users made up between two-thirds and nine-tenths of the connected populace. In these countries, none of them wealthy, a new telecommunications infrastructure was deployed from scratch, during the same years that keynote speakers and commencement invitees were busily and erroneously informing their listeners that half the world had never made a phone call.

Two Answers

So, in 2002, what can we conclude about the percentage of the world that has made a phone call?

The first, and less important answer to that question goes like this: Between 1995 and 2000, the world’s population rose by about 8%. Meanwhile, the number of land lines rose by 50%, and the number of cellular subscribers by over 1000%. Contrary to the hopelessness conveyed by The Phrase, telephone penetration is growing much faster than population. It is also growing faster in the developing world than in the developed world. Outside the OECD, growth was about 130% for land lines and over 2,300% for cellular phones — 14 million subscribers at the beginning of 1995 and 342 million by the end of 2000. If we assume that LeVert’s original guess of half was right in 1994 (a big if), the new figure would be “Around two-thirds and still rising.”

There is another answer to that question though, which is much more important: It doesn’t matter. No snapshot of telephone penetration matters, because the issue is not amount but rate. If you care about the digital divide, and you believe that access to communications can help poor countries to grow, then pontificating about who has or hasn’t made a phone call is worse than a waste of time: it actively distorts your view of the possible solutions because it emphasizes a stasist attitude.

Though a one-time improvement of 5% is better in the short run than a change that improves annual growth by 1%, the latter solution is better in the medium run, and much better in the long run. As everything from investment theory to Moore’s Law has shown, it’s hard to beat compound growth, and improving compound growth in the spread of telephones requires reducing the barriers between demand and supply. Some countries have had enormously favorable rates of growth in the last decade, so we should ask what has gone right in those countries. It turns out that money is less important than you might expect.
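
To make that tradeoff concrete, here is a small illustration with invented numbers: a hypothetical country starting from an index of 100 lines and growing at 7% a year. A one-time 5% boost leads for the first few years, but an extra point of annual growth overtakes it and then pulls steadily away.

    # Hypothetical comparison: a one-time 5% improvement vs. raising annual
    # growth from 7% to 8%. The starting index and rates are illustrative only.
    def lines(years, annual_rate, one_time_boost=0.0):
        return 100 * (1 + one_time_boost) * (1 + annual_rate) ** years

    for years in (1, 5, 10, 20):
        print(years, round(lines(years, 0.07, one_time_boost=0.05)), round(lines(years, 0.08)))
    # year  one-time boost  faster growth
    #  1        112            108
    #  5        147            147
    # 10        207            216
    # 20        406            466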

Examples from the Real World

In 1995, Brunei had almost twice as many land lines and 50 times as many cell phones per capita as Poland, not to mention more than twice the per capita GDP. By the end of 2000, Poland exceeded Brunei in land lines, and had even matched it in cell phone penetration. Brunei is smaller, easier to wire, and much much richer, but Poland has something that all Brunei’s money can’t buy — a dynamic economy. Ireland similarly outgrew Denmark and Egypt outgrew Kuwait, even though the faster growing countries were the poorer ones.

The Democratic Republic of Congo lost 16 thousand of its 36 thousand fixed lines but grew overall, because it added 140 thousand cell phones. It’s unsurprising that people would abandon the state as a provider of telephones in a country riven by civil war. What is surprising is that Venezuela had the same pattern. Venezuela, with a monopoly telecom provider, saw its per capita penetration of land lines fall by 0.3% annually, while cell phone use exploded. In both cases, the state was an obstacle to telephone usage, but the presence of a private alternative meant that telephone penetration could nevertheless increase.

Bangladesh, with a per capita GDP of $1,570, has had annual cellular growth of nearly 150%, in part because of programs like the Grameen Bank’s Village Phone, which loans a phone, collateral free, to women in Bangladeshi villages, who in turn resell the service to their neighbors. This is double leverage, as it not only increases the number of phones in use, but also increases the number of users per phone.

These examples demonstrate what makes The Phrase so pernicious. Something incredibly good is happening in parts of the world with dynamic economies, and that is what people concerned with the digital divide should be thinking about. If the world’s poor are to be served by better telecommunications infrastructure, there are obvious things to be done. Make sure individuals have access to a market for telephone service. Privatize state telecom companies, introduce competition, and reduce corruption. And perhaps most importantly, help stamp out static thinking about telecommunications wherever it appears. Economic dynamism is a far better tool for improving telephone use than any amount of erroneous and incomplete assertions on behalf of half the world’s population, because while The Phrase has remained static for the last decade or so, the world hasn’t.


Version History: This is version 1.03 of this essay, from September 3, 2002. Version 1.0 was published June 30, 2002, 1.01 appeared on July 3, and 1.02 appeared July 19th.

Changes from 1.0 to 1.01: I realized after publishing the original essay that the ITU statistics run from January 1 to January 1, meaning that the 1995-2001 period is 6 elapsed years, not 7. To make this clear, I re-wrote the sections “Restoring Rate”, “Two Answers”, and the first paragraph of “Lessons from the Real World” to refer to “the end of 2000” or “2000” as the stop date for the growth figures, and to list the elapsed time as 6 years.

This means that the changes described here happened in less time, i.e. at a faster rate, than I originally suggested. The only place it materially affects the numerical conclusions is in the 30 countries which had in excess of 100% annual growth. Only 26 of these countries had 100-fold growth (i.e. a compound average growth rate of above 116%). The remaining 4 grew between 64 and 100-fold.

Changes from 1.01 to 1.02: Re-wrote the second paragraph. I had incorrectly located Kofi Annan’s speech at the ITU, rather than at the Australian Press Club (though he has said it in several other venues as well, including his Millennium Report, and the World Economic Forum in Davos in 2001.) Fiorina’s use of The Phrase dates from 2001, not 1999. I also added the references to Steve Case and Melinda Gates. The paragraph in its original form can be found at the link to version 1.0, above.

Changes from 1.02 to 1.03: Noted that the 8 countries listed as having the highest percentage of wired telecom growth, in the paragraph beginning “From the beginning of 1995 to the end of 2000…”, all had growth rates in excess of 25%, not merely 20%. (Thanks to Jennifer Weaver of Wired, who edited a version of this article to appear there.)

Wired telecom penetration in the developing world in 2001 was 230% of what it was in 1995, meaning it grew 130%. Changed the 230% figure to 130% in the paragraph beginning “The first, and less important answer to that question…” to reflect the growth rate, rather than absolute penetration. Likewise changed the 2,400% figure for wireless to 2,300%, for the same reason.

Changed the wireless figures, in the paragraph beginning “And mobile telephony makes the change in land lines…” to include the 44 countries that created divide-by-zero errors when using the ITU statistics, because their measurable wireless penetration was 0 in 1995.


NOTES: The statistics on teledensity used here are drawn from the International Telecommunication Union (www.itu.int/). The statistics page is at www.itu.int/ITU-D/ict/statistics/, and the documents concerning compound and overall growth in main telephone lines and cellular subscribers between 1995 and 2001 are at www.itu.int/ITU-D/ict/statistics/at_glance/main01.pdf and www.itu.int/ITU-D/ict/statistics/at_glance/cellular01.pdf respectively.

The estimates for the developing world were derived by treating the membership of the Organization for Economic Co-operation and Development (OECD, www.oecd.org) as a proxy for the developed world. Growth in the developing world was then derived by recalculating the totals from the ITU documents after removing the 30 countries in the OECD.

The figure for population growth was derived from the US Census Bureau’s estimates of world population in 1995 and 2001. www.census.gov/ipc/www/worldpop.html

The original Toronto Sun piece (from the Business section of October 13th, 1994) read:

So you think the world is wired? Half the world's population — an astounding three billion people -- has never made a phone call, a telecommunications conference was told Wednesday.
  
"Most people on Earth live more than two hours from a telephone," Greg LeVert, president of U.S. giant MCI's Integrated Client Services Division, told delegates to TeleCon '94 in Toronto.
  
Things are changing fast, though.
  
"Nearly a billion more people will have access to a telephone by 2000," LeVert said.

Domain Names: Memorable, Global, Non-political?

Everyone understands that something happened to the domain name system in the mid-90s to turn it into a political minefield, with domain name squatters and trademark lawsuits and all the rest of it. It’s tempting to believe that if we could identify that something and reverse it, we could return to the relatively placid days prior to ICANN.

Unfortunately, what made domain names contentious was simply that the internet became important, and there’s no putting the genie back in that bottle. The legal issues involved actually predate not only ICANN but the DNS itself, going back to the mid-70s and the earliest decision to create memorable aliases for unmemorable IP addresses. Once the original host name system was in place — IBM.com instead of 129.42.18.99 — the system was potentially subject to trademark litigation. The legal issues were thus implicit in the DNS from the day it launched; it just took a decade or so for anyone to care enough to hire a lawyer.

There is no easy way to undo this. The fact that ICANN is a political body is not their fault (though the kind of political institution it has become is their fault.) Memorable names create trademark issues. Global namespace requires global oversight. Names that are both memorable and globally unique will therefore require global political oversight. As long as we want names we can remember, and which work unambiguously anywhere in the world, someone somewhere will have to handle the issues that ICANN currently handles.

Safety in Numbers

One reaction to the inevitable legal trouble with memorable names is simply to do away with memorable names. In this scenario, ICANN would only be responsible for assigning handles, unique IDs devoid of any real meaning. (The most articulate of these proposals is Bob Frankston’s “Safe Haven” approach.) [http://www.frankston.com/public/essays/DNSSafeHaven.asp]

In practice, this would mean giving a web site a meaningless but unique numerical address. Like a domain name today, it would be globally unambiguous, but unlike today’s domain names, such an address would not be memorable, as people are bad at remembering numbers, and terrible at remembering long numbers.

Though this is a good way to produce URLs free from trademark, we don’t need a new domain to do this. Anyone can register unmemorable numeric URLs today — whois says 294753904578.com, for example, is currently available. Since this is already possible, such a system wouldn’t free us from trademark issues, because whenever systems with numerical addresses grow popular (e.g. Compuserve or ICQ), users demand memorable aliases, to avoid dealing with horrible addresses like 71234.5671@compuserve.com. Likewise, the DNS was designed to manage memorable names, not merely unique handles, and creating a set of non-memorable handles simply moves the issue of memorable names to a different part of the system. It doesn’t make the issue go away.

Embrace Ambiguity

Another set of proposals would do away with the globally unique aspect of domain names. Instead of awarding a single firm the coveted ACME.com address, a search for ACME would yield several different matches, which the user would then pick from. This is analogous to a Google search on ACME, but one where none of the matches had a memorable address of their own.

The ambiguity in such a system would make it impossible to automate business-to-business connections using the names of the businesses themselves. These addresses would also fail the ‘side of the bus’ test, where a user seeing a simple address like IBM.com on a bus or a business card (or hearing it over the phone or the radio) could go to a browser and type it in. Instead, there would be a market for third parties that resolve name-to-address mappings.

The rise of peer-to-peer networks has given us a test-bed for market-allocated namespaces, and the news isn’t good. Despite the obvious value in having a single interoperable system for instant messaging, to take one example, we don’t have interoperability because AOL is (unsurprisingly) unwilling to abandon the value in owning the majority of those addresses. The winner in a post-DNS market would potentially have even more control and less accountability than ICANN does today.

Names as a Public Good

The two best theories of network value we have — Metcalfe’s law for point-to-point networks and Reed’s law for group-forming networks — both rely on optionality, the possibility of actually creating any of the untold potential connections that might exist on large networks. Valuable networks allow nodes to connect to one another without significant transaction costs.
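
The contrast between the two laws is easy to make concrete. In their usual formulations, Metcalfe’s law counts potential pairwise connections and Reed’s law counts potential sub-groups; a short sketch of the two formulas:

    # Potential connections/groups in a network of n nodes, in the usual formulations.
    def metcalfe(n):            # pairwise connections: n(n-1)/2
        return n * (n - 1) // 2

    def reed(n):                # sub-groups of two or more members: 2^n - n - 1
        return 2 ** n - n - 1

    for n in (10, 20, 30):
        print(n, metcalfe(n), reed(n))
    # n=10:   45 pairs,             1,013 possible groups
    # n=20:  190 pairs,         1,048,555 possible groups
    # n=30:  435 pairs,     1,073,741,793 possible groups

Both curves describe potential value only, which is why optionality, the real possibility of forming those connections and groups, matters so much.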

Otherwise identical networks will thus have very different values for their users, depending on how easy or hard it is to form connections. In this theory, the worst damage spam does is not in wasting individual users’ time, but in making users skeptical of all mail from unknown sources, thus greatly reducing the possibility of unlikely connections. (What if you got real mail from Nigeria?)

Likewise, a system that provides a global namespace, managed as a public good, will create enormous value in a network, because it will lower the transaction costs of establishing a connection or group globally. It will also aid innovation by allowing new applications to bootstrap into an existing namespace without needing explicit coordination or permission. Despite its flaws, and despite ICANN’s deteriorating stewardship, this is what the DNS currently does.

Names Are Inevitable

We make sense of the world by naming things. Faced with any sort of numerical complexity, humans require tools for oversimplifying, and names are one of the best oversimplifications we have. We have only recently created systems that require global namespaces (ship registries, telephone numbers), so we’re not very good at it yet. In most of those cases, we have used existing national entities to guarantee uniqueness — we get globally unique phone numbers if we have nationally unique phone numbers and globally unique country codes.

The DNS, and the internet itself, have broken this ‘National Partition’ solution because they derive so much of their value from being so effortlessly global. There are still serious technical issues with the DNS, such as the need for domain names in non-English character sets, as well as serious political issues, like the need for hundreds if not thousands of new top-level domains. However, it would be hard to overstate the value created by memorable and globally unique domain names, names that are accessible to any application without requiring advance coordination, and which lower the transaction costs for making connections.

There are no pure engineering solutions here, because this is not a pure engineering problem. Human interest in names is a deeply wired characteristic, and it creates political and legal issues because names are genuinely important. In the 4 years since its founding, ICANN has moved from being merely unaccountable to being actively anti-democratic, but as reforming or replacing ICANN becomes an urgent problem, we need to face the dilemma implicit in namespaces generally: Memorable, Global, Non-political — pick two.

Communities, Audiences, and Scale

April 6, 2002

Prior to the internet, the difference in communication between community and audience was largely enforced by media — telephones were good for one-to-one conversations but bad for reaching large numbers quickly, while TV had the inverse set of characteristics. The internet bridged that divide, by providing a single medium that could be used to address either communities or audiences. Email can be used for conversations or broadcast, usenet newsgroups can support either group conversation or the broadcast of common documents, and so on. Most recently, the rise of software for “The Writable Web”, principally weblogs, is adding two-way features to the Web’s largely one-way publishing model.

With such software, the obvious question is “Can we get the best of both worlds? Can we have a medium that spreads messages to a large audience, but also allows all the members of that audience to engage with one another like a single community?” The answer seems to be “No.”

Communities are different than audiences in fundamental human ways, not merely technological ones. You cannot simply transform an audience into a community with technology, because they assume very different relationships between the sender and receiver of messages.

Though both are held together in some way by communication, an audience is typified by a one-way relationship between sender and receiver, and by the disconnection of its members from one another — a one-to-many pattern. In a community, by contrast, people typically send and receive messages, and the members of a community are connected to one another, not just to some central outlet — a many-to-many pattern [1]. The extreme positions for the two patterns might be visualized as a broadcast star where all the interaction is one-way from center to edge, vs. a ring where everyone is directly connected to everyone else without requiring a central hub.

As a result of these differences, communities have strong upper limits on size, while audiences can grow arbitrarily large. Put another way, the larger a group held together by communication grows, the more it must become like an audience — largely disconnected and held together by communication traveling from center to edge — because increasing the number of people in a group weakens communal connection. 

The characteristics we associate with mass media are as much a product of the mass as the media. Because growth in group size alone is enough to turn a community into an audience, social software, no matter what its design, will never be able to create a group that is both large and densely interconnected. 

Community Topology

This barrier to the growth of a single community is caused by the collision of social limits with the math of large groups: As group size grows, the number of connections required between people in the group exceeds human capacity to make or keep track of them all.

A community’s members are interconnected, and a community in its extreme position is a “complete” network, where every connection that can be made is made. (Bob knows Carol, Ted, and Alice; Carol knows Bob, Ted, and Alice; and so on.) Dense interconnection is obviously the source of a community’s value, but it also increases the effort that must be expended as the group grows. You can’t join a community without entering into some sort of mutual relationship with at least some of its members, but because more members requires more connections, these coordination costs increase with group size.

For a new member to connect to an existing group in a complete fashion requires as many new connections as there are group members, so joining a community that has 5 members is much simpler than joining a community that has 50 members. Furthermore, this tradeoff between size and the ease of adding new members exists even if the group is not completely interconnected; maintaining any given density of connectedness becomes much harder as group size grows. Each new member either creates more effort or lowers the density of connectedness, or both, thus jeopardizing the interconnection that makes for community. [2]

As group size grows past any individual’s ability to maintain connections to all members of a group, the density shrinks, and as the group grows very large (>10,000) the number of actual connections drops to less than 1% of the potential connections, even if each member of the group knows dozens of other members. Thus growth in size is enough to alter the fabric of connection that makes a community work. (Anyone who has seen a discussion group or mailing list grow quickly is familiar with this phenomenon.)
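
The arithmetic behind that figure is easy to check. Assuming, for illustration, that each member of a 10,000-person group maintains ties to about 50 others:

    # Density of actual vs. potential connections in a group of 10,000,
    # assuming each member maintains ties to roughly 50 others ("dozens").
    n = 10_000
    ties_per_member = 50
    actual = n * ties_per_member // 2    # each tie is shared by two people
    potential = n * (n - 1) // 2         # every possible pairing
    print(actual / potential)            # ~0.005, i.e. about half of one percent

Even with every member knowing dozens of others, the group realizes well under 1% of its potential connections.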

An audience, by contrast, has a very sparse set of connections, and requires no mutuality between members. Thus an audience has no coordination costs associated with growth, because each new member of an audience creates only a single one-way connection. You need to know Yahoo’s address to join the Yahoo audience, but neither Yahoo nor any of its other users need to know anything about you. It is this disconnected quality that makes it possible for an audience to grow much (much) larger than a connected community can, because an audience can always exist at the minimum number of required connections (N connections for N users).

The Emergence of Audiences in Two-way Media

Prior to the internet, the outbound quality of mass media could be ascribed to technical limits — TV had a one-way relationship to its audience because TV was a one-way medium. The growth of two-way media, however, shows that the audience pattern re-establishes itself in one way or another — large mailing lists become read-only, online communities (e.g. LambdaMOO, WELL, ECHO) eventually see their members agitate to stem the tide of newcomers, users of sites like slashdot see fewer of their posts accepted. [3]

If real group engagement is limited to groups numbering in the hundreds or even the thousands [4], then the asymmetry and disconnection that characterizes an audience will automatically appear as a group of people grows in size, as many-to-many becomes few-to-many and most of the communication passes from center to edge, not edge to center or edge to edge. Furthermore, the larger the group, the more significant this asymmetry and disconnection will become: any mailing list or weblog with 10,000 readers will be very sparsely connected, no matter how it is organized. (This sparse organization of the larger group can of course encompass smaller, more densely clustered communities.)

More Is Different

Meanwhile, there are 500 million people on the net, and the population is still growing. Anyone who wants to reach even ten thousand of those people will not know most of them, nor will most of them know one another. The community model is good for spreading messages through a relatively small and tight knit group, but bad for reaching a large and dispersed group, because the tradeoff between size and connectedness dampens message spread well below the numbers that can be addressed as an audience.

It’s significant that the only two examples we have of truly massive community spread of messages on the internet — email hoaxes and Outlook viruses — rely on overriding users’ disinclination to forward messages widely, by either a social or a technological trick. When something like All Your Base or OddTodd bursts on the scene, the moment of its arrival comes not when it spreads laterally from community to community, but when that lateral spread attracts the attention of a media outlet [5].

No matter what the technology, large groups are different than small groups, because they create social pressures against community organization that can’t be trivially overcome. This is a pattern we have seen often, with mailing lists, BBSes, MUDs, usenet, and most recently with weblogs, the majority of which reach small and tightly knit groups, while a handful reach audiences numbering in the tens or even hundreds of thousands (e.g. andrewsullivan.com.)

The inability of a single engaged community to grow past a certain size, irrespective of the technology, will mean that over time, barriers to community scale will cause a separation between media outlets that embrace the community model and stay small, and those that adopt the publishing model in order to accommodate growth. This is not to say that all media that address ten thousand or more people at once are identical; having a Letters to the Editor column changes a newspaper’s relationship to its audience, even though most readers never write, most letters don’t get published, and most readers don’t read every letter.

Though it is tempting to think that we can somehow do away with the effects of mass media with new technology, the difficulty of reaching millions or even tens of thousands of people one community at a time is as much about human wiring as it is about network wiring. No matter how community minded a media outlet is, needing to reach a large group of people creates asymmetry and disconnection among that group — turns them into an audience, in other words — and there is no easy technological fix for that problem. 

Like the leavening effects of Letters to the Editor, one of the design challenges for social software is in allowing groups to grow past the limitations of a single, densely interconnected community while preserving some possibility of shared purpose or participation, even though most members of that group will never actually interact with one another.


Footnotes

1. Defining community as a communicating group risks circularity by ignoring other, more passive uses of the term, as with “the community of retirees.” Though there are several valid definitions of community that point to shared but latent characteristics, there is really no other word that describes a group of people actively engaged in some shared conversation or task, and infelicitous turns of phrase like ‘engaged communicative group’ are more narrowly accurate, but fail to capture the communal feeling that arises out of such engagement. For this analysis, ‘community’ is used as a term of art to refer to groups whose members actively communicate with one another. [Return]

2. The total number of possible connections in a group grows quadratically, because each member of a group must connect to every other member but themselves. In general, therefore, a group with N members has N x (N-1) connections, which is the same as N² – N. If Carol and Ted knowing one another count as a single relationship, there are half as many relationships as connections, so the relevant number is (N² – N)/2.

Because these numbers grow quadratically, every 10-fold increase in group size creates a 100-fold increase in possible connections; a group of ten has about a hundred possible connections (and half as many two-way relationships), a group of a hundred has about ten thousand connections, a thousand has about a million, and so on. The number of potential connections in a group passes a billion as group size grows past thirty thousand. [Return]

3. Slashdot is suffering from one of the common effects of community growth — the uprising of users objecting to the control the editors exert over the site. Much of the commentary on this issue, both at slashdot and on similar sites such as kuro5hin, revolves around twin themes: an understanding that the owners and operators of slashdot can do whatever they like with the site, coupled with a surprisingly emotional sense of betrayal that community control, in the form of moderation, is being overridden by the editors.

(More at kuro5hin and slashdot.) [Return]

4. In Grooming, Gossip, and the Evolution of Language (ISBN 0674363361), the primatologist Robin Dunbar argues that humans are adapted for social group sizes of around 150 or less, a size that shows up in a number of traditional societies, as well as in present day groups such as the Hutterite religious communities. Dunbar argues that the human brain is optimized for keeping track of social relationships in groups smaller than 150, but not larger. [Return]

5. In The Tipping Point (ISBN 0316346624), Malcolm Gladwell detailed the surprising spread of Hush Puppies shoes in the mid-’90s, from their adoption by a group of cool kids in the East Village to a national phenomenon. The breakout moment came when Hush Puppies were adopted by fashion designers, with one designer going so far as to place a 25 foot inflatable Hush Puppy mascot on the roof of his boutique in LA. The cool kids got the attention of the fashion designers, but it was the fashion designers who got the attention of the world, by taking Hush Puppies beyond the communities in which they started and spreading them outwards to an audience that looked to the designers. [Return]

The Java Renaissance

06/12/2001

Java, the programming language created by Sun Microsystems to run on any operating system, was supposed to make it possible to write programs anywhere and to post them online for PC users to download and run instantly. Java was supposed to mean computer users wouldn’t have to choose between the Macintosh and Microsoft version of a program, and upgrading would be as simple as a mouse click. The idea, called “write once, run anywhere,” was a promise Java has not lived up to.

Java never ran as smoothly on PCs as Microsoft-haters hoped. Buggy versions of the Java engine in Netscape and Microsoft’s Internet Explorer, the difficulty of writing a good user interface in Java, and Microsoft’s efforts to deflect the threat of platform-independent software all contributed. Consequently, only a limited number of PC programs were written in Java. The current wisdom: Java is a great language for application and database servers, where it’s terrific at integrating functions across several different computers, but it’s dead on the desktop.

Which makes the current renaissance of Java programming for the PC all the more surprising.

A number of peer-to-peer companies, such as Roku Technologies (file synching and sharing), Parabon Computation (distributed computing), and OpenCola (content searching, bandwidth optimization), are writing applications in Java. These are young companies, and it is not clear whether they will be able to overcome either Java’s earlier limitations on the PC or Microsoft’s inevitable resistance. But their willingness to try tells us much about software engineering, and about the PC’s place in computing ecology.

The most obvious explanation for this renaissance is the growing quality of Java itself. Sun made a big bet on Java and stuck with it even when Java failed to live up to its advance billing. The current implementation, Java 1.3, is a huge step in maturity for the language, and third parties such as IBM are making Java faster and more reliable.

This is not to say that all of Java’s weaknesses have been overcome. Writing an interface in Java is still a wretched experience. Many programmers simply bypass Java and write interfaces in HTML, a maneuver that allows them to change the interface without altering the underlying engineering.

Java is mainly returning to the PC, though, because the PC itself is becoming a server. The companies coding in Java are all creating distributed, network-aware applications, and Java’s value as a server language makes it an obvious choice for the PC’s new role. Java is unparalleled as a language for distributed applications because it was built around Internet protocols, rather than bolting them on, and is more secure than firewalls alone when a networked machine needs to access remote resources or “share” resources remotely.

For all its problems, Java is still the leader in cross-device interoperability, running on everything from servers to cell phones and set-tops. If a programmer wants to write code to run on multiple devices, the only other choice on the horizon is Microsoft’s promised .NET architecture, which is still a long way off.

It’s too early to handicap the success of Java for PC-as-server applications. Microsoft could stop distributing Java with Internet Explorer, cross-device code may turn out to be less important than cross-device data formats, and the improvements in Java’s speed and stability may not be enough to please users.

Nevertheless, the return of Java is more evidence that the difference between client and server is increasingly blurry. You can get server-class hardware under your desk for $1,000 and high-speed access for 50 bucks a month, and as Napster and Seti@home have shown, users will eagerly sign up for services that put those capabilities to use.

Furthermore, all applications are now network applications; Microsoft is even rewriting Office to be network aware via the .NET initiative. In this environment, anyone who can offer ways to write distributed applications that can operate over the network while remaining secure will earn the respect of the developer community.

It’s not clear whether Java will finally fulfill its promise. But its surprising return to the PC shows that developers are hungry for a language that helps them deal with the opportunities and problems the Internet is creating. For all its faults, Java is still the best attempt at creating a cross-platform framework, and the success or failure of these young companies will tell us a lot about the future of software in our increasingly networked world.

Enter the Decentralized Zone

Digital security is a trade-off. If securing digital data were the only concern a business had, users would have no control over their own computing environment at all: the Web would be forbidden territory; every disk drive would be welded shut. That doesn’t happen, of course, because workers also need the flexibility to communicate with one another and with the outside world. The current compromise between security and flexibility is a sort of intranet-plus-firewall sandbox, where the IT department sets the security policies that workers live within. This allows workers a measure of freedom and flexibility while giving their companies heightened security. That was the idea, anyway.

In practice, the sandbox model is broken. Some of the problem is technological, of course, but most of the problem is human. The model is broken because the IT department isn’t rewarded for helping workers do new things, but for keeping existing things from breaking. Workers who want to do new things are slowly taking control of networking, and this movement toward decentralized control cannot be reversed.

The most obvious evidence of the gap between the workers’ view of the world and the IT department’s is in the proliferation of email viruses. When faced with the I Love You virus and its cousins, the information technology department lectures users against opening attachments. Making such an absurd suggestion only underlines how out of touch the IT group is: If you’re not going to open attachments, you may as well not show up for work. Email viruses are plaguing the workplace because users must open attachments to get their jobs done; the IT department has not given them another way to exchange files. For all the talk of intranets and extranets, the only simple, general-purpose tool for moving files between users, especially users outside the corporation, is email. Faced with an IT department that thinks not opening attachments is a reasonable option, end users have done the only sensible thing: ignore the IT department.

Email was just the beginning. The Web has created an ever-widening hole in the sandbox. Once firewalls were opened up to the Web, other kinds of services like streaming media began arriving through the same hole, called port 80. Now that workers have won access to the Web through port 80, it has become the front door to a whole host of services, including file sharing.

And now there’s ICQ. At least the IT folks knew the Web was coming; in many cases, they even installed the browsers themselves. ICQ (and its instant messaging brethren) is something else entirely: the first widely adopted piece of business software that no CTO evaluated and no administrator installed. Any worker who would ever have gone to the boss and asked for something that allowed them to trade real-time messages with anyone on the Net would have been turned down flat. So they didn’t ask, they just did it, and now it can’t be undone. Shutting off instant messaging is not an option.

The flood is coming. And those three holes (email for file transfer, port 80 drilled through the firewall, and business applications that workers can download and install themselves) are still only cracks in the dike. The real flood is coming, with companies such as Groove Networks, Roku Technologies, and Aimster lining up to offer workers groupware solutions that don’t require centralized servers, and don’t make users ask the IT department for either help or permission to set them up.
The IT workers of any organization larger than 50 people are now in an impossible situation: They are rewarded for negative events (no crashes, no breaches) even as workers are inexorably eroding their ability to build or manage a corporate sandbox. The obvious parallel here is with the PC itself; 20 years ago, the mainframe guys laughed at the toy computers workers were bringing into the workplace, because they knew that computation was too complex to be handled by anyone other than a centralized group of trained professionals. Today, we take it for granted that workers can manage their own computers. But we still regard network access and configuration as something that needs to be centrally managed by trained professionals, even as workers take network configuration under their control.

There is no one right answer: digital security is a trade-off. But no solution that requires centralized control over what network users do will succeed. It’s too early to know what the new compromise between security and flexibility will look like, but it’s not too early to know that the old compromise is over.

Hailstorm: Open Web Services Controlled by Microsoft

First published on O’Reilly’s OpenP2P on May 30, 2001.

So many ideas and so many technologies are swirling around P2P — decentralization, distributed computing, web services, JXTA, UDDI, SOAP — that it’s getting hard to tell whether something is or isn’t P2P, and it’s unclear that there is much point in trying to do so just for the sake of a label.

What there is some point in doing is evaluating new technologies to see how they fit in or depart from the traditional client-server model of computing, especially as exemplified in recent years by the browser-and-web-server model. In this category, Microsoft’s HailStorm is an audacious, if presently ill-defined, entrant. Rather than subject HailStorm to some sort of P2P litmus test, it is more illuminating to examine where it embraces the centralization of the client-server model and where it departs by decentralizing functions to devices at the network’s edge.

The design and implementation of HailStorm is still in flux, but the tension that exists within HailStorm between centralization and decentralization is already quite vivid.

Background

HailStorm, which launched in March with a public announcement and a white paper, is Microsoft’s bid to put some meat on the bones of its .NET initiative. It is a set of Web services whose data is contained in a set of XML documents, and which is accessed from the various clients (or “HailStorm endpoints”) via SOAP (Simple Object Access Protocol). These services are organized around user identity, and will include standard functions such as myAddress (electronic and geographic address for an identity); myProfile (name, nickname, special dates, picture); myCalendar; myWallet; and so on.
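
To give a feel for what “accessed via SOAP” means in practice, here is a minimal, hypothetical request in Python. The envelope structure and the SOAPAction header are standard SOAP 1.1; the service name, element names, namespace, and endpoint URL are invented for illustration and are not Microsoft’s actual HailStorm schema:

    # A hypothetical SOAP 1.1 call to a profile-style web service.
    # Everything service-specific here is illustrative, not HailStorm's real schema.
    import urllib.request

    envelope = """<?xml version="1.0"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
      <soap:Body>
        <getProfile xmlns="urn:example:myProfile">
          <nickname>clay</nickname>
        </getProfile>
      </soap:Body>
    </soap:Envelope>"""

    request = urllib.request.Request(
        "https://example.com/myProfile",          # placeholder endpoint
        data=envelope.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8",
                 "SOAPAction": "urn:example:myProfile#getProfile"},
    )
    # response = urllib.request.urlopen(request)  # would return a SOAP response document

The point is simply that any client able to send and parse XML over HTTP can, in principle, talk to such a service.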

HailStorm can best be thought of as an attempt to re-visit the original MS-DOS strategy: Microsoft writes and owns the basic framework, and third-party developers write applications to run on top of that framework.

Three critical things differentiate the networked version of this strategy, as exemplified by HailStorm, from the earlier MS-DOS strategy:

  • First, the Internet has gone mainstream. This means that Microsoft can exploit both looser and tighter coupling within HailStorm — looser in that applications can have different parts existing on different clients and servers anywhere in the world; tighter because all software can phone home to Microsoft to authenticate users and transactions in real time.
  • Second, Microsoft has come to the conclusion that its monopoly on PC operating systems is not going to be quickly transferable to other kinds of devices (such as PDAs and servers); for the next few years at least, any truly ubiquitous software will have to run on non-MS devices. This conclusion is reflected in HailStorm’s embrace of SOAP and XML, allowing HailStorm to be accessed from any minimally connected device.
  • Third, the world has shifted from “software as product” to “software as service,” where software can be accessed remotely and paid for in per-use or per-time-period licenses. HailStorm asks both developers and users to pay for access to HailStorm, though the nature and size of these fees are far from worked out.

Authentication-Centric

The key to shifting from a machine-centric application model to a distributed computing model is to shift the central unit away from the computer and towards the user. In a machine-centric system, the software license was the core attribute — a software license meant a certain piece of software could be legally run on a certain machine. Without such a license, that software could not be installed or run, or could only be installed and run illegally.

In a distributed model, it is the user and not the hardware that needs to be validated, so user authentication becomes the core attribute — not “Is this software licensed to run on this machine?” but “Is this software licensed to run for this user?” To accomplish this requires a system that first validates users, and then maintains a list of attributes in order to determine what they are and are not allowed to do within the system.
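A toy sketch of that shift, with an invented user store standing in for whatever attribute service a real system would use: validate who the user is first, then consult their attributes to decide what they may do.

    # Toy sketch of an authentication-centric check: not "is this software
    # licensed for this machine?" but "is this user allowed to do this?".
    # The user store and attribute names are invented; a real system would
    # never keep plaintext passwords.
    USERS = {
        "alice@example.com": {"password": "hunter2", "entitlements": {"calendar.read", "wallet.read"}},
        "bob@example.com": {"password": "swordfish", "entitlements": {"calendar.read"}},
    }

    def authenticate(user_id: str, password: str) -> bool:
        """Step one: establish who the user is."""
        record = USERS.get(user_id)
        return record is not None and record["password"] == password

    def authorize(user_id: str, action: str) -> bool:
        """Step two: consult the user's attributes to see what they may do."""
        return action in USERS.get(user_id, {}).get("entitlements", set())

    if __name__ == "__main__":
        user = "bob@example.com"
        if authenticate(user, "swordfish"):
            for action in ("calendar.read", "wallet.read"):
                print(action, "allowed" if authorize(user, action) else "denied")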

HailStorm is thus authentication-centric, and is organized around Passport. HailStorm is designed to create a common set of services which can be accessed globally by authenticated users, and to this end it provides common definitions for:

  • Identity
  • Security
  • Definitions and Descriptions

or as Microsoft puts it:

From a technical perspective, HailStorm is based on Microsoft Passport as the basic user credential. The HailStorm architecture defines identity, security, and data models that are common to all HailStorm services and ensure consistency of development and operation.

Decentralization

The decentralized portion of HailStorm is a remarkable departure for Microsoft: they have made accessing HailStorm services on non-Microsoft clients a core part of the proposition. As the white paper puts it:

The HailStorm platform uses an open access model, which means it can be used with any device, application or services, regardless of the underlying platform, operating system, object model, programming language or network provider. All HailStorm services are XML Web services; no Microsoft runtime or tool is required to call them.

To underscore the point at the press conference, they demonstrated HailStorm services running on a Palm, a Macintosh, and a Linux box.

While Microsoft stresses the wide support for HailStorm clients, the relationship of HailStorm to the Web’s servers is less clear. In the presentation, they suggested that servers running non-Microsoft operating systems like Linux or Solaris can nevertheless “participate” in HailStorm, though they didn’t spell out how that participation would be defined.

This decentralization of the client is designed to allow HailStorm applications to spread as quickly as possible. Despite its monopoly in desktop operating systems, Microsoft does not have a majority market share in any of the universe of non-PC devices — PDAs, set-tops, pagers, game consoles, cell phones. This is not to say that it has no notable successes — NT has over a third of the server market, the iPaq running the PocketPC operating system is becoming increasingly popular, and the Xbox has captured the interest of the gaming community. Nevertheless, hardware upgrade cycles are long, so there is no way Microsoft can achieve market dominance in these categories as quickly as it did on the desktop.

Enter HailStorm. HailStorm offers a way for Microsoft to sell software and services on devices that aren’t using Microsoft operating systems. This is a big change — Microsoft typically links its software and operating systems (SQLServer won’t run outside an MS environment; Office is only ported to the Mac). By tying HailStorm to SOAP and XML rather than specific client environments, Microsoft says it is giving up its ability to control (or even predict) what software, running on which kinds of devices, will be accessing HailStorm services.

The embrace of SOAP is particularly significant, as it seems to put HailStorm out of reach of many of Microsoft’s other business battles — vs. Java, vs. Linux, vs. PalmOS, and so on — because, according to Microsoft, any device using SOAP will be able to participate in HailStorm without prejudice: “no Microsoft runtime or tool” will be required. The full effect of this client-insensitivity, however, will be determined by how much Microsoft alters Kerberos or SOAP in ways that limit or prevent other companies from writing HailStorm-compliant applications.

HailStorm is Microsoft’s most serious attempt to date to move from competing on unit sales to selling software as a service, and the announced intention to allow any sort of client to access HailStorm represents a remarkable decentralization for Microsoft.

It is not, however, a total decentralization by any means. In decentralizing their control over the client, Microsoft seeks to gain control over a much larger set of functions, for a much larger group of devices, than they have now. The functions that HailStorm centralizes are in many ways more significant than the functions it decentralizes.

Centralization

In the press surrounding HailStorm, Microsoft refers to its “massively distributed” nature, its “user-centric” model, and even makes reference to its tracking of user presence as “peer-to-peer.” Despite this rhetoric, however, HailStorm as described is a mega-service, and may be the largest client-server installation ever conceived.

Microsoft addressed the requirements for running such a mega-service, saying:

Reliability will be critical to the success of the HailStorm services, and good operations are a core competency required to ensure that reliability. […] Microsoft is also making significant operational investments to provide the level of service and reliability that will be required for HailStorm services. These investments include such things as physically redundant data centers and common best practices across services.

This kind of server installation is necessary for HailStorm, because Microsoft’s ambitions for this service are large: they would like to create the world’s largest address registry, not only of machines but of people as well. In particular, they would like to host the identity of every person on the Internet, and mediate every transaction in the consumer economy. They will fail at such vast goals of course, but succeeding at even a small subset of such large ambitions would be a huge victory.

Because they have decentralized their support of the client, they must necessarily make large parts of HailStorm open, but always with a caveat: while HailStorm is open for developers to use, it is not open for developers to build on or revise. Microsoft calls this an “Open Access” model — you can access it freely, but not alter it freely.

This does not mean that HailStorm cannot be updated or revised by the developer community; it simply means that any changes made to HailStorm must be approved by Microsoft, a procedure they call “Open Process Extensibility.” This process is not defined within the white paper, though it seems to mean revising and validating proposals from HailStorm developers, which is to say, developers who have paid to participate in HailStorm.

With HailStorm, Microsoft is shifting from a strategy of controlling software to controlling transactions. Instead of selling units of licensed software, Hailstorm will allow them to offer services to other developers, even those working on non-Microsoft platforms, while owning the intellectual property which underlies the authentications and transactions, a kind of “describe and defend” strategy.

“Describe and defend” is a move away from “software as unit” to “software as service,” and means that their control of the HailStorm universe will rely less on software licenses and more on patented or copyrighted methods, procedures, and database schema.

While decentralizing client-code, Microsoft centralizes the three core aspects of the service:

  • Identity (using Passport)
  • Security (using Kerberos)
  • Definitions and Descriptions (using HailStorm’s globally standardized schema)

Identity: The goal with Passport is simple — ubiquity. As Bill Gates put it at the press conference: “So it’s our goal to have virtually everybody who uses the Internet to have one of these Passport connections.”

HailStorm provides a set of globally useful services which, because they are authentication-centric, require all users to participate in its Passport program. This allows Microsoft to be a gatekeeper at the level of individual participation — an Internet user without a Passport will not exist within the system, and will not be able to access or use HailStorm services. Because users pay to participate in the HailStorm system, in practice this means that Microsoft will control a user’s identity, leasing it to them for use within HailStorm for a recurring fee.

It’s not clear how open the Passport system will be. Microsoft has a history of launching web initiatives with restrictive conditions, and then dropping the restrictions that limit growth: the original deployment of Passport required users to get a Hotmail account, a restriction that was later dropped when this adversely affected the potential size of the Passport program. You can now get a Passport with any email address, and since an email address is guaranteed to be globally unique, any issuer of email addresses is also issuing potentially valid Passport addresses.

The metaphor of a passport suggests that several different entities agree to both issue and honor passports, as national governments presently do with real passports. There are several entities who have issued email addresses to millions or tens of millions of users — AOL, Yahoo, ATT, British Telecom, et al. Microsoft has not spelled out how or whether these entities will be allowed to participate in HailStorm, but it appears that all issuing and validation of Passports will be centralized under Microsoft’s control.

Security: Authentication of a HailStorm user is provided via Kerberos, a secure method developed at MIT for authenticating a request for a service in a computer network. Last year, Microsoft added its own proprietary extension to Kerberos, which creates potential incompatibilities between clients running non-Microsoft versions of Kerberos and servers running Microsoft’s versions.

Microsoft has published the details of its version of Kerberos, but it is not clear if interoperability with the Microsoft version of Kerberos is required to participate in HailStorm, or if there are any licensing restrictions for developers who want to write SOAP clients that use Kerberos to access HailStorm services.

Definitions and Descriptions: This is the most audacious aspect of HailStorm, and the core of the describe-and-defend strategy. Microsoft wants to create a schema which describes all possible user transactions, and then copyright that schema, in order to create and manage the ontology of life on the Internet. In HailStorm as it was described, all entities, methods, and transactions will be defined and mediated by Microsoft or Microsoft-licensed developers, with Microsoft acting as a kind of arbiter of descriptions of electronic reality:

The initial release of HailStorm provides a basic set of possible services users and developers might need. Beyond that, new services (for example, myPhotos or myPortfolio) and extensions will be defined via the Microsoft Open Process with developer community involvement. There will be a single schema for each area to avoid conflicts that are detrimental to users (like having both myTV and myFavoriteTVShows) and to ensure a consistent architectural approach around attributes like security model and data manipulation. Microsoft’s involvement in HailStorm extensions will be based on our expertise in a given area.

The business difficulties with such a system are obvious. Will the airline industry help define myFrequentFlierMiles, copyright Microsoft, when Microsoft also runs the Expedia travel service? Will the automotive industry sign up to help the owner of CarPoint develop myDealerRebate?

Less obvious but potentially more dangerous are the engineering risks in a single, global schema, because there are significant areas where developers might legitimately disagree about how resources should be arranged. Should business users record the corporate credit card as a part of myWallet, alongside their personal credit card, or as part of myBusinessPayments, alongside their EDI and purchase order information? Should a family’s individual myCalendars be a subset of ourCalendar, or should they be synched manually? Is it really so obvious that there is no useful distinction between myTV (the box, through which you might also access DVDs and even WebTV) and myFavoriteTVShows (the list of programs to be piped to the TiVo)?
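To make the disagreement concrete, here is a sketch in plain data structures (field names invented) of two equally defensible arrangements of the same corporate credit card; a single global schema has to bless one and foreclose the other.

    # Two equally defensible arrangements of the same data; field names invented.
    # Arrangement A: the corporate card lives in myWallet, next to personal cards.
    arrangement_a = {
        "myWallet": {
            "cards": [
                {"label": "personal Visa", "owner": "personal"},
                {"label": "corporate Amex", "owner": "employer"},
            ],
        },
    }

    # Arrangement B: business instruments live under a separate myBusinessPayments,
    # alongside EDI and purchase-order information.
    arrangement_b = {
        "myWallet": {
            "cards": [{"label": "personal Visa", "owner": "personal"}],
        },
        "myBusinessPayments": {
            "cards": [{"label": "corporate Amex", "owner": "employer"}],
            "purchase_orders": [],
        },
    }

    # A single global schema must bless one arrangement; every application that
    # disagrees then has to contort its data to fit the chosen layout.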

Microsoft proposes to take over all the work of defining the conceptual entities of the system, promising that this will free developers to concentrate their efforts elsewhere:

By taking advantage of Microsoft’s significant investment in HailStorm, developers will be able to create user-centric solutions while focusing on their core value proposition instead of the plumbing.

Unmentioned is what developers whose core value proposition is the plumbing are to do with HailStorm’s global schema. With Hailstorm, Microsoft proposes to divide the world into plumbers and application developers, and to take over the plumbing for itself. This is analogous to the split early in its history when Microsoft wrote the DOS operating system, and let other groups write the software that ran on top of DOS.

Unlike DOS, which could be tied to a single reference platform — the “IBM compatible” PC — HailStorm is launching into a far more heterogeneous environment. However, this also means that the competition is far more fragmented, and given the usefulness of HailStorm to developers who want to offer Web services without rethinking identity or authentication from the ground up (one of the biggest hurdles to widespread use of Sun’s JXTA), and the possible network effects that a global credentials schema could create, HailStorm could quickly account for a plurality of Internet users. Even a 20% share of every transaction made by every Internet user would make Microsoft by far the dominant player in the world of Web services.

Non-Microsoft Participation

With HailStorm, Microsoft has abandoned tying its major software offerings to its client operating systems. Even if every operating system it has — NT/Win2k, PocketPC, Stinger, et al — spreads like kudzu, the majority of the world’s non-PC devices will still not be controlled by Microsoft in any short-term future. By adopting open standards such as XML and SOAP, Microsoft hopes to attract the world’s application developers to write for the HailStorm system now or soon, and by owning the authentication and schema of the system, they hope to be the mediator of all HailStorm users and transactions, or the licenser of all members of the HailStorm federation.

Given the decentralization on the client-side, where a Java program running on a Linux box could access Hailstorm, the obvious question is “Can a HailStorm transaction take place without talking to Microsoft owned or licensed servers?”

The answer seems to be no, for two, and possibly three, reasons.

First, you cannot use a non-Passport identity within HailStorm, and at least for now, that means that using HailStorm requires a Microsoft-hosted identity.

Second, you cannot use a non-Microsoft copyrighted schema to broker transactions within HailStorm, nor can you alter or build on existing schema without Microsoft’s permission.

Third, developers might not be able to write HailStorm services or clients without using the Microsoft-extended version of Kerberos.

At three critical points in HailStorm, Microsoft is using an open standard (email address, Kerberos, SOAP) and putting it into a system it controls, not through software licensing but through copyright (Passport, Kerberos MS, HailStorm schema). By making the system transparent to developers but not freely extensible, Microsoft hopes to gain the growth that comes with openness, while avoiding the erosion of control that also comes with openness.

This is a strategy many companies have tried before — sometimes it works and sometimes it doesn’t. Compuserve collapsed while pursuing a partly open/partly closed strategy, while AOL flourished. Linux has spread remarkably with a completely open strategy, but many Linux vendors have suffered. Sun and Apple are both wrestling with “open enough to attract developers, but closed enough to stave off competitors” strategies with Solaris and OS X respectively.

Hailstorm will not be launching in any real way until 2002, so it is too early to handicap Microsoft’s newest entrant in the “open for users but closed for competitors” category. But if it succeeds at even a fraction of its stated goals, Hailstorm will mark the full-scale arrival of Web services and set the terms of both competition and cooperation within the rest of the industry.

P2P Backlash!

First published on O’Reilly’s OpenP2P.

The peer-to-peer backlash has begun. On the same day, the Wall St. Journal ran an article by Lee Gomes entitled “Is P2P plunging off the deep end?”, while Slashdot’s resident commentator, Jon Katz, ran a review of O’Reilly’s Peer-to-Peer book under the title “Does peer-to-peer suck?”

It’s tempting to write this off as part of the Great Wheel of Hype we’ve been living with for years:

New Thing happens; someone thinks up catchy label for New Thing; press picks up on New Thing story; pundits line up to declare New Thing “Greatest Since Sliced Bread.” Whole world not transformed in matter of months; press investigates further; New Thing turns out to be only best thing since soda in cans; pundits (often the same ones) line up to say they never believed it anyway.

This quick reversal is certainly part of the story here. The Journal quoted entrepreneurs and investors recently associated with peer-to-peer who are now distancing themselves from the phrase in order to avoid getting caught in the backlash. There is more to these critiques than business people simply repositioning themselves when the story crescendos, however, because each of the articles captures something important and true about peer-to-peer.

Where’s the money?

The Wall St. Journal’s take on peer-to-peer is simple and direct: it’s not making investors any money right now. Mr. Gomes notes that many of the companies set up to take advantage of file sharing in the wake of Napster’s successes have hit hard times, and that Napster’s serious legal difficulties have taken the bloom off the file-sharing rose. Meanwhile, the distributed computing companies have found it hard to get either customers or investors, as the closing of Popular Power and the difficulty the remaining companies have had in finding customers have highlighted.

Furthermore, Gomes notes that P2P as a label has been taken on by many companies eager to seem cutting edge, even those whose technologies have architectures that differ scarcely at all from traditional client-server models. The principal critiques Gomes makes — that P2P is neither a well-defined business sector nor a well-defined technology — are both sensible. From a venture capitalist’s point of view, P2P is too broad a category to be a real investment sector.

Is P2P even relevant?

Jon Katz’s complaints about peer-to-peer are somewhat more discursive, but seem to center on its lack of a coherent definition. Like Gomes, he laments the hype surrounding peer-to-peer, riffing off a book jacket blurb that overstates peer-to-peer’s importance, and goes on to note that the applications grouped together under the label peer-to-peer differ from one another in architecture and effect, often quite radically.

Katz goes on to suggest that interest in P2P is restricted to a kind of techno-elite, and is unlikely to affect the lives of “Harry and Martha in Dubuque.” While Katz’s writing is not as focused as Gomes’, he touches on the same points: there is no simple definition for what makes something peer-to-peer, and its application in people’s lives is unclear.

The unspoken premise of both articles is this: if peer-to-peer is neither a technology nor a business model, then it must just be hot air. There is, however, a third possibility besides “technology” and “business.” The third way is simply this: Peer-to-peer is an idea.

Revolution convergence

As Jon Orwant noted recently in these pages, “Peer-to-peer is not a technology, it’s a mindset.” Put another way, peer-to-peer is a related group of ideas about network architecture, ideas about how to achieve better integration between the Internet and the personal computer — the two computing revolutions of the last 15 years.

The history of the Internet has been told often — from the late ’60s to the mid-’80s, DARPA, an agency of the Department of Defense, commissioned work on a distributed computer network that used packet switching as a way to preserve the fabric of the network even if any given node failed.

The history of the PC has likewise been often told, with the rise of DIY kits and early manufacturers of computers for home use — Osborne, Sinclair, the famous Z-80, and then the familiar IBM PC and with it Microsoft’s DOS.

In an accident of history, both of those movements were transformed in January 1984, and began having parallel but increasingly important effects on the world. That month, a new plan for handling DARPA net addresses was launched. Dreamed up by Vint Cerf, this plan was called the Internet Protocol, and required changing the addresses of every node on the network over to one of the new IP addresses, a unique, global, and numerical address. This was the birth of the Internet we have today.

Meanwhile, over at Apple Computer, January 1984 saw the launch of the first Macintosh, the computer that popularized the graphic user interface (GUI), with its now familiar point-and-click interactions and desktop metaphor. The GUI revolutionized the personal computer and made it accessible to the masses.

For the next decade, roughly 1984 to 1994, both the Internet and the PC grew by leaps and bounds, the Internet as a highly connected but very exclusive technology, and the PC as a highly dispersed but very inclusive technology, with the two hardly intersecting at all. One revolution for the engineers, another for the masses.

The thing that changed all of this was the Web. The invention of the image tag, as part of the Mosaic browser (ancestor of Netscape), brought a GUI to the previously text-only Internet in exactly the same way that, a decade earlier, Apple brought a GUI to the previously text-only operating system. The browser made the Internet point-and-click easy, and with that in place, there was suddenly pressure to fuse the parallel revolutions, to connect PCs to the Internet.

Which is how we got the mess we have today.

First and second-class citizens

In 1994, the browser created sudden pressure to wire the world’s PCs, in order to take advantage of the browser’s ability to make the network easy to use. The way the wiring happened, though — slow modems, intermittent connections, dynamic or even dummy IP addresses — meant that the world’s PCs weren’t really being connected to the Internet so much as they were being hung off its edges, with the PC acting as no more than a life-support system for the browser. Locked behind their slow modems and impermanent addresses, the world’s PC owners have for the last half-dozen years been the second-class citizens of the Internet.

Anyone who wanted to share anything with the world had to find space on a “real” computer, which is to say a server. Servers are the Net’s first-class citizens, with real connectivity and a real address. This is how the Geocities and Tripods of the world made their names, arbitraging the distinction between the PCs that were (barely) attached to the network’s edge and the servers that were fully woven into the fabric of the Internet.

Big, sloppy ideas

Rejection of this gap between client and server is the heart of P2P. As both Gomes and Katz noted, P2P means many things to many people. PC users don’t have to be second-class citizens. Personal computers can be woven directly into the Internet. Content can be provided from the edges of the network just as surely as from the center. Millions of small computers, with overlapping bits of content, can be more reliable than one giant server. Millions of small CPUs, loosely coupled, can do the work of a supercomputer.

These are sloppy ideas. It’s not clear when something stops being “file sharing” and starts being “groupware.” It’s not clear where the border between client-server and peer-to-peer is, since the two-way Web moves power to the edges of the network while Napster and ICQ bootstrap connections from a big server farm. It’s not clear how ICQ and SETI@Home are related, other than deriving their power from the network’s edge.

No matter. These may be sloppy ideas, ideas that don’t describe a technology or a business model, but they are also big ideas, and they are also good ideas. The world’s Net-connected PCs host, both individually and in aggregate, an astonishing amount of power — computing power, collaborative power, communicative power.

Our first shot at wiring PCs to the Internet was a half-measure — second-class citizenship wasn’t good enough. Peer-to-peer is an attempt to rectify that situation, to really integrate personal devices into the Internet. Someday we will not need a blanket phrase like peer-to-peer, because we will have a clearer picture of what is really possible, in the same way the arrival of the Palm dispensed with any need to talk about “pen-based computing.” 

In the meantime, something important is happening, and peer-to-peer is the phrase we’ve got to describe it. The challenge now is to take all these big sloppy ideas and actually do something with them, or, as Michael Tanne of XDegrees put it at the end of the Journal article:

“P2P is going to be used very broadly, but by itself, it’s not going to create new companies. …[T]he companies that will become successful are those that solve a problem.”

Time-Warner and ILOVEYOU

First published in FEED, 05/00.

Content may not be king, but it was certainly making headlines last week. From the “content that should have been distributed but wasn’t” department, Time Warner’s spectacularly ill-fated removal of ABC from its cable delivery lineup ended up cutting off content essential to the orderly workings of America — Who Wants to Be A Millionaire? Meanwhile, from the “content that shouldn’t have been distributed but was” department, Spyder’s use of a loosely controlled medium spread content damaging to the orderly workings of America and everywhere else — the ILOVEYOU virus. Taken together, these events are making one message increasingly obvious: The power of corporations to make decisions about distribution is falling, and the power of individuals as media channels in their own right is rising.

The week started off with Time Warner’s effort to show Disney who was the boss, by dropping ABC from its cable lineup. The boss turned out to be Disney, because owning the delivery channel doesn’t give Time Warner half the negotiating leverage the cable owners at Time Warner thought it did. Time Warner was foolish to cut off ABC during sweeps month, when Disney had legal recourse, but their real miscalculation was assuming that owning the cable meant owning the customer. What had ABC back on the air and Time Warner bribing its customers with a thirty-day rebate was the fact that Americans resent any attempt to interfere with the delivery of content, legal issues or no. Indeed, the aftermath saw Peter Vallone of the NY City Council holding forth on the right of Americans to watch television. It is easy to mock this attitude, but Vallone has a point: People have become accustomed to constantly rising media access, from three channels to 150 in a generation, with the attendant rise in user access to new kinds of content. Any attempt to reintroduce artificial scarcity by limiting this access now creates so much blind fury that television might as well be ranked alongside water and electricity as utilities. The week ended as badly for Time Warner as it began, because even though their executives glumly refused to promise never to hold their viewers hostage as a negotiating tactic, their inability to face the wrath of their own paying customers had been exposed for all the world to see.

Meanwhile, halfway round the world, further proof of individual leverage over media distribution was mounting. The ILOVEYOU virus struck Thursday morning, and in less than twenty-four hours had spread further than the Melissa virus had in its entire life. The press immediately began looking for the human culprit, but largely missed the back story: The real difference between ILOVEYOU and Melissa was not the ability of Outlook to launch programs from within email, a security hole unchanged since last year. The real difference was the delivery channel itself — the number and interconnectedness of e-mail users — which makes ILOVEYOU more of a media virus than a computer virus. The lesson of a virus that starts in the Philippines and ends up flooding desktops from London to Los Angeles in a few hours is that while email may not be a mass medium that reaches millions at the same time, it has become a massive one, reaching tens of millions in mere hours, one user at a time. With even a handful of globally superconnected individuals, the transmission rates for e-mail are growing exponentially, with no end in sight, either for viruses or for legitimate material. The humble practice of forwarding e-mail, which has anointed The Onion, Mahir, and the Dancing Baby as pop-culture icons, has now crossed one of those invisible thresholds, making it a new kind of force — e-mail as a media channel more global than CNN. As the world grows more connected, the idea that individuals are simply media consumers looks increasingly absurd — anyone with an email address is in fact a media channel, and in light of ILOVEYOU’s success as a distribution medium, we may have to revise that six degrees of separation thing downwards a little.
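The arithmetic behind that exponential claim is easy to sketch. A back-of-envelope model (fan-out and hop-time figures invented for illustration, not a simulation of ILOVEYOU itself):

    # Back-of-envelope model of email forwarding, not a simulation of ILOVEYOU
    # itself. Fan-out and hop-time figures are invented for illustration.
    def reach(fanout: int, hops: int) -> int:
        """Total recipients after `hops` rounds of forwarding from one sender."""
        return sum(fanout ** h for h in range(1, hops + 1))

    if __name__ == "__main__":
        # If each recipient forwards to 10 new people and a hop takes about two
        # hours, seven hops (roughly 14 hours) already reach over ten million.
        for hops in range(1, 8):
            print(f"after {hops} hops (~{2 * hops} hours): {reach(10, hops):,} recipients")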

Both Time Warner’s failure and ILOVEYOU’s success spread the bad news to several parties: TV cable companies, of course, but also cable ISPs, who hope to use their leverage over delivery to hold Internet content hostage; the creators of WAP, who hope to erect permanent tollbooths between the Internet and the mobile phone without enraging their subscribers; governments who hoped to control their citizens’ access to “the media” before e-mail turned out to be a media channel as well; and everyone who owns copyrighted material, for whom e-mail attachments threaten to create hundreds of millions of small leaks in copyright protection. (At least Napster has a business address.) There is a fear, shared by all these parties, that decisions about distribution — who gets to see what, when — will pass out of the hands of governments and corporations and into the hands of individuals. Given the scale of the vested interests at stake, this scenario is still at the outside edges of the imaginable. But when companies that own the pipes can’t get any leverage over their users, and when users with access to e-mail can participate in a system whose ubiquity has been so dramatically illustrated, the scenario goes from unthinkable to merely unlikely.

The Real Wireless Innovators

First published on Biz2, 04/09/2001.

There is a song making the rounds in the wireless world right now that goes a little something like this: “WAP was overhyped by the media, but we never expected it to be a big deal. The real wireless action is coming in the future, from things such as 3G and m-commerce. There is nothing wrong with what we are doing; wireless is simply taking a while to develop, just as the Web did.”

Don’t believe it. The comparison between the early days of the Web and wireless is useful, but it is anything but favorable to wireless. The comparison actually highlights what has gone wrong with wireless data services so far, and how much ground the traditional wireless players are giving up to new competitors, who have a much better idea of what users want and a much longer history of giving it to them.

As anyone who was around in 1993 can tell you, the Web was useful right out of the box. Even in the days of the text-only Internet, Tim Berners-Lee’s original Web browser blew the other text-only search tools such as Archie and Gopher right out of the water. Unlike WAP, the Web got to where it is today by being useful when it launched, and staying useful every single day since.

Contrast the early user experiences with wireless data. When makers of wireless phones first turned their efforts to data services, they proposed uses for the wireless Web that ranged from the unimaginative (weather forecasts) to the downright ghastly (ads that ring your phone when you walk by a store).

Because the phone companies thought they owned their customers, it never occurred to them that a numeric keypad and a tiny screen might not be adequate for email. They seem to have actually believed that they had all the time in the world to develop their wireless data offerings; after all, who could possibly challenge them? So they have allowed companies that understand flexible devices, such as Motorola and Research in Motion, to walk away with the wireless email market, the once and future killer app.

Wireless telcos would like you to believe that these are all just growing pains, but there is another explanation for the current difficulties of the wireless sector: Telephone companies are not very good at producing anything but telephones. Everything about the telcos (makers of inflexible hardware, with a form factor optimized for voice, and notoriously bad customer service) suggests that they would be the last people on Earth you would trust to create a good experience with things such as wireless email or portable computing.

As always, the great exception here is NTT DoCoMo, which had the sense to embrace HTML (actually, a subset called compact HTML) and let anyone build content that its i-mode device could read. And NTT DoCoMo also made sure the services it provides do something its customers are interested in, and in many cases are willing to pay for.

The technology is not the difficult part of making useful wireless devices. The companies creating good wireless customer experiences (Research in Motion with its BlackBerry, Apple Computer with its AirPort wireless networking technology, and Motorola with its Talkabout) are companies that know how to create good customer experiences, period. If you know what customers want and how to give it to them, it is easier to go wireless than if you know only wireless technology and have to figure out what customers want.

Own worst enemies

The difficulties in the early days of wireless data had nothing to do with telcos needing time to develop their services. Instead, those difficulties were caused by the telcos’ determination to maintain a white-knuckled grip on their customers, a determination that made them unwilling to embrace existing standards or share revenue with potential affiliates. Ironically, this grip has made it easier, not more difficult, for competitors to muscle in, because the gap between what users want and what the telcos were providing was so large.

The wireless sector is slowly melting, becoming part of lots of other businesses. If you want to know who will create a good wireless shopping experience, bet on Amazon.com, not Ericsson. If you want to know who will create the best m-commerce infrastructure, look to Citibank, not Nokia. Contrary to the suggestion that the wireless sector will live apart from the rest of the technology landscape, wireless is an adjective: the things that make a good wireless personal digital assistant or a good wireless computer are very different from those that make a good wireless phone.

This is not to say there isn’t a fortune to be made in supplying wireless phones. Nor is being a wireless network for BlackBerrys and Talkabouts a bad business; as I write this column, GoAmerica Communications is doing quite well.

But the real breakout wireless services are being launched not by the telcos but by innovative device and service companies who think of wireless as a feature, not as an end in itself.