In-Room Chat as a Social Tool

First published on O’Reilly’s OpenP2P on December 26, 2003.

This fall, I hosted a two-day brainstorming session for 30 or so people on the subject of social software. The event, sponsored by Cap-Gemini’s Center for Business Innovation and Nokia’s Insight and Foresight unit, took place in an open loft, and in addition to the usual “sit around a big table and talk to each other” format, we set up an in-room chat channel accessible over the WiFi network. We hosted the chat using Greg Elin’s modifications to Manuel Kiessling’s lovely ARSC (A Really Simple Chat) software. (Greg and I had used a similar setup in a somewhat different setting, and we were determined to experiment further at the social software event.)

The in-room chat created a two-channel experience — a live conversation in the room, and an overlapping real-time text conversation. The experiment was a strong net positive for the group. Most social software is designed as a replacement for face-to-face meetings, but the spread of permanet (connectivity like air) provides opportunities for social software to be used by groups who are already gathered in the same location. For us, the chat served as a kind of social whiteboard. In this note, I want to detail what worked and why, what the limitations and downsides of in-room chat were, and point out possible future avenues for exploration.

The Setup

The setup was quite simple. We were working in a large open loft, seated around a ring of tables, and we connected a WiFi hub to the room’s cable modem. Most of the participants had WiFi-capable laptops, and ARSC works in a browser, so there were no client issues. We put a large plasma screen at one end of the room.

We created a chat room for the event, and asked the participants using the chat to log in using their first name. In addition, we created a special username, Display, which we logged into a machine connected to the plasma screen. The Display interface had no text-entry field, suppressed control messages, and had its font set very large. This maximized screen real estate for user-entered messages, and made them readable even by participants sitting 10 meters away (though it minimized the amount of scroll-back visible on the plasma screen.)


Figure 1. The conference room.


Figure 2. Private and public screens.

Photos courtesy http://www.heiferman.com

We made the participants aware of the chat room at the beginning of the event, and set no other rules for its use (though at one point, we asked that people only use the chat room, saying nothing out loud for half an hour.) The chat room was available throughout the meeting. The first 10 minutes of the chat were the usual set of test messages, hellos, and other “My hovercraft is full of eels” randomness, but once the meeting got rolling, the chat room became an invaluable tool.

The Advantages

The chat room created several advantages.

1. It changed interrupt logic

Group conversations are exercises in managing interruptions. When someone is speaking, the listeners are often balancing the pressure to be polite with a desire to interrupt, whether to add material, correct or contradict the speaker, or introduce an entirely new theme. These interruptions are often tangential, and can lead to still more interruptions or follow-up comments by still other listeners. Furthermore, conversations that proceed by interruption are governed by the people best at interrupting. People who are shy, polite, or like to take a moment to compose their thoughts before speaking are at a disadvantage.

Even with these downsides, however, the tangents can be quite valuable, so if an absolute “no interrupt” rule were enforced, at least some material of general interest would be lost, and the frustration level among the participants consigned solely to passive listening would rise considerably. 

The chat room undid these effects, because participants could add to the conversation without interrupting, and the group could pursue tangential material in the chat room while listening in the real room. It was remarkable how much easier it was for the speaker to finish a complex thought without being cut off. And because chat participants had no way of interrupting one another in the chat room, even people not given to speaking out loud could participate. Indeed, one of our most active participants contributed a considerable amount of high-quality observation and annotation while saying almost nothing out loud for two days.

2. “Note to self” became “Note to world”

The more successful a meeting, the more “note to self” moments happen, where a light goes off in someone’s head, and they are moved to write the insight down for later examination. The chat channel provided an interesting alternative to personal note-taking, which was group note-taking. By entering “notes to self” into the chat, participants could both archive thoughts (the chat was logged) and share those thoughts with the rest of the room to see what reactions they might spark. This is slightly different than simply altering interrupt logic, and more along the lines of Cory Doctorow’s “outboard brain” idea, because in this case, the chat was capturing material that would not otherwise have been shared with the group.

3. High-quality text annotation

What the spoken word has in emotive quality, it lacks in precision. Much interesting material thrown out during the course of group conversations is difficult to capture in an ideal form. When taking notes, it’s easy to misspell a name or mis-punctuate a URL, and things committed solely to memory can be difficult to retrieve later (“Somebody said something about a researcher in Oregon? Uruguay? The name began with a G …”). Comments in the chat log solved these problems — if the attendees were talking about Gerd Korteum’s work, or the Kuro5hin website, the spelling and punctuation were unambiguous.

There were two additional effects that improved the quality of the text annotation. Because everyone was connected to the Web, not just the local chat, the participants could Google for Web sites and quotes before they posted. (At one point during the Friday session, a fierce rain started, and someone pasted the US Weather service advisory for the area into the chat.) And because ARSC turns URLs into links, the rest of the group could click on a link in the chat window when it was added, so that new material could be glanced at and bookmarked in context, rather than hours or days later.

The annotation was also affected by the one-way relation between the real world conversation and the chat. Though it’s too early to know whether this was a bug or a feature, themes from the real world conversation were constantly reflected in the chat room, but almost never vice-versa. This suggests that the participants regarded the chat as a place for ancillary comments, rather than a separate but equal conversational space.

4. Less whispering, more \whispering

People whisper to one another during conferences, sometimes for good reasons (“What does UDDI mean?”) and sometimes for not-so-good ones (“So this Estonian guy goes into a bar …”). Like interrupting, however, a blanket “No whispering” ban would throw the good out with the bad, and would reduce the quality of the experience for the attendees. Furthermore, even when there is a good reason to whisper to someone, the larger the conference, the likelier it is you won’t be seated next to them.

The \whisper command in ARSC means that, topologically, everyone is seated next to everyone else. By typing “\whisper Rusty,” a participant could send a point-to-point message to Rusty without disrupting the meeting. Though the whispers weren’t logged, an informal poll at the end of the second day showed that a large majority of chat room participants had used \whisper at some point.

Ironically, the effectiveness of the \whisper command was somewhat limited by the “split screen” focus between the room and the chat. Because \whisper requests went to the invitee’s laptop, if someone was looking away from their screen for a few minutes, they would miss the invitation, since \whisper requests didn’t go to the plasma screen. One user suggested the addition of a \pssst function of some sort to get someone’s attention. Another possibility would be making a second \whisper-only window, so that \whisper conversations could be more asynchronous.

5. Alleviated boredom

Groups of people have diverse interests, so no matter how scintillating a meeting is overall, at some point someone is going to find the subject at hand dull. The in-room chat helped alleviate this boredom, while keeping the participants talking to one another about the subject at hand.

This is the advantage hardest to understand in the abstract. When I talk about the in-room chat, people often ask “But isn’t that distracting? Don’t you want to make people pay attention to the speaker?” This is similar to the question from the early days of the Web: “But why have any outside links at all? Don’t you want to make people stay on your site?”

Once you assume permanet, whether from WiFi, Ricochet, or GPRS, this logic crumbles. Anyone with a laptop or phone can, if they are bored, turn to the Internet, and the question becomes “Given that attendees will be using the network, would you rather have them talking to one another, or reading Slashdot?” The people who gathered in NYC came to converse with one another, and the in-room chat provided a way for them to meet that goal even when they were not riveted by the main event.

The Context

Chat as a meeting tool isn’t a universally good idea, of course. Every successful use of social software has environmental factors working in its favor.

First and foremost, the attendees were tech-savvy people who travel with WiFi-capable laptops and think about novel uses of social software, so they were inclined to want to use something like ARSC, even if only as an experiment. There was no resistance to trying something so odd and unproven, as there might be in less-techie groups.

The group was also self-motivated. Because their attendance was optional, they largely stayed on-topic. One can easily imagine that in a meeting where attendance is passive and forced (“The boss wants everyone in the conference room at 5:45”) the contents of the chat would be much more tangential (to say nothing of libelous). Since most parliamentary rules, whether formal or informal, begin with the premise that only one person can speak at once, and then arrange elaborate rules for determining who can speak when, the presence of an alternate channel could severely disrupt highly-structured meetings, such as client conferences or legal negotiations. Whether this would be a bug or a feature depends on your point of view.

The goals of the meeting were in synch with the experience the chat room offered. We were not trying to forge a working group, get to consensus, or even converge on a small set of ideas. Indeed, the goals of the meeting were explicitly divergent, trying to uncover and share as much new material as possible in a short period. The chat room aided this goal admirably.

The scale of the meeting also worked in our favor. The group was large enough that sitting around a table with a laptop open wasn’t rude or disruptive, but small enough that everyone could use a single chat room. At one point during the Saturday session, we broke into small groups to brainstorm around specific problems, and though there was no explicit request to do so, every single member of the group shut their laptop screens for two hours. Groups of six are small enough that all the members can feel engaged with the group, and the chat would have been much less useful and much more rude in that setting.

On the other hand, whenever things got really active on the chat channel (we averaged about four posts a minute, but it sometimes spiked to 10 or so), people complained about the lack of threading, suggesting that 30 was at or near an upper limit for participation.

Meeting Structure

There were also some more technical or formal aspects of the meeting that worked in our favor.

The plasma screen showing the Display view was surprisingly important. We had not announced the WiFi network or chat channel in advance, and we had no idea how many people would bring WiFi-capable laptops. (As it turned out, most did.) The plasma screen was there to share the chat room’s contents with the disconnected members. However, the screen also added an aspect of social control — because anything said in the chat room was displayed openly, it helped keep the conversation on-topic. Curiously, this seemed to be true even though most of the room was reading the contents of the chat on their laptop screens. The plasma screen created a public feeling without actually exposing the contents to a “public” different from the attendees.

During a brief post-mortem, several users reported using the plasma screen for chunking, so that they could mainly pay attention to the speaker, but flash their eyes to the screen occasionally to see what was in the chat room, taking advantage of the fact that most people read much faster than most speakers talk. (Viz. the horror of the speaker who puts up a PowerPoint page, and then reads each point.)

There were two bits of organizational structure that also helped shape the meeting. The first was our adoption of Jerry Michalski’s marvelous “Red Card/Green Card” system, where participants were given a set of cards, about 20 cm square, in three colors: red, green, and gray. The cards were used to make explicit but non-verbal commentary on what was being said at the time. A green card indicated strong assent, red strong dissent, and gray confusion.

In an earlier experiment with ARSC, Greg added virtual cards; users could click on red or green icons and have those added to the chat. This proved unsatisfying, and for this meeting we went back to the use of physical cards. The use of the cards to indicate non-verbal and emotive reactions seemed to provide a nice balance with the verbal and less emotive written material. At one point, we spent half an hour in conversation with the only rule being “No talking.” The entire room was chatting for 30 minutes, and even in that situation, people would physically wave green cards whenever anyone posted anything particularly worthy in the chat room.

While the no-talking experiment was interesting, it was not particularly useful. One of the participants whose work was being discussed in the chat (he had just finished talking when we entered the no-talking period) reported missing the actual verbal feedback from colleagues. The chat comments made about his ideas, while cogent, lacked the emotional resonance that makes face-to-face meetings work. By enforcing the no-talking rule, we had re-created some of the disadvantages of virtual meetings in a real room.

The other bit of organizational structure was borrowed from Elliott Maxwell and the Aspen Institute, where participants wanting to speak would turn their name cards vertically, thus putting comments in a queue. This was frustrating for many of the participants, who had to wait several minutes to react to something. This also severely altered interrupt logic. (At several points, people gave up their turn to speak, saying “the moment has passed.”) Despite the frustration it caused, this system kept us uncovering new material as opposed to going down rat holes, and it made the chat room an attractive alternative for saying something immediate.

The Disadvantages

Though we found ARSC to be a useful addition to the meeting, there was an unusually good fit between the tool and the environment. For every favorable bit of context listed above, there is a situation where an in-room chat channel would be inappropriate. Meetings where the attention to the speaker needs to be more total would be problematic, as would situations where a majority of the audience is either not connected or uncomfortable with chat.

Even in this group, not everyone had a laptop, and for those people, the chat contents were simply a second channel of information that they could observe but not affect. Absolute ubiquity of the necessary hardware is some way off for even tech-savvy groups, and several years away, at least, for the average group. Any meeting wanting to implement a system like this will have to take steps to make the chat optional, or to provide the necessary hardware where it is lacking.

It may also be that increasing phone/PDA fusion will actually reduce the number of laptops present at meetings. Using social tools at events where phones are the personal device of choice will require significant additional thought around the issues of small screens, thumb keyboards, and other ergonomic issues.

In-room chat is unlikely to be useful for small groups (fewer than a dozen, at a guess), and its usefulness for groups larger than 30 may also be marginal (though in that case, ways of providing multiple chat rooms may be helpful).

Given that the most profound effects of the chat were in changing interrupt logic, many of the downsides came from the loss of interruption as a tool. As annoying as interruptions may be, they help keep people honest, by placing a premium on brevity, and on not spinning castles in the air. Without interruption, the speaker loses an important feedback mechanism, and may tend towards long-winded and unsupported claims. At the very least, the use of in-room chat puts a premium on strong moderation of some sort, to make up for the structural loss of interruption.

Perhaps most importantly, it will almost certainly be unhelpful for groups that need to function as a team. Because the two-track structure encourages a maximum number of new and tangential items being placed together, it would probably be actively destructive for groups where consensus was a goal. As Steven Johnson has noted about the event, the chat room moved most of the humor from real world interjections to network ones, which preserved the humor but suppressed the laughter. (Most of the time when people write “lol,” they aren’t.) Though this helps on the “interrupt logic” front, it also detracts from building group cohesion.

This experiment was relatively small and short, having been applied to one group over two days. It would be interesting to know what the effect would be for groups meeting for longer periods. On the first day, we averaged not quite three and a half posts a minute, occasionally spiking to nine or 10. On the second day, the average rose to just over four posts a minute, but the spikes didn’t change, suggesting that users were becoming more accustomed to a steady pace of posting.

During the half-hour of chat only/no talking out loud, the average nearly tripled, to over 11 posts a minute, occasionally spiking to 18, suggesting that during the normal sessions, users were paying what Linda Stone calls “continuous partial attention” to both the chat room and the real room; with the structures of the real room artificially suppressed, the chat room exploded. (The question of whether the change in posting rate was uniform among all users is difficult to answer with the small sample data.)

Several additional experiments suggest themselves: 

  • For conferences whose sessions average no more than 30 users, having each room assigned its own chat channel could let users listening to the same talk find one another with little effort. Likewise, finding ways of forking large groups into multiple chats might be worthwhile, whether using specific characteristics (UI designers in one, information architects in another, and so on) or arbitrary ones (even or odd date of birth) to keep the population of any one channel in the 12-25 range. (The lack of plasma screen as a public mediating factor might be an issue.)
  • ARSC translates URLs into clickable links. This suggests other regular expressions that could be added (a rough sketch follows this list). “A:” plus an author name or “T:” plus a title could be turned into Amazon lookups. A QuoteBot might be very useful, as several times during the two days someone asked “Who said ‘Let the wild rumpus start’?” A social network analysis bot might be interesting, logging things like most and least frequent posters, and social clustering, and reflecting those back to the group. (Cameron Marlow of blogdex wrote a simple program during the meeting to display the number of chat posts per user.)
  • The social network angle, of course, is hampered by the lack of threading in chat. This is obviously a hard problem, but several people wondered whether there might be a lightweight way to indicate who you are responding to, to create rudimentary threading. This may be a problem best fixed socially. If we had asked people to adopt the general IRC convention of posting with the name of the recipient first (“greg: interesting idea, but almost certainly illegal”), we might have gotten much better implicit threading. This in turn would have been greatly helped by tab-completion of nicknames in ARSC.
  • The \whisper function is secret, rather than private. In a real meeting, seeing who is having a side conversation can allow the group as a whole to feel the overall dynamic, so a private \whisper function might be an interesting addition, entering lines into the chat like “Clay whispers to Cameron,” but providing no information about the content of those conversations.
  • Likewise, it might be useful to flag interesting or relevant posts for later review. If someone says something particularly cogent, other users could click a link next to the post labeled “Archive me,” and in addition to appearing in the general log, such posts would go to a second “flagged comments only” log.
  • Greg provided a polling function, but you had to click off the chat page to get there, and it was only used once, as a test. Given this failure, polling and voting functions may need to appear directly in the chat room to be useful. Bots are an obvious interface to do this as well. A PollingBot could ask questions in the chat room and accept answers by \whisper.
  • Given that meetings generally involve people looking at similar issues from different backgrounds, DCC-like user-to-user file transfer might be a valuable tool for sharing background materials among the participants, by letting them send local files as well as URLs over the chat interface.
  • We got close to the edge of IRC-style chaos, where the chat scrolls by too fast to read it. A buffering chat channel might solve this problem, by having some maximum rate set on the order of 120-150 words a minute, and then simply delaying posts that go over that limit into the next minute, and so on. This congestion queuing would let everyone say what they want to say without dampening the ability of other participants to take it all in before reaction. 
  • Finally, ARSC is server-based. With Zeroconf networking, it might be possible to set up ad hoc, peer-to-peer networks of laptops without needing to coordinate anything in advance. Likewise, while DCC-ish file transfer might be valuable for person-to-person file sharing, the ability to post “I have a draft of my article on my hard drive at such and such a local address” and have that material be as accessible as if it were on the Web would make public sharing of background materials much easier.
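As a rough sketch of the regular-expression idea in the second bullet above, the following Python fragment shows one way a chat server could rewrite messages before display. The “A:”/“T:” prefixes and the Amazon search URL are illustrative assumptions, not features of ARSC.

    import re
    from urllib.parse import quote_plus

    def linkify(message):
        """Rewrite a chat message so that simple patterns become HTML links."""
        # Bare http(s) URLs become clickable links (roughly what ARSC already does).
        message = re.sub(r'(https?://\S+)', r'<a href="\1">\1</a>', message)

        # "A: author" or "T: title" become catalog lookups; the Amazon search
        # URL here is just a stand-in for whatever lookup service was wanted.
        def lookup(match):
            kind, term = match.group(1), match.group(2).strip()
            return '<a href="https://www.amazon.com/s?k=%s">%s: %s</a>' % (
                quote_plus(term), kind, term)

        return re.sub(r'\b([AT]):\s*([^,;<]+)', lookup, message)

    print(linkify("A: Lewis Carroll, and see http://www.kuro5hin.org/"))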

Conclusion

Real world groups are accustomed to having tools to help them get their work done — flipcharts, white boards, projectors, and so on. These tools are typically used by only one person at a time. This experiment demonstrated the possibility of social tools, tools that likewise aid a real-world group, but that are equally accessible to all members at the same time. There are a number of other experiments one can imagine, from using the chat to accept input from remote locations to integrating additional I/O devices such as cameras or projectors. The core observation, though, is that under certain conditions, groups can find value in participating in two simultaneous conversation spaces, one real and one virtual.

The RIAA Succeeds Where the Cypherpunks Failed

First published December 17, 2003 on the “Networks, Economics, and Culture” mailing list. 

For years, the US Government has been terrified of losing surveillance powers over digital communications generally, and one of their biggest fears has been broad public adoption of encryption. If the average user were to routinely encrypt their email, files, and instant messages, whole swaths of public communication currently available to law enforcement with a simple subpoena (at most) would become either unreadable, or readable only at huge expense. 

The first broad attempt by the Government to deflect general adoption of encryption came 10 years ago, in the form of the Clipper Chip. The Clipper Chip was part of a proposal for a secure digital phone that would only work if the encryption keys were held in such a way that the Government could get to them. With a pair of Clipper phones, users could make phone calls secure from everyone except the Government. 

Though opposition to Clipper by civil liberties groups was swift and extreme, the thing that killed it was work by Matt Blaze, a Bell Labs security researcher, showing that the phone’s wiretap capabilities could be easily defeated, allowing Clipper users to make calls that even the Government couldn’t decrypt. (Ironically, AT&T had designed the phones originally, and had a contract to sell them before Blaze sank the project.)

The Government’s failure to get the Clipper implemented came at a heady time for advocates of digital privacy — the NSA was losing control of cryptographic products, Phil Zimmerman had launched his Pretty Good Privacy (PGP) email program, and the Cypherpunks, a merry band of crypto-loving civil libertarians, were on the cover of the second issue of Wired. The floodgates were opening, leading to…

…pretty much nothing. Even after the death of Clipper and the launch of PGP, the Government discovered that for the most part, users didn’t want to encrypt their communications. The most effective barrier to the spread of encryption has turned out to be not control but apathy. Though business users encrypt sensitive data to hide it from one another, the use of encryption to hide private communications from the Government has been limited mainly to techno-libertarians and a small criminal class.

The reason for this is the obvious one: the average user has little to hide, and so hides little. As a result, 10 years on, e-mail is still sent as plain text, files are almost universally unsecured, and so on. The Cypherpunk fantasy of a culture that routinely hides both legal and illegal activities from the state has been defeated by a giant distributed veto. Until now. 

It may be time to dust off that old issue of Wired, because the RIAA is succeeding where 10 years of hectoring by the Cypherpunks failed. When shutting down Napster turned out to have all the containing effects of stomping on a tube of toothpaste, the RIAA switched to suing users directly. This strategy has worked much better than shutting down Napster did, convincing many users to stop using public file sharing systems, and to delete MP3s from their hard drives. However, to sue users, they had to serve subpoenas, and to do that, they had to get their identities from the users’ internet service providers.

Identifying those users has had a second effect, and that’s to create a real-world version of the scenario that drove the invention of user-controlled encryption in the first place. Whitfield Diffie, inventor of public key encryption, the strategy that underlies most of today’s cryptographic products, saw the problem as a version of “Who will guard the guardians?” 

In any system where a user’s identity is in the hands of a third party, that third party cannot be trusted. No matter who the third party is, there will be at least hypothetical situations where the user does not want his or her identity revealed, but the third party chooses or is forced to disclose it anyway. (The first large scale example of this happening was the compromise of anon.penet.fi, the anonymous email service, in 1994.) Seeing that this problem was endemic to all systems where third parties had access to a user’s identity, Diffie set out to design a system that put control of anonymity directly in the hands of the user.

Diffie published theoretical work on public key encryption in 1975, and by the early 90s, practical implementations were being offered to users. However, the scenario Diffie envisioned had little obvious relevance to users, who were fairly anonymous on the internet already. Instead of worrying about possible future dangers, most users focused their privacy concerns on issues local to the PC, like hiding downloaded pornography, rather than on encrypting network traffic.

However, Diffie’s scenario, where legal intervention destroys the users’ de facto privacy wherever it is in the hands of commercial entities, is now real. The RIAA’s successful extraction of user identity from internet service providers makes it vividly clear that the veil of privacy enjoyed by the average internet user is diaphanous at best, and that the obstacles to piercing that veil are much much lower than for, say, allowing the police to search your home or read your (physical) mail. Diffie’s hypothetical problem is today’s reality. As a result, after years of apathy, his proposed solution is being adopted as well.

In response to the RIAA’s suits, users who want to share music files are adopting tools like WINW and BadBlue, that allow them to create encrypted spaces where they can share files and converse with one another. As a result, all their communications in these spaces, even messages with no more commercial content than “BRITN3Y SUX!!!1!” are hidden from prying eyes. This is not because such messages are sensitive, but rather because once a user starts encrypting messages and files, it’s often easier to encrypt everything than to pick and choose. Note that the broadening adoption of encryption is not because users have become libertarians, but because they have become criminals; to a first approximation, every PC owner under the age of 35 is now a felon. 

The obvious parallel here is with Prohibition. By making it unconstitutional for an adult to have a drink in their own home, Prohibition created a cat and mouse game between law enforcement and millions of citizens engaged in an activity that was illegal but popular. As with file sharing, the essence of the game was hidden transactions — you needed to be able to get into a speakeasy or buy bootleg liquor without being seen.

This requirement in turn created several long-term effects in American society, everything from greatly increased skepticism of Government-mandated morality to broad support for anyone who could arrange for hidden transactions, including organized crime. Reversing the cause did not reverse the effects; both the heightened skepticism and the increased power of organized crime lasted decades after Prohibition itself was reversed.

As with Prohibition, so with file sharing — the direct effects from the current conflict are going to be minor and over quickly, compared to the shifts in society as a whole. New entertainment technology goes from revolutionary to normal quite rapidly. There were dire predictions made by the silent movie orchestras’ union trying to kill talkies, by film executives trying to kill television, and by television executives trying to kill the VCR. Once those technologies were in place, however, it was hard to remember what all the fuss was about. Though most of the writing about file sharing concentrates on the effects on the music industry, whatever new bargain is struck between musicians and listeners will almost certainly be unremarkable five years from now. The long-term effects of file sharing are elsewhere.

The music industry’s attempts to force digital data to behave like physical objects have had two profound effects, neither of them about music. The first is the progressive development of decentralized network models, loosely bundled together under the rubric of peer-to-peer. Though there were several versions of such architectures as early as the mid-90s, such as ICQ and SETI@Home, it took Napster to ignite general interest in this class of solutions.

And the second effect, of course, is the long-predicted and oft-delayed spread of encryption. The RIAA is succeeding where the Cypherpunks failed, convincing users to trade a broad but penetrable privacy for unbreakable anonymity under their personal control. In contrast to the Cypherpunks’ “eat your peas” approach, touting encryption as a first-order service users should work to embrace, encryption is now becoming a background feature of collaborative workspaces. Because encryption is becoming something that must run in the background, there is now an incentive to make its adoption as easy and transparent to the user as possible. It’s too early to say how widely casual encryption use will spread, but it isn’t too early to see that the shift is both profound and irreversible.

People will differ on the value of this change, depending on their feelings about privacy and their trust of the Government, but the effects of the increased use of encryption, and the subsequent difficulties for law enforcement in decrypting messages and files, will last far longer than the current transition to digital music delivery, and may in fact be the most important legacy of the current legal crackdown.

The Semantic Web, Syllogism, and Worldview

First published November 7, 2003 on the “Networks, Economics, and Culture” mailing list. 

The W3C’s Semantic Web project has been described in many ways over the last few years: an extension of the current web in which information is given well-defined meaning, a place where machines can analyze all the data on the Web, even a Web in which machine reasoning will be ubiquitous and devastatingly powerful. The problem with descriptions this general, however, is that they don’t answer the obvious question: What is the Semantic Web good for?

The simple answer is this: The Semantic Web is a machine for creating syllogisms. A syllogism is a form of logic, first described by Aristotle, where “…certain things being stated, something other than what is stated follows of necessity from their being so.” [Organon]

The canonical syllogism is: 

Humans are mortal
Greeks are human
Therefore, Greeks are mortal

with the third statement derived from the previous two.

The Semantic Web is made up of assertions, e.g. “The creator of shirky.com is Clay Shirky.” Given the two statements

– Clay Shirky is the creator of shirky.com
– The creator of shirky.com lives in Brooklyn

you can conclude that I live in Brooklyn, something you couldn’t know from either statement on its own. From there, other expressions that include Clay Shirky, shirky.com, or Brooklyn can be further coupled.
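To make the mechanics concrete, here is a minimal sketch of that kind of recombination in Python, using plain (subject, predicate, object) triples rather than any real RDF tooling; the single chaining rule is hard-coded for this example and is only meant to show the shape of the reasoning.

    # Two assertions, stated as (subject, predicate, object) triples.
    assertions = [
        ("Clay Shirky", "is the creator of", "shirky.com"),
        ("the creator of shirky.com", "lives in", "Brooklyn"),
    ]

    def derive_residences(triples):
        """If X created S, and the creator of S lives in P, conclude X lives in P."""
        derived = []
        for who, pred1, site in triples:
            if pred1 != "is the creator of":
                continue
            for subj, pred2, place in triples:
                if pred2 == "lives in" and subj == "the creator of " + site:
                    derived.append((who, "lives in", place))
        return derived

    print(derive_residences(assertions))
    # [('Clay Shirky', 'lives in', 'Brooklyn')]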

The Semantic Web specifies ways of exposing these kinds of assertions on the Web, so that third parties can combine them to discover things that are true but not specified directly. This is the promise of the Semantic Web — it will improve all the areas of your life where you currently use syllogisms.

Which is to say, almost nowhere.

Syllogisms are Not Very Useful

Though the syllogism has been around since Aristotle, it reached its apotheosis in the 19th century, in the work of Charles Dodgson (better known as Lewis Carroll.) Dodgson wrote two books of syllogisms and methods for representing them in graphic form, and his syllogisms often took the form of sorites, where the conclusion from one pair of linked assertions becomes a new assertion to be linked to others.

One of Dodgson’s sorites goes:

– Remedies for bleeding, which fail to check it, are a mockery
– Tincture of Calendula is not to be despised
– Remedies, which will check the bleeding when you cut your finger, are useful 
– All mock remedies for bleeding are despicable

which lets you conclude that Tincture of Calendula will check the bleeding when you cut your finger.

Despite their appealing simplicity, syllogisms don’t work well in the real world, because most of the data we use is not amenable to such effortless recombination. As a result, the Semantic Web will not be very useful either.

The people working on the Semantic Web greatly overestimate the value of deductive reasoning (a persistent theme in Artificial Intelligence projects generally.) The great popularizer of this error was Arthur Conan Doyle, whose Sherlock Holmes stories have done more damage to people’s understanding of human intelligence than anyone other than Rene Descartes. Doyle has convinced generations of readers that what seriously smart people do when they think is to arrive at inevitable conclusions by linking antecedent facts. As Holmes famously put it “when you have eliminated the impossible, whatever remains, however improbable, must be the truth.”

This sentiment is attractive precisely because it describes a world simpler than our own. In the real world, we are usually operating with partial, inconclusive or context-sensitive information. When we have to make a decision based on this information, we guess, extrapolate, intuit, we do what we did last time, we do what we think our friends would do or what Jesus or Joan Jett would have done, we do all of those things and more, but we almost never use actual deductive logic.

As a consequence, almost none of the statements we make, even seemingly obvious ones, are true in the way the Semantic Web needs them to be true. Drew McDermott, in his brilliant Critique of Pure Reason [Computational Intelligence, 3:151-237, 1987], took on the notion that you could create Artificial Intelligence by building a sufficiently detailed deductive scaffolding. He concluded that this approach was fatally flawed, noting that “It must be the case that a significant portion of the inferences we want [to make] are deductions, or it will simply be irrelevant how many theorems follow deductively from a given axiom set.” Though Critique of Pure Reason predates not just the Semantic Web but the actual web as well, the criticism still holds.

Consider the following statements:

– The creator of shirky.com lives in Brooklyn
– People who live in Brooklyn speak with a Brooklyn accent

You could conclude from this pair of assertions that the creator of shirky.com pronounces it “shoiky.com.” This, unlike assertions about my physical location, is false. It would be easy to shrug this error off as Garbage In, Garbage Out, but it isn’t so simple. The creator of shirky.com does live in Brooklyn, and some people who live in Brooklyn do speak with a Brooklyn accent, just not all of them (us).

Each of those statements is true, in other words, but each is true in a different way. It is tempting to note that the second statement is a generalization that can only be understood in context, but that way madness lies. Any requirement that a given statement be cross-checked against a library of context-giving statements, which would have still further context, would doom the system to death by scale.

We Describe The World In Generalities

We can’t disallow generalizations because we can’t know which statements are generalizations by looking at them. Even if we could, it wouldn’t help, because generalizations are a fundamental tool of expression. “People who live in France speak French” is structurally no different than “People who live in Brooklyn speak with a Brooklyn accent.” In any human context “People who live in France speak French” is true, but it is false if universals are required, as there are French immigrants and expatriates who don’t speak the language.

Syllogisms sound stilted in part because they traffic in absurd absolutes. Consider this gem from Dodgson:

– No interesting poems are unpopular among people of real taste 
– No modern poetry is free from affectation 
– All your poems are on the subject of soap-bubbles 
– No affected poetry is popular among people of real taste 
– No ancient poetry is on the subject of soap-bubbles

This, of course, allows you to conclude that all your poems are bad. 

This 5-line syllogism is the best critique of the Semantic Web ever published, as it illustrates the kind of world we would have to live in for this form of reasoning to work, a world where language is merely math done with words. Actual human expression must take into account the ambiguities of the real world, where people, even those with real taste, disagree about what is interesting or affected, and where no poets, even the most uninteresting, write all their poems about soap bubbles.

The Semantic Web’s Proposed Uses

Dodgson’s syllogisms actually demonstrate the limitations of the form, a pattern that could be called “proof of no concept”, where the absurdity of an illustrative example undermines the point being made. So it is with the Semantic Web. Consider the following, from the W3C’s own site:

Q: How do you buy a book over the Semantic Web?

A: You browse/query until you find a suitable offer to sell the book you want. You add information to the Semantic Web saying that you accept the offer and giving details (your name, shipping address, credit card information, etc). Of course you add it (1) with access control so only you and seller can see it, and (2) you store it in a place where the seller can easily get it, perhaps the seller’s own server, (3) you notify the seller about it. You wait or query for confirmation that the seller has received your acceptance, and perhaps (later) for shipping information, etc. [http://www.w3.org/2002/03/semweb/]

One doubts Jeff Bezos is losing sleep.

This example sets the pattern for descriptions of the Semantic Web. First, take some well-known problem. Next, misconstrue it so that the hard part is made to seem trivial and the trivial part hard. Finally, congratulate yourself for solving the trivial part.

All the actual complexities of matching readers with books are waved away in the first sentence: “You browse/query until you find a suitable offer to sell the book you want.” Who knew it was so simple? Meanwhile, the trivial operation of paying for it gets a lavish description designed to obscure the fact that once you’ve found a book for sale, using a credit card is a pretty obvious next move.

Consider another description of the Semantic Web that similarly misconstrues the problem:

Merging databases simply becomes a matter of recording in RDF somewhere that “Person Name” in your database is equivalent to “Name” in my database, and then throwing all of the information together and getting a processor to think about it. [http://infomesh.net/2001/swintro/]

No one who has ever dealt with merging databases would use the word ‘simply’. If making a thesaurus of field names were all there was to it, there would be no need for the Semantic Web; this process would work today. Contrariwise, to adopt a Lewis Carroll-ism, the use of hand-waving around the actual problem — human names are not globally unique — masks the triviality of linking Name and Person Name. Is your “Person Name = John Smith” the same person as my “Name = John Q. Smith”? Who knows? Not the Semantic Web. The processor could “think” about this til the silicon smokes without arriving at an answer.
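A small sketch of the gap being described, with made-up records: mapping the field names is a one-line dictionary, while deciding whether the two rows describe the same person is exactly the part no amount of schema mapping answers.

    # Illustrative only: two made-up records from two made-up databases.
    yours = {"Person Name": "John Smith", "City": "Springfield"}
    mine = {"Name": "John Q. Smith", "City": "Springfield"}

    field_map = {"Person Name": "Name"}   # the part RDF equivalence handles

    normalized = {field_map.get(k, k): v for k, v in yours.items()}
    assert set(normalized) == set(mine)   # the schemas now line up

    # The real question is untouched: same person, or two different John Smiths?
    print(normalized["Name"] == mine["Name"])   # False, and no processor can
                                                # "think" its way to the answer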

From time to time, proselytizers of the Semantic Web try to give it a human face:

For example, we may want to prove that Joe loves Mary. The way that we came across the information is that we found two documents on a trusted site, one of which said that “:Joe :loves :MJS”, and another of which said that “:MJS daml:equivalentTo :Mary”. We also got the checksums of the files in person from the maintainer of the site.

To check this information, we can list the checksums in a local file, and then set up some FOPL rules that say “if file ‘a’ contains the information Joe loves mary and has the checksum md5:0qrhf8q3hfh, then record SuccessA”, “if file ‘b’ contains the information MJS is equivalent to Mary, and has the checksum md5:0892t925h, then record SuccessB”, and “if SuccessA and SuccessB, then Joe loves Mary”. [http://infomesh.net/2001/swintro/]

You may want to read that second paragraph again, to savor its delicious mix of minutia and cluelessness.

Anyone who has ever been 15 years old knows that protestations of love, checksummed or no, are not to be taken at face value. And even if we wanted to take love out of this example, what would we replace it with? The universe of assertions that Joe might make about Mary is large, but the subset of those assertions that are universally interpretable and uncomplicated is tiny.

One final entry in the proof of no concept category:

Here’s an example: Let’s say one company decides that if someone sells more than 100 of our products, then they are a member of the Super Salesman club. A smart program can now follow this rule to make a simple deduction: “John has sold 102 things, therefore John is a member of the Super Salesman club.” [http://logicerror.com/semanticWeb-long]

This is perhaps the high water mark of presenting trivial problems as worthy of Semantic intervention: a program that can conclude that 102 is greater than 100 is labeled smart. Artificial Intelligence, here we come.

Meta-data is Not A Panacea

The Semantic Web runs on meta-data, and much meta-data is untrustworthy, for a variety of reasons that are not amenable to easy solution. (See, for example, Doctorow, Pilgrim, and Shirky.) Though at least some of this problem comes from people trying to game the system, the far larger problem is that even when people publish meta-data that they believe to be correct, we still run into trouble.

Consider the following assertions:

– Count Dracula is a Vampire 
– Count Dracula lives in Transylvania 
– Transylvania is a region of Romania 
– Vampires are not real

You can draw only one non-clashing conclusion from such a set of assertions — Romania isn’t real. That’s wrong, of course, but the wrongness is nowhere reflected in these statements. There is simply no way to cleanly separate fact from fiction, and this matters in surprising and subtle ways that relate to matters far more weighty than vampiric identity. Consider these assertions:

– US citizens are people 
– The First Amendment covers the rights of US citizens 
– Nike is protected by the First Amendment

You could conclude from this that Nike is a person, and of course you would be right. In the context of First Amendment law, corporations are treated as people. If, however, you linked this conclusion with a medical database, you could go on to reason that Nike’s kidneys move poisons from Nike’s bloodstream into Nike’s urine.

Ontology is Not A Requirement

Though proponents of the Semantic Web gamely try to illustrate simple uses for it, the kind of explanatory failures above are baked in, because the Semantic Web is divided between two goals, one good but unnecessary, the other audacious but doomed.

The first goal is simple: get people to use more meta-data. The Semantic Web was one of the earliest efforts to rely on the idea of XML as a common interchange format for data. With such a foundation, making formal agreements about the nature of whatever was being described — an ontology — seemed a logical next step.

Instead, it turns out that people can share data without having to share a worldview, so we got the meta-data without needing the ontology. Exhibit A in this regard is the weblog world. In a recent paper discussing the Semantic Web and weblogs, Matt Rothenberg details the invention and rapid spread of “RSS autodiscovery”, where an existing HTML tag was pressed into service as a way of automatically pointing to a weblog’s syndication feed.
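For readers who have not seen it, RSS autodiscovery amounts to a single <link> element in a page’s <head> pointing at the site’s feed. Here is a minimal Python sketch of finding it; the sample HTML and feed URL are made up.

    from html.parser import HTMLParser

    SAMPLE = """<html><head>
    <link rel="alternate" type="application/rss+xml"
          title="RSS" href="http://example.com/index.xml">
    </head><body>...</body></html>"""

    class FeedFinder(HTMLParser):
        def __init__(self):
            super().__init__()
            self.feeds = []

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            if (tag == "link" and a.get("rel") == "alternate"
                    and a.get("type") == "application/rss+xml"):
                self.feeds.append(a.get("href"))

    finder = FeedFinder()
    finder.feed(SAMPLE)
    print(finder.feeds)   # ['http://example.com/index.xml']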

About this process, which went from suggestion to implementation in mere days, Rothenberg says:

Granted, RSS autodiscovery was a relatively simplistic technical standard compared to the types of standards required for the environment of pervasive meta-data stipulated by the semantic web, but its adoption demonstrates an environment in which new technical standards for publishing can go from prototype to widespread utility extremely quickly. [PDF offline, cache here]

This, of course, is the standard Hail Mary play for anyone whose technology is caught on the wrong side of complexity. People pushing such technologies often make the “gateway drug” claim that rapid adoption of simple technologies is a precursor to later adoption of much more complex ones. Lotus claimed that simple internet email would eventually leave people clamoring for the more sophisticated features of CC:Mail (RIP), PointCast (also RIP) tried to label email a “push” technology so they would look like a next-generation tool rather than a dead-end, and so on.

Here Rothenberg follows the script to a tee, labeling RSS autodiscovery ‘simplistic’ without entertaining the idea that simplicity may be a requirement of rapid and broad diffusion. The real lesson of RSS autodiscovery is that developers can create valuable meta-data without needing any of the trappings of the Semantic Web. Were the whole effort to be shelved tomorrow, successes like RSS autodiscovery would not be affected in the slightest.

Artificial Intelligence Reborn

If the sole goal of the Semantic Web were pervasive markup, it would be nothing more than a “Got meta-data?” campaign — a generic exhortation for developers to do what they are doing anyway. The second, and larger goal, however, is to take up the old Artificial Intelligence project in a new context.

After 50 years of work, the performance of machines designed to think about the world the way humans do has remained, to put it politely, sub-optimal. The Semantic Web sets out to address this by reversing the problem. Since it’s hard to make machines think about the world, the new goal is to describe the world in ways that are easy for machines to think about.

Descriptions of the Semantic Web exhibit an inversion of trivial and hard issues because the core goal does as well. The Semantic Web takes for granted that many important aspects of the world can be specified in an unambiguous and universally agreed-on fashion, then spends a great deal of time talking about the ideal XML formats for those descriptions. This puts the stress on the wrong part of the problem — if the world were easy to describe, you could do it in Sanskrit.

Likewise, statements in the Semantic Web work as inputs to syllogistic logic not because syllogisms are a good way to deal with slippery, partial, or context-dependent statements — they are not, for the reasons discussed above — but rather because syllogisms are things computers do well. If the world can’t be reduced to unambiguous statements that can be effortlessly recombined, then it will be hard to rescue the Artificial Intelligence project. And that, of course, would be unthinkable.

Worldviews Differ For Good Reasons

Many networked projects, including things like business-to-business markets and Web Services, have started with the unobjectionable hypothesis that communication would be easier if everyone described things the same way. From there, it is a short but fatal leap to conclude that a particular brand of unifying description will therefore be broadly and swiftly adopted (the “this will work because it would be good if it did” fallacy.)

Any attempt at a global ontology is doomed to fail, because meta-data describes a worldview. The designers of the Soviet library’s cataloging system were making an assertion about the world when they made the first category of books “Works of the classical authors of Marxism-Leninism.” Melvil Dewey was making an assertion about the world when he lumped all books about non-Christian religions into a single category, listed last among books about religion. It is not possible to neatly map these two systems onto one another, or onto other classification schemes — they describe different kinds of worlds.

Because meta-data describes a worldview, incompatibility is an inevitable by-product of vigorous argument. It would be relatively easy, for example, to encode a description of genes in XML, but it would be impossible to get a universal standard for such a description, because biologists are still arguing about what a gene actually is. There are several competing standards for describing genetic information, and the semantic divergence is an artifact of a real conversation among biologists. You can’t get a standard til you have an agreement, and you can’t force an agreement to exist where none actually does.

Furthermore, attempts to enforce semantics on human situations end up debasing the semantics, rather than making the connection more informative. Social networking services like Friendster and LinkedIn assume that people will treat links to one another as external signals of deep association, so that the social mesh as represented by the software will be an accurate model of the real world. In fact, the concept of friend, or even the type and depth of connection required to say you know someone, is quite slippery, and as a result, links between people on Friendster have been drained of much of their intended meaning. Trying to express implicit and fuzzy relationships in ways that are explicit and sharp doesn’t clarify the meaning, it destroys it.

Worse is Better

In an echo of Richard Gabriel’s Worse is Better argument, the Semantic Web imagines that completeness and correctness of data exposed on the web are the cardinal virtues, and that any amount of implementation complexity is acceptable in pursuit of those virtues. The problem is that the more semantic consistency required by a standard, the sharper the tradeoff between complexity and scale. It’s easy to get broad agreement in a narrow group of users, or vice-versa, but not both.

The systems that have succeeded at scale have made simple implementation the core virtue, up the stack from Ethernet over Token Ring to the web over gopher and WAIS. The most widely adopted digital descriptor in history, the URL, regards semantics as a side conversation between consenting adults, and makes no requirements in this regard whatsoever: sports.yahoo.com/nfl/ is a valid URL, but so is 12.0.0.1/ftrjjk.ppq. The fact that a URL itself doesn’t have to mean anything is essential — the Web succeeded in part because it does not try to make any assertions about the meaning of the documents it contained, only about their location.

There is a list of technologies that are actually political philosophy masquerading as code, a list that includes Xanadu, Freenet, and now the Semantic Web. The Semantic Web’s philosophical argument — the world should make more sense than it does — is hard to argue with. The Semantic Web, with its neat ontologies and its syllogistic logic, is a nice vision. However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.

Much of the proposed value of the Semantic Web is coming, but it is not coming because of the Semantic Web. The amount of meta-data we generate is increasing dramatically, and it is being exposed for consumption by machines as well as, or instead of, people. But it is being designed a bit at a time, out of self-interest and without regard for global ontology. It is also being adopted piecemeal, and it will bring with it all the incompatibilities and complexities that implies. There are significant disadvantages to this process relative to the shining vision of the Semantic Web, but the big advantage of this bottom-up design and adoption is that it is actually working now.

Social Software and the Politics of Groups

First published March 9, 2003 on the “Networks, Economics, and Culture” mailing list.

Social software, software that supports group communications, includes everything from the simple CC: line in email to vast 3D game worlds like EverQuest, and it can be as undirected as a chat room, or as task-oriented as a wiki (a collaborative workspace). Because there are so many patterns of group interaction, social software is a much larger category than things like groupware or online communities — though it includes those things, not all group communication is business-focused or communal. One of the few commonalities in this big category is that social software is unique to the internet in a way that software for broadcast or personal communications is not.

Prior to the Web, we had hundreds of years of experience with broadcast media, from printing presses to radio and TV. Prior to email, we had hundreds of years of experience with personal media — the telegraph, the telephone. But outside the internet, we had almost nothing that supported conversation among many people at once. Conference calling was the best it got — cumbersome, expensive, real-time only, and useless for large groups. The social tools of the internet, lightweight though most of them are, have a kind of fluidity and ease of use that the conference call never attained: compare the effortlessness of CC:ing half a dozen friends to decide on a movie, versus trying to set up a conference call to accomplish the same task.

The radical change was de-coupling groups in space and time. To get a conversation going around a conference table or campfire, you need to gather everyone in the same place at the same moment. By undoing those restrictions, the internet has ushered in a host of new social patterns, from the mailing list to the chat room to the weblog.

The thing that makes social software behave differently than other communications tools is that groups are entities in their own right. A group of people interacting with one another will exhibit behaviors that cannot be predicted by examining the individuals in isolation, peculiarly social effects like flaming and trolling or concerns about trust and reputation. This means that designing software for group-as-user is a problem that can’t be attacked in the same way as designing a word processor or a graphics tool.

Our centuries of experience with printing presses and telegraphs have not prepared us for the design problems we face here. We have had real social software for less than forty years (dated from the Plato system), with less than a decade of general availability. We are still learning how to build and use the software-defined conference tables and campfires we’re gathering around.

Old Systems, Old Assumptions

When the internet was strange and new, we concentrated on its strange new effects. Earlier generations of social software, from mailing lists to MUDs, were created when the network’s population could be measured in the tens of thousands, not the hundreds of millions, and the users were mostly young, male, and technologically savvy. In those days, we convinced ourselves that immersive 3D environments and changing our personalities as often as we changed socks would be the norm.

That period, which ended with the rise of the Web in the early 1990s, was the last time the internet was a global village, and the software built for this environment typically made three assumptions about groups: they could be of any size; anyone should be able to join them; and the freedom of the individual is more important than the goals of the community.

The network is now a global metropolis, vast and heterogeneous, and in this environment groups need protection from too-rapid growth and from being hijacked by anything from off-topic conversations to spam. The communities that thrive in this metropolitan environment violate most or all of the earlier assumptions. Instead of unlimited growth, membership, and freedom, many of the communities that have done well have bounded size or strong limits to growth, non-trivial barriers to joining or becoming a member in good standing, and enforceable community norms that constrain individual freedoms. Forums that lack any mechanism for ejecting or controlling hostile users, especially those convened around contentious topics, have often broken down under the weight of users hostile to the conversation (viz usenet groups like soc.culture.african.american.)

Social Software Encodes Political Bargains

Social interaction creates a tension between the individual and the group. This is true of all social interaction, not just online. Consider, from your own life, that moment where you become bored with a dinner party or other gathering. You lose interest in the event, and then, having decided it is not for you, a remarkable thing happens: you don’t leave. For whatever reason, usually having to do with not wanting to be rude, your dedication to group norms overrides your particular boredom or frustration. This kind of tension between personal goals and group norms arises at some point in most groups.

Any system that supports groups addresses this tension by enacting a simple constitution — a set of rules governing the relationship between individuals and the group. These constitutions usually work by encouraging or requiring certain kinds of interaction, and discouraging or forbidding others. Even the most anarchic environments, where “Do as thou wilt” is the whole of the law, are making a constitutional statement. Social software is political science in executable form.

Different constitutions encode different bargains. Slashdot’s core principle, for example, is “No censorship”; anyone should be able to comment in any way on any article. Slashdot’s constitution (though it is not called that) specifies only three mechanisms for handling the tension between individual freedom to post irrelevant or offensive material, and the group’s desire to be able to find the interesting comments. The first is moderation, a way of convening a jury pool of members in good standing, whose function is to rank those posts by quality. The second is meta-moderation, a way of checking those moderators for bias, as a solution to the “Who will watch the watchers?” problem. And the third is karma, a way of defining who is a member in good standing. These three political concepts, lightweight as they are, allow Slashdot to grow without becoming unusable.
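
To make those three mechanisms concrete, here is a minimal sketch in Python. It is not Slashdot’s actual code; the names and the karma threshold are invented for illustration. It shows karma defining members in good standing, those members moderating comment scores, and meta-moderation feeding back into the moderators’ own karma.

    from dataclasses import dataclass, field

    KARMA_TO_MODERATE = 5  # hypothetical threshold for "member in good standing"

    @dataclass
    class User:
        name: str
        karma: int = 0

    @dataclass
    class Comment:
        author: User
        text: str
        score: int = 1
        moderations: list = field(default_factory=list)  # (moderator, delta) pairs

    def moderate(moderator: User, comment: Comment, delta: int) -> bool:
        """Members in good standing may nudge a comment's score up or down."""
        if moderator.karma < KARMA_TO_MODERATE or moderator is comment.author:
            return False
        comment.score += delta
        comment.author.karma += delta      # well-rated posts raise the author's standing
        comment.moderations.append((moderator, delta))
        return True

    def meta_moderate(comment: Comment, fair_votes: dict) -> None:
        """Other members rate past moderations as fair or unfair, which feeds
        back into each moderator's karma -- 'who will watch the watchers?'"""
        for moderator, _delta in comment.moderations:
            moderator.karma += 1 if fair_votes.get(moderator.name, True) else -1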

The network abounds with different political strategies: Kuro5hin’s distributed editorial function, LiveJournal’s invitation codes, MetaFilter’s closing off of user signups during population surges, Joel Spolsky’s design principles for the Joel on Software forum, and the historical reactions of earlier social spaces like LambdaMOO or Habitat to constitutional crises are all ways of responding to the fantastically complex behavior of groups. The variables include different effects at different scales (imagine the conversation at a dinner for 6, 60, and 600), the engagement of the users, and the degree to which participants feel themselves to be members of a group with formal goals.

Further complicating all of this are the feedback loops created when a group changes its behavior in response to changes in software. Because of these effects, designers of social software have more in common with economists or political scientists than they do with designers of single-user software, and operators of communal resources have more in common with politicians or landlords than with operators of ordinary web sites.

Testing Group Experience

Social software has progressed far less quickly than single-user software, in part because we have a much better idea of how to improve user experience than group experience, and a much better idea of how to design interfaces than constitutions. While word processors and graphics editors have gotten significantly better over the years, the features for mailing lists are not that different from the original LISTSERV program in 1985. In fact, most of the work on mailing list software has been around making it easier to set up and administer, rather than making it easier for the group using the software to accomplish anything.

We have lots of interesting examples of social software, from the original SF-LOVERS mailing list, which first appeared in the 1970s and outlived all the hardware of the network it launched on, to the Wikipedia, a giant community-created encyclopedia. Despite a wealth of examples, however, we don’t have many principles derived from those examples other than “No matter how much the administrators say it’s ‘for work’, people will bend communications tools to social uses” or “It sure is weird that the Wikipedia works.” We have historically overestimated the value of network access to computers, and underestimated the value of network access to other people, so we have spent much more time on the technical rather than social problems of software used by groups.

One fruitful question might be “How can we test good group experience?” Over the last several years, the importance of user experience, user testing, and user feedback have become obvious, but we have very little sense of group experience, group testing, or group feedback. If a group uses software that encourages constant forking of topics, so that conversations become endless and any given conversation peters out rather than being finished, each participant might enjoy the conversation, but the software may be harming the group goal by encouraging tangents rather than focus.

If a group has a goal, how can we understand the way the software supports that goal? This is a complicated question, not least because the conditions that foster good group work, such as a clear decision-making process, may well upset some of the individual participants. Most of our methods for soliciting user feedback assume, usually implicitly, that the individual’s reaction to the software is the critical factor. This tilts software and interface design towards single-user assumptions, even when the software’s most important user is a group.

Barriers

Another critical question: “What kind of barriers work best?” Most groups have some sort of barrier to group membership, which can be thought of as a membrane separating the group from the rest of the world. Sometimes it is as simple as the energy required to join a mailing list. Sometimes it is as complicated as getting a sponsor within the group, or acquiring a password or key. Sometimes the membrane is binary and at the edge of the group — you’re on the mailing list or not. Sometimes it’s graduated and internal, as with user identity and karma on Slashdot. Given the rich history we have with such social membranes, can we draw any general conclusions about their use by analyzing successes (or failures) in existing social software?
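
As a toy illustration of the two membrane shapes just described (the privilege names and thresholds below are made up, not drawn from any particular system), a binary membrane is a single yes/no check at the edge of the group, while a graduated, internal membrane ties what a member can do to their standing inside it:

    def binary_membrane(user: str, members: set) -> bool:
        """Mailing-list style: you are on the list or you are not."""
        return user in members

    def graduated_membrane(karma: int) -> list:
        """Slashdot style: everyone can read, but privileges accrue with standing."""
        privileges = ["read"]
        if karma >= 0:
            privileges.append("post")
        if karma >= 5:
            privileges.append("moderate")       # invented threshold
        if karma >= 20:
            privileges.append("meta-moderate")  # invented threshold
        return privileges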

There are thousands of other questions. Can we produce diagrams of social networks in real time, so the participants in a large group can be aware of conversational clusters as they are forming? What kind of feedback loops will this create? Will software that lets groups form with a pre-set dissolution date (“This conversation good until 08/01/2003.”) help groups focus? Can we do anything to improve the online environment for brainstorming? Negotiation? Decision making? Can Paypal be integrated into group software, so that groups can raise and disperse funds in order to pursue their goals? (Even Boy Scouts do this in the real world, but it’s almost unheard of online.) And so on.

The last time there was this much ferment around the idea of software to be used by groups was in the late 70s, when usenet, group chat, and MUDs were all invented in the space of 18 months. Now we’ve got blogs, wikis, RSS feeds, Trackback, XML-over-IM and all sorts of IM- and mail-bots. We’ve also got a network population that’s large, heterogeneous, and still growing rapidly. The conversations we can have about social software can be advanced by asking ourselves the right questions about both the software and the political bargains between users and the group that software will encode or enforce.

File-sharing Goes Social

First published October 12, 2003 on the “Networks, Economics, and Culture” mailing list.

The RIAA has taken us on a tour of networking strategies in the last few years, by constantly changing the environment file-sharing systems operate in. In hostile environments, organisms often adapt to become less energetic but harder to kill, and so it is now. With the RIAA’s waves of legal attacks driving experimentation with decentralized file-sharing tools, file-sharing networks have progressively traded efficiency for resistance to legal attack.

The RIAA has slowly altered the environment so that relatively efficient systems like Napster were killed, opening up a niche for more decentralized systems like Gnutella and Kazaa. With their current campaign against Kazaa in full swing, we are about to see another shift in network design, one that will have file sharers adopting tools originally designed for secure collaboration in a corporate setting.

Napster’s problem, of course, was that although Napster nodes acted as both client and server, the central database still gave the RIAA a single target. Seeing this, Gnutella and Kazaa shifted to a mesh of nodes that could each act as client, server, and router. These networks are self-assembling and self-reconfiguring with a minimum of bootstrapping, and decentralize even addresses and pointers to files.

The RIAA is now attacking these networks using a strategy that could be called Crush the Connectors. A number of recent books on networks, such as Gladwell’s The Tipping Point, Barabasi’s Linked, and Watts’ Six Degrees, have noted that large, loosely connected networks derive their effectiveness from a small number of highly connected nodes, a pattern called a Small World network. As a result, random attacks, even massive ones, typically leave the network only modestly damaged.

The flipside is that attacks that specifically target the most connected nodes are disproportionately effective. The RIAA’s Crush the Connectors strategy will work, not simply because highly publicized legal action will deter some users, but because the value of the system will decay badly if the RIAA succeeds in removing even a small number of the best-provisioned nodes.

However, it will not work as well as the RIAA wants, even ignoring the public relations fallout, for two reasons. The first is that combining client, server, and router in one piece of software is not the last move available to network designers — there is still the firewall. And the second is simply the math of popular music — there are more people than songs.

Networks, Horizons, and Membranes

Napster was the last file-sharing system that was boundary-less by design. There was, at least in theory, one Napster universe at any given moment, and it was globally searchable. Gnutella, Kazaa, and other similar systems set out to decentralize even the address and search functions. This made these systems more robust in the face of legal challenges, but added an internal limit — the search horizon. 

Since such systems have no central database, they relay requests through the system from one node to the next. However, the “Ask two friends to ask two friends ad infinitum” search method can swamp the system. As a result, these systems usually limit the spread of search requests, creating an internal horizon. The tradeoff here is between the value of any given search (deeper searches are more effective) vs the load on the system as a whole (shallower searches reduce communications overhead.) In a world where the RIAA’s attack mode was to go after central resources, this tradeoff worked well — efficient enough, and resistant to Napster-style lawsuits. 
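
A rough way to see that tradeoff is to simulate it. The sketch below is not Gnutella’s or Kazaa’s actual protocol; the network size, fan-out, and catalog are invented. It shows the basic mechanic: each query is relayed to neighbors until a time-to-live counter runs out, so a deeper horizon finds more files at the cost of many more messages.

    import random

    def build_network(num_nodes=1000, fanout=4, songs_per_node=10, catalog=5000):
        """A random mesh; each node holds a few randomly chosen song IDs."""
        return {
            n: {
                "neighbors": random.sample([m for m in range(num_nodes) if m != n], fanout),
                "songs": set(random.sample(range(catalog), songs_per_node)),
            }
            for n in range(num_nodes)
        }

    def search(nodes, start, song_id, ttl):
        """Relay the query outward until the TTL expires; count messages sent."""
        frontier, seen, messages, found = {start}, {start}, 0, False
        for _ in range(ttl):
            next_frontier = set()
            for node in frontier:
                if song_id in nodes[node]["songs"]:
                    found = True
                for neighbor in nodes[node]["neighbors"]:
                    messages += 1
                    if neighbor not in seen:
                        seen.add(neighbor)
                        next_frontier.add(neighbor)
            frontier = next_frontier
        return found, messages

    if __name__ == "__main__":
        net = build_network()
        for ttl in (2, 4, 6):
            results = [search(net, 0, random.randrange(5000), ttl) for _ in range(200)]
            hit_rate = sum(found for found, _ in results) / len(results)
            avg_msgs = sum(msgs for _, msgs in results) / len(results)
            print(f"TTL {ttl}: {hit_rate:.0%} of searches succeed, ~{avg_msgs:.0f} messages each")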

However, these systems are themselves vulnerable in two ways — first, anything that reduces the number of songs inside any given user’s search horizon reduces the value of the system, causing some users to defect, which weakens the system still further. Second, because search horizons are only perceptual borders, the activity of the whole network can be observed by a determined attacker running multiple nodes as observation points. The RIAA is relying on both weaknesses in its current attack.

By working to remove those users who make a large number of files persistently available, the RIAA can limit the amount of accessible music and the trust the average user has in the system. Many of the early reports on the Crush the Connectors strategy suggest that users are not just angry with the RIAA, but with Kazaa as well, for failing to protect them.

The very fact that Crush the Connectors is an attack on trustworthiness, however, points to one obvious reaction: move from a system with search horizons to one with real membranes, and make those membranes social as well as technological.

Trust as a Border 

There are several activities that are both illegal and popular, and these suffer from what economists call high transaction costs. Buying marijuana involves considerably more work than buying roses, in part because every transaction involves risk for both parties, and in part because neither party can rely on the courts for redress from unfair transactions. As a result, the market for marijuana today (or NYC tattoo artists in the 1980s, or gin in the 1920s, etc) involves trusted intermediaries who broker introductions. 

These intermediaries act as a kind of social Visa system; in the same way a credit card issuer has a relationship with both buyer and seller, and an incentive to see that transactions go well, an introducer in an illegal transaction has an incentive to make sure that neither side defects from the transaction. And all parties, of course, have an incentive to avoid detection.

This is a different kind of border than a search horizon. Instead of being able to search for resources a certain topological distance from you, you search for resources a certain social distance from you. (This is also the guiding principle behind services like LinkedIn and Friendster, though in practice they represent their users’ networks as being much larger than real-world social boundaries are.)

Such a system would add a firewall of sorts to the client, server, and router functions of existing systems, and that firewall would serve two separate but related needs. It would make the shared space inaccessible to new users without some sort of invitation from existing users, and it would likewise make all activity inside the space unobservable to the outside world. 

Though the press is calling such systems “darknets” and intimating that they are the work of some sort of internet underground, those two requirements — controlled membership and encrypted file transfer — actually describe business needs better than consumer needs. 

There are many ways to move to such membrane-bounded systems, of course, including retrofitting existing networks to allow sub-groups with controlled membership (possibly using email white-list or IM buddy-list tools); adopting any of the current peer-to-peer tools designed for secure collaboration (e.g. Groove, Shinkuro, WASTE, etc.); or even going to physical distribution. As Andrew Odlyzko has pointed out, sending disks through the mail can move enough bits in a 24 hour period to qualify as broadband, and there are now file-sharing networks whose members simply snail mail one another mountable drives of music.

A critical factor here is the social fabric — as designers of secure networks know, protecting the perimeter of a network only works if the people inside the perimeter are trustworthy. New entrants can only be let into such a system if they are somehow vetted or vouched for, and the existing members must have something at stake in the behavior of the new arrivals. 

The disadvantage of social sharing is simple — limited membership means fewer files. The advantage is equally simple — a socially bounded system is more effective than nothing, and safer than Kazaa. 

If Kazaa, Gnutella and others are severely damaged by the Crush the Connectors attack, users will either give up free file-sharing, or switch to less efficient social spaces. This might seem like an unalloyed win for the RIAA, but for one inconvenient fact: there are more people than songs.

There Are More People Than Songs

For the sake of round numbers, assume there are 500 million people using the internet today, and that much of the world’s demand for popular music would be satisfied by the availability of something like 5 million individual songs (Apple’s iTunes, by way of comparison, is a twentieth of that size.) Because people outnumber songs, if every user had one MP3 each, there would be an average of a hundred copies of every song somewhere online. A more realistic accounting would assume that at least 10% of the online population had at least 10 MP3 files each, numbers that are both underestimates, given the popularity of both ripping and sharing music.
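
Spelled out, the arithmetic looks like this (both scenarios are the assumptions from the paragraph above, not measurements):

    online_population = 500_000_000
    distinct_songs = 5_000_000

    # Scenario 1: every user shares exactly one MP3.
    print(online_population / distinct_songs)   # 100.0 copies of each song, on average

    # Scenario 2: only 10% of users share, but each shares 10 files.
    files_online = (online_population * 0.10) * 10
    print(files_online / distinct_songs)        # still 100.0 copies per song, on average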

Worse for the RIAA, the popularity of songs is wildly unequal. Some songs — The Real Slim Shady, Come Away With Me — exist on millions of hard drives around the world. As we’ve moved from more efficient systems like Napster to less efficient ones like Kazaa, it has become considerably harder to find bluegrass, folk, or madrigals, but not that much harder to find songs by Britney, 50 Cent, or John Mayer. And as with the shift from Napster to Kazaa, the shift from Kazaa to socially-bounded systems will have the least significant effect on the most popular music.

The worst news of all, though, is that songs are not randomly distributed. Instead, user clusters are a good predictor of shared taste. Make two lists, one of your favorite people and another of your favorite songs. What percentage of those songs could you copy from those people?

Both of those lists are probably in the dozens at most, and if music were randomly distributed, getting even a few of your favorite songs from your nearest and dearest would be a rare occurrence. As it is, though, you could probably get a significant percentage of your favorite songs from your favorite people. Systems that rely on small groups of users known to one another, trading files among themselves, will be less efficient than Kazaa or Napster, but far more efficient than a random distribution of music would suggest.
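
A back-of-the-envelope check of why a random distribution would make that a rare occurrence; the list sizes here are hypothetical, chosen only to be “in the dozens at most”:

    favorite_songs = 50        # your list of favorite songs
    friends = 30               # your list of favorite people
    songs_per_friend = 100     # files each of them happens to have
    catalog = 5_000_000        # distinct songs in circulation

    # If collections were random, the expected number of your favorites
    # found anywhere among all your friends' files:
    expected = friends * songs_per_friend * (favorite_songs / catalog)
    print(expected)            # 0.03 -- effectively none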

What Happens Next?

Small amounts of social file-sharing, by sending files as email attachments or uploading them to personal web servers, have always co-existed with the purpose-built file-sharing networks, but the two patterns may fuse as a result of the Crush the Connectors strategy. If that transition happens on a large scale, what might the future look like?

Most file-sharing would go on in groups from a half dozen to a few dozen — small enough that every member can know every other member by reputation. Most file-sharing would take place in the sorts of encrypted workspaces designed for business but adapted for this sort of social activity. Some users would be members of more than one space, thus linking several cells of users. The system would be far less densely interconnected than Kazaa or Gnutella are today, but would be more tightly connected than a simple set of social cells operating in isolation.

It’s not clear whether this would be good news or bad news for the RIAA. There are obviously several reasons to think it might be bad news: file-sharing would take place in spaces that would be much harder to inspect or penetrate; the lowered efficiency would also mean fewer high-yield targets for legal action; and the use of tools by groups that knew one another might make prosecution more difficult, because copyright law has often indemnified some types of non-commercial sharing among friends (e.g. the Audio Home Recording Act of 1992).

There is also good news that could come from such social sharing systems, however. Reduced efficiency might send many users into online stores, and users seeking hot new songs might be willing to buy them online rather than wait for the files to arrive through social diffusion, which would effectively turn at least some of these groups into buyers clubs.

The RIAA’s reaction to such social sharing will be unpredictable. They have little incentive to seek solutions that don’t try to make digital files behave like physical objects. They may therefore reason that they have little to lose by attacking social sharing systems with a vengeance. Whatever their reaction, however, it is clear that the current environment favors the development and adoption of social and collaborative tools, which will go on to have effects well outside the domain of file-sharing, because once a tool is adopted for one purpose, it often takes on a life of its own, as its users press such social tools to new uses.

Fame vs Fortune: Micropayments and Free Content

First published September 5, 2003 on the “Networks, Economics, and Culture” mailing list.

Micropayments, small digital payments of between a quarter and a fraction of a penny, made (yet another) appearance this summer with Scott McCloud’s online comic, The Right Number, accompanied by predictions of a rosy future for micropayments.

To read The Right Number, you have to sign up for the BitPass micropayment system; once you have an account, the comic itself costs 25 cents.

BitPass will fail, as FirstVirtual, Cybercoin, Millicent, Digicash, Internet Dollar, Pay2See, and many others have in the decade since Digital Silk Road, the paper that helped launch interest in micropayments. These systems didn’t fail because of poor implementation; they failed because the trend towards freely offered content is an epochal change, to which micropayments are a pointless response.

The failure of BitPass is not terribly interesting in itself. What is interesting is the way the failure of micropayments, both past and future, illustrates the depth and importance of putting publishing tools in the hands of individuals. In the face of a force this large, user-pays schemes can’t simply be restored through minor tinkering with payment systems, because they don’t address the cause of that change — a huge increase in the power and reach of the individual creator.

Why Micropayment Systems Don’t Work

The people pushing micropayments believe that the dollar cost of goods is the thing most responsible for deflecting readers from buying content, and that a reduction in price to micropayment levels will allow creators to begin charging for their work without deflecting readers.

This strategy doesn’t work, because the act of buying anything, even if the price is very small, creates what Nick Szabo calls mental transaction costs, the energy required to decide whether something is worth buying or not, regardless of price. The only business model that delivers money from sender to receiver with no mental transaction costs is theft, and in many ways, theft is the unspoken inspiration for micropayment systems.

Like the salami slicing exploit in computer crime, micropayment believers imagine that such tiny amounts of money can be extracted from the user that they will not notice, while the overall volume will cause these payments to add up to something significant for the recipient. But of course the users do notice, because they are being asked to buy something. Mental transaction costs create a minimum level of inconvenience that cannot be removed simply by lowering the dollar cost of goods.

Worse, beneath a certain threshold, mental transaction costs actually rise, a phenomenon that is especially significant for information goods. It’s easy to think a newspaper is worth a dollar, but is each article worth half a penny? Is each word worth a thousandth of a penny? A newspaper, exposed to the logic of micropayments, becomes impossible to value.

If you want to feel mental transaction costs in action, sign up for the $3 version of BitPass, then survey the content on offer. Would you pay 25 cents to view a VR panorama of the Matterhorn? Are Powerpoint slides on “Ten reasons why now is a great time to start a company?” worth a dime? (and if so, would each individual reason be worth a penny?)

Mental transaction costs help explain the general failure of micropayment systems. (See Odlyzko, Shirky, and Szabo for a fuller accounting of the weaknesses of micropayments.) The failure of micropayments in turn helps explain the ubiquity of free content on the Web.

Fame vs Fortune and Free Content

Analog publishing generates per-unit costs — each book or magazine requires a certain amount of paper and ink, and creates storage and transportation costs. Digital publishing doesn’t. Once you have a computer and internet access, you can post one weblog entry or one hundred, for ten readers or ten thousand, without paying anything per post or per reader. In fact, dividing up front costs by the number of readers means that content gets cheaper as it gets more popular, the opposite of analog regimes.

The fact that digital content can be distributed for no additional cost does not explain the huge number of creative people who make their work available for free. After all, they are still investing their time without being paid back. Why?

The answer is simple: creators are not publishers, and putting the power to publish directly into their hands does not make them publishers. It makes them artists with printing presses. This matters because creative people crave attention in a way publishers do not. Prior to the internet, this didn’t make much difference. The expense of publishing and distributing printed material is too great for it to be given away freely and in unlimited quantities — even vanity press books come with a price tag. Now, however, a single individual can serve an audience in the hundreds of thousands, as a hobby, with nary a publisher in sight.

This disrupts the old equation of “fame and fortune.” For an author to be famous, many people had to have read, and therefore paid for, his or her books. Fortune was a side-effect of attaining fame. Now, with the power to publish directly in their hands, many creative people face a dilemma they’ve never had before: fame vs fortune.

Substitutability and the Deflection of Use

The fame vs fortune choice matters because of substitutability, the willingness to accept one thing as a substitute for another. Substitutability is neutralized in perfect markets. For example, if someone has even a slight preference for Pepsi over Coke, and if both are always equally available in all situations, that person will never drink a Coke, despite being only mildly biased.

The soft-drink market is not perfect, but the Web comes awfully close: If InstaPundit and Samizdata are both equally easy to get to, the relative traffic to the sites will always match audience preference. But were InstaPundit to become less easy to get to, Samizdata would become a more palatable substitute. Any barrier erodes the user’s preferences, and raises their willingness to substitute one thing for another.

This is made worse by the asymmetry between the author’s motivation and the reader’s. While the author has one particular thing they want to write, the reader is usually willing to read anything interesting or relevant to their interests. Though each piece of written material is unique, the universe of possible choices for any given reader is so vast that uniqueness is not a rare quality. Thus any barrier to a particular piece of content (even, as the usability people will tell you, making it one click further away) will deflect at least some potential readers.

Charging, of course, creates just such a barrier. The fame vs fortune problem exists because the web makes it possible to become famous without needing a publisher, and because any attempt to derive fortune directly from your potential audience lowers the size of that audience dramatically, as the added cost encourages them to substitute other, free sources of content.

Free is a Stable Strategy

For a creator more interested in attention than income, free makes sense. In a regime where most of the participants are charging, freeing your content gives you a competitive advantage. And, as the drunks say, you can’t fall off the floor. Anyone offering content free gains an advantage that can’t be beaten, only matched, because the competitive answer to free — “I’ll pay you to read my weblog!” — is unsupportable over the long haul.

Free content is thus what biologists call an evolutionarily stable strategy. It is a strategy that works well when no one else is using it — it’s good to be the only person offering free content. It’s also a strategy that continues to work if everyone is using it, because in such an environment, anyone who begins charging for their work will be at a disadvantage. In a world of free content, even the moderate hassle of micropayments greatly damages user preference, and increases their willingness to accept free material as a substitute.

Furthermore, the competitive edge of free content is increasing. In the 90s, as the threat the Web posed to traditional publishers became obvious, it was widely believed that people would still pay for filtering. As the sheer volume of free content increased, the thinking went, finding the good stuff, even if it was free, would be worth paying for because it would be so hard to find.

In fact, the good stuff is becoming easier to find as the size of the system grows, not harder, because collaborative filters like Google and Technorati rely on rich link structure to sort through links. So offering free content is not just an evolutionarily stable strategy, it is a strategy that improves with time, because the more free content there is, the greater the advantage it has over for-fee content.

The Simple Economics of Content

People want to believe in things like micropayments because without a magic bullet to believe in, they would be left with the uncomfortable conclusion that what seems to be happening — free content is growing in both amount and quality — is what’s actually happening.

The economics of content creation are in fact fairly simple. The two critical questions are “Does the support come from the reader, or from an advertiser, patron, or the creator?” and “Is the support mandatory or voluntary?”

The internet adds no new possibilities. Instead, it simply shifts both answers strongly to the right. It makes all user-supported schemes harder, and all subsidized schemes easier. It likewise makes collecting fees harder, and soliciting donations easier. And these effects are multiplicative. The internet makes collecting mandatory user fees much harder, and makes voluntary subsidy much easier.

Weblogs, in particular, represent a huge victory for voluntarily subsidized content. The weblog world is driven by a million creative people, driven to get the word out, willing to donate their work, and unhampered by the costs of xeroxing, ink, or postage. Given the choice of fame vs fortune, many people will prefer a large audience and no user fees to a small audience and tiny user fees. This is not to say that creators cannot be paid for their work, merely that mandatory user fees are far less effective than voluntary donations, sponsorship, or advertising.

Because information is hard to value in advance, for-fee content will almost invariably be sold on a subscription basis, rather than per piece, to smooth out the variability in value. Individual bits of content that are even moderately close in quality to what is available free, but wrapped in the mental transaction costs of micropayments, are doomed to be both obscure and unprofitable.

What’s Next?

This change in the direction of free content is strongest for the work of individual creators, because an individual can produce material on any schedule they like. It is also strongest for publication of words and images, because these are the techniques most easily mastered by individuals. As creative work in groups creates a good deal of organizational hassle and often requires a particular mix of talents, it remains to be seen how strong the movement towards free content will be for endeavors like music or film.

However, the trends are towards easier collaboration, and still more power to the individual. The open source movement has demonstrated that even phenomenally complex systems like Linux can be developed through distributed volunteer labor, and software like Apple’s iMovie allows individuals to do work that once required a team. So while we don’t know what ultimate effect the economics of free content will be on group work, we do know that the barriers to such free content are coming down, as they did with print and images when the Web launched.

The interesting questions regarding free content, in other words, have nothing to do with bland “End of Free” predictions, or unimaginative attempts at restoring user-pays regimes. The interesting questions are how far the power of the creator to publish their own work is going to go, how much those changes will be mirrored in group work, and how much better collaborative filters will become in locating freely offered material. While we don’t know what the end state of these changes will be, we do know that the shift in publishing power is epochal and accelerating.

A Group Is Its Own Worst Enemy

A speech at ETech, April, 2003; published July 1, 2003 on the “Networks, Economics, and Culture” mailing list.

This is a lightly edited version of the keynote I gave on Social Software at the O’Reilly Emerging Technology conference in Santa Clara on April 24, 2003.

Good morning, everybody. I want to talk this morning about social software …there’s a surprise. I want to talk about a pattern I’ve seen over and over again in social software that supports large and long-lived groups. And that pattern is the pattern described in the title of this talk: “A Group Is Its Own Worst Enemy.”

In particular, I want to talk about what I now think is one of the core challenges for designing large-scale social software. Let me offer a definition of social software, because it’s a term that’s still fairly amorphous. My definition is fairly simple: It’s software that supports group interaction. I also want to emphasize, although that’s a fairly simple definition, how radical that pattern is. The Internet supports lots of communications patterns, principally point-to-point and two-way, one-to-many outbound, and many-to-many two-way.

Prior to the Internet, we had lots of patterns that supported point-to-point two-way. We had telephones, we had the telegraph. We were familiar with technological mediation of those kinds of conversations. Prior to the Internet, we had lots of patterns that supported one-way outbound. I could put something on television or the radio, I could publish a newspaper. We had the printing press. So although the Internet does good things for those patterns, they’re patterns we knew from before.

Prior to the Internet, the last technology that had any real effect on the way people sat down and talked together was the table. There was no technological mediation for group conversations. The closest we got was the conference call, which never really worked right — “Hello? Do I push this button now? Oh, shoot, I just hung up.” It’s not easy to set up a conference call, but it’s very easy to email five of your friends and say “Hey, where are we going for pizza?” So ridiculously easy group forming is really news.

We’ve had social software for 40 years at most, dated from the Plato BBS system, and we’ve only had 10 years or so of widespread availability, so we’re just finding out what works. We’re still learning how to make these kinds of things.

Now, software that supports group interaction is a fundamentally unsatisfying definition in many ways, because it doesn’t point to a specific class of technology. If you look at email, it obviously supports social patterns, but it can also support a broadcast pattern. If I’m a spammer, I’m going to mail things out to a million people, but they’re not going to be talking to one another, and I’m not going to be talking to them — spam is email, but it isn’t social. If I’m mailing you, and you’re mailing me back, we’re having point-to-point and two-way conversation, but not one that creates group dynamics.

So email doesn’t necessarily support social patterns, group patterns, although it can. Ditto a weblog. If I’m Glenn Reynolds, and I’m publishing something with Comments Off and reaching a million users a month, that’s really broadcast. It’s interesting that I can do it as a single individual, but the pattern is closer to MSNBC than it is to a conversation. If it’s a cluster of half a dozen LiveJournal users, on the other hand, talking about their lives with one another, that’s social. So, again, weblogs are not necessarily social, although they can support social patterns.

Nevertheless, I think that definition is the right one, because it recognizes the fundamentally social nature of the problem. Groups are a run-time effect. You cannot specify in advance what the group will do, and so you can’t substantiate in software everything you expect to have happen.

Now, there’s a large body of literature saying “We built this software, a group came and used it, and they began to exhibit behaviors that surprised us enormously, so we’ve gone and documented these behaviors.” Over and over and over again this pattern comes up. (I hear Stewart [Brand, of the WELL] laughing.) The WELL is one of those places where this pattern came up over and over again.

This talk is in three parts. The best explanation I have found for the kinds of things that happen when groups of humans interact is psychological research that predates the Internet, so the first part is going to be about W.R. Bion’s research, which I will talk about in a moment, research that I believe explains how and why a group is its own worst enemy.

The second part is: Why now? What’s going on now that makes this worth thinking about? I think we’re seeing a revolution in social software in the current environment that’s really interesting. 

And third, I want to identify some things, about half a dozen things, in fact, that I think are core to any software that supports larger, long-lived groups. 

Part One: How is a group its own worst enemy?

So, Part One. The best explanation I have found for the ways in which this pattern establishes itself, the group is its own worst enemy, comes from a book by W.R. Bion called “Experiences in Groups,” written in the middle of the last century.

Bion was a psychologist who was doing group therapy with groups of neurotics. (Drawing parallels between that and the Internet is left as an exercise for the reader.) The thing that Bion discovered was that the neurotics in his care were, as a group, conspiring to defeat therapy. 

There was no overt communication or coordination. But he could see that whenever he would try to do anything that was meant to have an effect, the group would somehow quash it. And he was driving himself crazy, in the colloquial sense of the term, trying to figure out whether or not he should be looking at the situation as: Are these individuals taking action on their own? Or is this a coordinated group?

He could never resolve the question, and so he decided that the unresolvability of the question was the answer. To the question: Do you view groups of people as aggregations of individuals or as a cohesive group, his answer was: “Hopelessly committed to both.”

He said that humans are fundamentally individual, and also fundamentally social. Every one of us has a kind of rational decision-making mind where we can assess what’s going on and make decisions and act on them. And we are all also able to enter viscerally into emotional bonds with other groups of people that transcend the intellectual aspects of the individual. 

In fact, Bion was so convinced that this was the right answer that the image he put on the front cover of his book was a Necker cube, one of those cubes that you can look at and make resolve in one of two ways, but you can never see both views of the cube at the same time. So groups can be analyzed both as collections of individuals and as having this kind of emotive group experience.

Now, it’s pretty easy to see how groups of people who have formal memberships, groups that have been labeled and named like “I am a member of such-and-such a guild in a massively multi-player online role-playing game,” it’s easy to see how you would have some kind of group cohesion there. But Bion’s thesis is that this effect is much, much deeper, and kicks in much, much sooner than many of us expect. So I want to illustrate this with a story, and to illustrate the illustration, I’ll use a story from your life. Because even if I don’t know you, I know what I’m about to describe has happened to you. 

You are at a party, and you get bored. You say “This isn’t doing it for me anymore. I’d rather be someplace else. I’d rather be home asleep. The people I wanted to talk to aren’t here.” Whatever. The party fails to meet some threshold of interest. And then a really remarkable thing happens: You don’t leave. You make a decision “I don’t like this.” If you were in a bookstore and you said “I’m done,” you’d walk out. If you were in a coffee shop and said “This is boring,” you’d walk out.

You’re sitting at a party, you decide “I don’t like this; I don’t want to be here.” And then you don’t leave. That kind of social stickiness is what Bion is talking about. 

And then, another really remarkable thing happens. Twenty minutes later, one person stands up and gets their coat, and what happens? Suddenly everyone is getting their coats on, all at the same time. Which means that everyone had decided that the party was not for them, and no one had done anything about it, until finally this triggering event let the air out of the group, and everyone kind of felt okay about leaving.

This effect is so steady it’s sometimes called the paradox of groups. It’s obvious that there are no groups without members. But what’s less obvious is that there are no members without a group. Because what would you be a member of? 

So there’s this very complicated moment of a group coming together, where enough individuals, for whatever reason, sort of agree that something worthwhile is happening, and the decision they make at that moment is: This is good and must be protected. And at that moment, even if it’s subconscious, you start getting group effects. And the effects that we’ve seen come up over and over and over again in online communities.

Now, Bion decided that what he was watching with the neurotics was the group defending itself against his attempts to make the group do what they said they were supposed to do. The group was convened to get better, this group of people was in therapy to get better. But they were defeating that. And he said, there are some very specific patterns that they’re entering into to defeat the ostensible purpose of the group meeting together. And he detailed three patterns.

The first is sex talk, what he called, in his mid-century prose, “A group met for pairing off.” And what that means is, the group conceives of its purpose as the hosting of flirtatious or salacious talk or emotions passing between pairs of members.

You go on IRC and you scan the channel list, and you say “Oh, I know what that group is about, because I see the channel label.” And you go into the group, you will also almost invariably find that it’s about sex talk as well. Not necessarily overt. But that is always in scope in human conversations, according to Bion. That is one basic pattern that groups can always devolve into, away from the sophisticated purpose and towards one of these basic purposes.

The second basic pattern that Bion detailed: The identification and vilification of external enemies. This is a very common pattern. Anyone who was around the Open Source movement in the mid-Nineties could see this all the time. If you cared about Linux on the desktop, there was a big list of jobs to do. But you could always instead get a conversation going about Microsoft and Bill Gates. And people would start bleeding from their ears, they would get so mad.

If you want to make it better, there’s a list of things to do. It’s Open Source, right? Just fix it. “No, no, Microsoft and Bill Gates grrrrr …”, the froth would start coming out. The external enemy — nothing causes a group to galvanize like an external enemy.

So even if someone isn’t really your enemy, identifying them as an enemy can cause a pleasant sense of group cohesion. And groups often gravitate towards members who are the most paranoid and make them leaders, because those are the people who are best at identifying external enemies.

The third pattern Bion identified: Religious veneration. The nomination and worship of a religious icon or a set of religious tenets. The religious pattern is, essentially, we have nominated something that’s beyond critique. You can see this pattern on the Internet any day you like. Go onto a Tolkien newsgroup or discussion forum, and try saying “You know, The Two Towers is a little dull. I mean loooong. We didn’t need that much description about the forest, because it’s pretty much the same forest all the way.”

Try having that discussion. On the door of the group it will say: “This is for discussing the works of Tolkien.” Go in and try and have that discussion.

Now, in some places people say “Yes, but it needed to, because it had to convey the sense of lassitude,” or whatever. But in most places you’ll simply be flamed to high heaven, because you’re interfering with the religious text.

So these are human patterns that have shown up on the Internet, not because of the software, but because it’s being used by humans. Bion has identified this possibility of groups sandbagging their sophisticated goals with these basic urges. And what he finally came to, in analyzing this tension, is that group structure is necessary. Robert’s Rules of Order are necessary. Constitutions are necessary. Norms, rituals, laws, the whole list of ways that we say, out of the universe of possible behaviors, we’re going to draw a relatively small circle around the acceptable ones.

He said the group structure is necessary to defend the group from itself. Group structure exists to keep a group on target, on track, on message, on charter, whatever. To keep a group focused on its own sophisticated goals and to keep a group from sliding into these basic patterns. Group structure defends the group from the action of its own members. 

In the Seventies — this is a pattern that’s shown up on the network over and over again — in the Seventies, a BBS called Communitree launched, one of the very early dial-up BBSes. This was launched when people didn’t own computers, institutions owned computers.

Communitree was founded on the principles of open access and free dialogue. “Communitree” — the name just says “California in the Seventies.” And the notion was, effectively, throw off structure and new and beautiful patterns will arise.

And, indeed, as anyone who has put discussion software into groups that were previously disconnected has seen, that does happen. Incredible things happen. The early days of Echo, the early days of usenet, the early days of Lucasfilm’s Habitat, over and over again, you see all this incredible upwelling of people who suddenly are connected in ways they weren’t before.

And then, as time sets in, difficulties emerge. In this case, one of the difficulties was occasioned by the fact that one of the institutions that got hold of some modems was a high school. And who, in 1978, was hanging out in the room with the computer and the modems in it, but the boys of that high school. And the boys weren’t terribly interested in sophisticated adult conversation. They were interested in fart jokes. They were interested in salacious talk. They were interested in running amok and posting four-letter words and nyah-nyah-nyah, all over the bulletin board.

And the adults who had set up Communitree were horrified, and overrun by these students. The place that was founded on open access had too much open access, too much openness. They couldn’t defend themselves against their own users. The place that was founded on free speech had too much freedom. They had no way of saying “No, that’s not the kind of free speech we meant.”

But that was a requirement. In order to defend themselves against being overrun, that was something that they needed to have that they didn’t have, and as a result, they simply shut the site down.

Now you could ask whether or not the founders’ inability to defend themselves from this onslaught, from being overrun, was a technical or a social problem. Did the software not allow the problem to be solved? Or was it the social configuration of the group that founded it, where they simply couldn’t stomach the idea of adding censorship to protect their system? But in a way, it doesn’t matter, because technical and social issues are deeply intertwined. There’s no way to completely separate them.

What matters is, a group designed this and then was unable, in the context they’d set up, partly a technical and partly a social context, to save it from this attack from within. And attack from within is what matters. Communitree wasn’t shut down by people trying to crash or syn-flood the server. It was shut down by people logging in and posting, which is what the system was designed to allow. The technological pattern of normal use and attack were identical at the machine level, so there was no way to specify technologically what should and shouldn’t happen. Some of the users wanted the system to continue to exist and to provide a forum for discussion. And other of the users, the high school boys, either didn’t care or were actively inimical. And the system provided no way for the former group to defend itself from the latter.

Now, this story has been written many times. It’s actually frustrating to see how many times it’s been written. You’d hope that at some point someone would write it down, and they often do, but then other people don’t read it.

The most charitable description of this repeated pattern is “learning from experience.” But learning from experience is the worst possible way to learn something. Learning from experience is one up from remembering. That’s not great. The best way to learn something is when someone else figures it out and tells you: “Don’t go in that swamp. There are alligators in there.” 

Learning from experience about the alligators is lousy, compared to learning from reading, say. There hasn’t been, unfortunately, in this arena, a lot of learning from reading. And so, the lessons from Lucasfilm’s Habitat, written in 1990, read a lot like Rose Stone’s description of Communitree from 1978.

This pattern has happened over and over and over again. Someone built the system, they assumed certain user behaviors. The users came on and exhibited different behaviors. And the people running the system discovered to their horror that the technological and social issues could not in fact be decoupled.

There’s a great document called “LambdaMOO Takes a New Direction,” which is about the wizards of LambdaMOO, Pavel Curtis’s Xerox PARC experiment in building a MUD world. And one day the wizards of LambdaMOO announced “We’ve gotten this system up and running, and all these interesting social effects are happening. Henceforth we wizards will only be involved in technological issues. We’re not going to get involved in any of that social stuff.”

And then, I think about 18 months later — I don’t remember the exact gap of time — they come back. The wizards come back, extremely cranky. And they say: “What we have learned from you whining users is that we can’t do what we said we would do. We cannot separate the technological aspects from the social aspects of running a virtual world.

“So we’re back, and we’re taking wizardly fiat back, and we’re going to do things to run the system. We are effectively setting ourselves up as a government, because this place needs a government, because without us, the place was falling apart.”

People who work on social software are closer in spirit to economists and political scientists than they are to people making compilers. They both look like programming, but when you’re dealing with groups of people as one of your run-time phenomena, that is an incredibly different practice. In the political realm, we would call these kinds of crises a constitutional crisis. It’s what happens when the tension between the individual and the group, and the rights and responsibilities of individuals and groups, gets so serious that something has to be done.

And the worst crisis is the first crisis, because it’s not just “We need to have some rules.” It’s also “We need to have some rules for making some rules.” And this is what we see over and over again in large and long-lived social software systems. Constitutions are a necessary component of large, long-lived, heterogeneous groups.

Geoff Cohen has a great observation about this. He said “The likelihood that any unmoderated group will eventually get into a flame-war about whether or not to have a moderator approaches one as time increases.” As a group commits to its existence as a group, and begins to think that the group is good or important, the chance that they will begin to call for additional structure, in order to defend themselves from themselves, gets very, very high.

Part Two: Why now? 

If these things I’m saying have happened so often before, have been happening and been documented and we’ve got psychological literature that predates the Internet, what’s going on now that makes this important?

I can’t tell you precisely why, but observationally there is a revolution in social software going on. The number of people writing tools to support or enhance group collaboration or communication is astonishing.

The web turned us all into size queens for six or eight years there. It was loosely coupled, it was stateless, it scaled like crazy, and everything became about How big can you get? “How many users does Yahoo have? How many customers does Amazon have? How many readers does MSNBC have?” And the answer could be “Really a lot!” But it could only be really a lot if you didn’t require MSNBC to be answering those readers, and you didn’t require those readers to be talking to one another.

The downside of going for size and scale above all else is that the dense, interconnected pattern that drives group conversation and collaboration isn’t supportable at any large scale. Less is different — small groups of people can engage in kinds of interaction that large groups can’t. And so we blew past that interesting scale of small groups, larger than a dozen and smaller than a few hundred, where people can actually have these conversational forms that can’t be supported when you’re talking about tens of thousands or millions of users, at least in a single group.

We’ve had things like mailing lists and BBSes for a long time, and more recently we’ve had IM, we’ve had these various patterns. And now, all of a sudden, these things are popping up. We’ve gotten weblogs and wikis, and I think, even more importantly, we’re getting platform stuff. We’re getting RSS. We’re getting shared Flash objects. We’re getting ways to quickly build on top of some infrastructure we can take for granted, that lets us try new things very rapidly.

I was talking to Stewart Butterfield about the chat application they’re trying here. I said “Hey, how’s that going?” He said: “Well, we only had the idea for it two weeks ago. So this is the launch.” When you can go from “Hey, I’ve got an idea” to “Let’s launch this in front of a few hundred serious geeks and see how it works,” that suggests that there’s a platform there that is letting people do some really interesting things really quickly. It’s not that you couldn’t have built a similar application a couple of years ago, but the cost would have been much higher. And when you lower costs, interesting new kinds of things happen.

So the first answer to Why Now? is simply “Because it’s time.” I can’t tell you why it took as long for weblogs to happen as it did, except to say it had absolutely nothing to do with technology. We had every bit of technology we needed to do weblogs the day Mosaic launched the first forms-capable browser. Every single piece of it was right there. Instead, we got Geocities. Why did we get Geocities and not weblogs? We didn’t know what we were doing. 

One was a bad idea, the other turns out to be a really good idea. It took a long time to figure out that people talking to one another, instead of simply uploading badly-scanned photos of their cats, would be a useful pattern. 

We got the weblog pattern in around ’96 with Drudge. We got weblog platforms starting in ’98. The thing really was taking off in 2000. By last year, everyone realized: Omigod, this thing is going mainstream, and it’s going to change everything. 

The vertigo moment for me was when Phil Gyford launched the Pepys weblog, Samuel Pepys’ diaries of the 1660s turned into weblog form, with a new post every day from Pepys’ diary. What that said to me was: Phil was asserting, and I now believe, that weblogs will be around for at least 10 years, because that’s how long Pepys kept a diary. And that was this moment of projecting into the future: This is now infrastructure we can take for granted.

Why was there an eight-year gap between a forms-capable browser and the Pepys diaries? I don’t know. It just takes a while for people to get used to these ideas.

So, first of all, this is a revolution in part because it is a revolution. We’ve internalized the ideas and people are now working with them. Second, the things that people are now building are web-native.

When you got social software on the web in the mid-Nineties, a lot of it was: “This is the Giant Lotus Dreadnought, now with New Lightweight Web Interface!” It never felt like the web. It felt like this hulking thing with a little, you know, “Here’s some icons. Don’t look behind the curtain.”

A weblog is web-native. It’s the web all the way in. A wiki is a web-native way of hosting collaboration. It’s lightweight, it’s loosely coupled, it’s easy to extend, it’s easy to break down. And it’s not just the surface, like oh, you can just do things in a form. It assumes http is transport. It assumes markup in the coding. RSS is a web-native way of doing syndication. So we’re taking all of these tools and we’re extending them in a way that lets us build new things really quickly.

Third, in David Weinberger’s felicitous phrase, we can now start to have a Small Pieces Loosely Joined pattern. It’s really worthwhile to look into what Joi Ito is doing with the Emergent Democracy movement, even if you’re not interested in the themes of emerging democracy. This started because a conversation was going on, and Ito said “I am frustrated. I’m sitting here in Japan, and I know all of these people are having these conversations in real-time with one another. I want to have a group conversation, too. I’ll start a conference call.

“But since conference calls are so lousy on their own, I’m going to bring up a chat window at the same time.” And then, in the first meeting, I think it was Pete Kaminski said “Well, I’ve also opened up a wiki, and here’s the URL.” And he posted it in the chat window. And people can start annotating things. People can start adding bookmarks; here are the lists.

So, suddenly you’ve got this meeting, which is going on in three separate modes at the same time, two in real-time and one annotated. So you can have the conference call going on, and you know how conference calls are. Either one or two people dominate it, or everyone’s like “Oh, can I — no, but —”, everyone interrupting and cutting each other off.

It’s very difficult to coordinate a conference call, because people can’t see one another, which makes it hard to manage the interrupt logic. In Joi’s conference call, the interrupt logic got moved to the chat room. People would type “Hand,” and the moderator of the conference call would then type “You’re speaking next,” in the chat. So the conference call flowed incredibly smoothly.

Meanwhile, in the chat, people are annotating what people are saying. “Oh, that reminds me of So-and-so’s work.” Or “You should look at this URL…you should look at that ISBN number.” In a conference call, to read out a URL, you have to spell it out — “No, no, no, it’s w w w dot net dash…” In a chat window, you get it and you can click on it right there. You can say, in the conference call or the chat: “Go over to the wiki and look at this.”

This is a broadband conference call, but it isn’t a giant thing. It’s just three little pieces of software laid next to each other and held together with a little bit of social glue. This is an incredibly powerful pattern. It’s different from: Let’s take the Lotus juggernaut and add a web front-end.

And finally, and this is the thing that I think is the real freakout, there is ubiquity. The web has been growing for a long, long time. And so some people had web access, and then lots of people had web access, and then most people had web access.

But something different is happening now. In many situations, all people have access to the network. And “all” is a different kind of amount than “most.” “All” lets you start taking things for granted.

Now, the Internet isn’t everywhere in the world. It isn’t even everywhere in the developed world. But for some groups of people — students, people in high-tech offices, knowledge workers — everyone they work with is online. Everyone they’re friends with is online. Everyone in their family is online.

And this pattern of ubiquity lets you start taking this for granted. Bill Joy once said “My method is to look at something that seems like a good idea and assume it’s true.” We’re starting to see software that simply assumes that all offline groups will have an online component, no matter what.

It is now possible for every grouping, from a Girl Scout troop on up, to have an online component, and for it to be lightweight and easy to manage. And that’s a different kind of thing than the old pattern of “online community.” I have this image of two hula hoops, the old two-hula hoop world, where my real life is over here, and my online life is over there, and there wasn’t much overlap between them. If the hula hoops are swung together, and everyone who’s offline is also online, at least from my point of view, that’s a different kind of pattern.

There’s a second kind of ubiquity, which is the kind we’re enjoying here thanks to Wifi. If you assume whenever a group of people are gathered together, that they can be both face to face and online at the same time, you can start to do different kinds of things. I now don’t run a meeting without either having a chat room or a wiki up and running. Three weeks ago I ran a meeting for the Library of Congress. We had a wiki, set up by Socialtext, to capture a large and very dense amount of technical information on long-term digital preservation.

The people who organized the meeting had never used a wiki before, and now the Library of Congress is talking as if they always had a wiki for their meetings, and are assuming it’s going to be at the next meeting as well — the wiki went from novel to normal in a couple of days.

It really quickly becomes an assumption that a group can do things like “Oh, I took my PowerPoint slides, I showed them, and then I dumped them into the wiki. So now you can get at them.” It becomes a sort of shared repository for group memory. This is new. These kinds of ubiquity, both everyone is online, and everyone who’s in a room can be online together at the same time, can lead to new patterns.

Part Three: What can we take for granted?

If these assumptions are right (one, that a group is its own worst enemy, and two, that we’re seeing this explosion of social software), what should we do? Is there anything we can say with any certainty about building social software, at least for large and long-lived groups?

I think there is. A little over 10 years ago, I quit my day job, because Usenet was so interesting, I thought: This is really going to be big. And I actually wrote a book about net culture at the time: Usenet, the Well, Echo, IRC and so forth. It launched in April of ’95, just as that world was being washed away by the web. But it was my original interest, so I’ve been looking at this problem in one way or another for 10 years, and I’ve been looking at it pretty hard for the last year and a half or so.

So there’s this question “What is required to make a large, long-lived online group successful?” and I think I can now answer with some confidence: “It depends.” I’m hoping to flesh that answer out a little bit in the next ten years.

But I can at least say some of the things it depends on. The Calvinists had a doctrine of natural grace and supernatural grace. Natural grace was “You have to do all the right things in the world to get to heaven…” and supernatural grace was “…and God has to anoint you.” And you never knew if you had supernatural grace or not. This was their way of getting around the fact that the Book of Revelation put an upper limit on the number of people who were going to heaven.

Social software is like that. You can find the same piece of code running in many, many environments. And sometimes it works and sometimes it doesn’t. So there is something supernatural about groups being a run-time experience. 

The normal experience of social software is failure. If you go into Yahoo groups and you map out the subscriptions, it is, unsurprisingly, a power law. There’s a small number of highly populated groups, a moderate number of moderately populated groups, and this long, flat tail of failure. And the failure is inevitably more than 50% of the total mailing lists in any category. So it’s not like a cake recipe. There’s nothing you can do to make it come out right every time.

There are, however, I think, about half a dozen things that are broadly true of all the groups I’ve looked at and all the online constitutions I’ve read for software that supports large and long-lived groups. And I’d break that list in half. I’d say, if you are going to create a piece of social software designed to support large groups, you have to accept three things, and design for four things.

Three Things to Accept

1.) Of the things you have to accept, the first is that you cannot completely separate technical and social issues. There are two attractive patterns. One says, we’ll handle technology over here, we’ll do social issues there. We’ll have separate mailing lists with separate discussion groups, or we’ll have one track here and one track there. This doesn’t work. It’s never been stated more clearly than in the pair of documents called “LambdaMOO Takes a New Direction.” I can do no better than to point you to those documents.

But recently we’ve had this experience where there was a social software discussion list, and someone said “I know, let’s set up a second mailing list for technical issues.” And no one moved from the first list, because no one could fork the conversation between social and technical issues, because the conversation can’t be forked.

The other pattern that’s very, very attractive — anybody who looks at this stuff has the same epiphany, which is: “Omigod, this software is determining what people do!” And that is true, up to a point. But you cannot completely program social issues either. So you can’t separate the two things, and you also can’t specify all social issues in technology. The group is going to assert its rights somehow, and you’re going to get this mix of social and technological effects.

So the group is real. It will exhibit emergent effects. It can’t be ignored, and it can’t be programmed, which means you have an ongoing issue. And the best pattern, or at least the pattern that’s worked the most often, is to put into the hands of the group itself the responsibility for defining what value is, and defending that value, rather than trying to ascribe those things in the software upfront.

2.) The second thing you have to accept: Members are different than users. A pattern will arise in which there is some group of users that cares more than average about the integrity and success of the group as a whole. And that becomes your core group, Art Kleiner’s phrase for “the group within the group that matters most.” 

The core group on Communitree was undifferentiated from the group of random users that came in. They were separate in their own minds, because they knew what they wanted to do, but they couldn’t defend themselves against the other users. But in all successful online communities that I’ve looked at, a core group arises that cares about the environment and gardens it effectively, to keep it growing, to keep it healthy.

Now, the software does not always allow the core group to express itself, which is why I say you have to accept this. Because if the software doesn’t allow the core group to express itself, it will invent new ways of doing so. 

On alt.folklore.urban, the discussion group about urban folklore on Usenet, there was a group of people who hung out there and got to be friends. And they came to care about the existence of AFU, to the point where, because Usenet made no distinction between members in good standing and drive-by users, they set up a mailing list called The Old Hats. The mailing list was for meta-discussion, discussion about AFU, so they could coordinate efforts formally, on the mailing list, if they were going to troll someone or flame someone or ignore someone. (Addendum, July 2, 2003: A longtime a.f.u participant says that the Old Hat list was created to allow the Silicon Valley-dwelling members to plan a barbecue, so that they could add a face-to-face dimension to their virtual interaction. The use of the list as a backstage area for discussing the public newsgroup arose after the fact.)

Then, as Usenet kept growing, many newcomers came along and seemed to like the environment, because it was well-run. In order to defend themselves from the scaling issues that come from adding a lot of new members to the Old Hats list, they said “We’re starting a second list, called the Young Hats.”

So they created this three-tier system, not dissimilar to the tiers of anonymous cowards, logged-in users, and people with high karma on Slashdot. But because Usenet didn’t let them do it in the software, they brought in other pieces of software, these mailing lists, to build the structure they needed. So even if you don’t program it in, the members in good standing will find one another and be recognized by one another.

3.) The third thing you need to accept: The core group has rights that trump individual rights in some situations. This pulls against the libertarian view that’s quite common on the network, and it absolutely pulls against the one person/one vote notion. But you can see examples of how bad an idea voting is when citizenship is the same as ability to log in. 

In the early Nineties, a proposal went out to create a Usenet news group for discussing Tibetan culture, called soc.culture.tibet. And it was voted down, in large part because a number of Chinese students who had Internet access voted it down, on the logic that Tibet wasn’t a country; it was a region of China. And in their view, since Tibet wasn’t a country, there oughtn’t be any place to discuss its culture, because that was oxymoronic. 

Now, everyone could see that this was the wrong answer. The people who wanted a place to discuss Tibetan culture should have it. That was the core group. But because the one person/one vote model on Usenet said “Anyone who’s on Usenet gets to vote on any group,” sufficiently contentious groups could simply be voted away. 

Imagine today if, in the United States, Internet users had to be polled before any anti-war group could be created. Or French users had to be polled before any pro-war group could be created. The people who want to have those discussions are the people who matter. And absolute citizenship, with the idea that if you can log in, you are a citizen, is a harmful pattern, because it is the tyranny of the majority. 

So the core group needs ways to defend itself — both in getting started and because of the effects I talked about earlier — the core group needs to defend itself so that it can stay on its sophisticated goals and away from its basic instincts. 

The Wikipedia has a similar system today, with a volunteer fire department, a group of people who care to an unusual degree about the success of the Wikipedia. And they have enough leverage, because of the way wikis work (they can always roll back graffiti and so forth), that the thing has stayed up despite repeated attacks. So leveraging the core group is a really powerful system.

Now, when I say these are three things you have to accept, I mean you have to accept them. Because if you don’t accept them upfront, they’ll happen to you anyway. And then you’ll end up writing one of those documents that says “Oh, we launched this and we tried it, and then the users came along and did all these weird things. And now we’re documenting it so future ages won’t make this mistake.” Even though you didn’t read the thing that was written in 1978.

All groups of any integrity have a constitution. The constitution is always partly formal and partly informal. At the very least, the formal part is what’s substantiated in code — “the software works this way.” 

The informal part is the sense of “how we do it around here.” And no matter how much is substantiated in code or written in a charter, there will always be an informal part as well. You can’t separate the two.

Four Things to Design For

1.) If you were going to build a piece of social software to support large and long-lived groups, what would you design for? The first thing you would design for is handles the user can invest in. 

Now, I say “handles,” because I don’t want to say “identity,” because identity has suddenly become one of those ideas where, when you pull on the little thread you want, this big bag of stuff comes along with it. Identity is such a hot-button issue now, but for the lightweight stuff required for social software, it’s really just a handle that matters.

It’s pretty widely understood that anonymity doesn’t work well in group settings, because “who said what when” is the minimum requirement for having a conversation. What’s less well understood is that weak pseudonymity doesn’t work well, either. Because I need to associate who’s saying something to me now with previous conversations. 

The world’s best reputation management system is right here, in the brain. And actually, it’s right here, in the back, in the emotional part of the brain. Almost all the work being done on reputation systems today is either trivial or useless or both, because reputations aren’t linearizable, and they’re not portable. 

There are people who cheat on their spouse but not at cards, and vice versa, and both and neither. Reputation is not necessarily portable from one situation to another, and it’s not easily expressed. 

eBay has done us all an enormous disservice, because eBay works in non-iterated atomic transactions, which are the opposite of social situations. eBay’s reputation system works incredibly well, because it starts with a linearizable transaction — “How much money for how many Smurfs?” — and turns that into a metric that’s equally linear. 

That doesn’t work well in social situations. If you want a good reputation system, just let me remember who you are. And if you do me a favor, I’ll remember it. And I won’t store it in the front of my brain, I’ll store it here, in the back. I’ll just get a good feeling next time I get email from you; I won’t even remember why. And if you do me a disservice and I get email from you, my temples will start to throb, and I won’t even remember why. If you give users a way of remembering one another, reputation will happen, and that requires nothing more than simple and somewhat persistent handles. 

Users have to be able to identify themselves and there has to be a penalty for switching handles. The penalty for switching doesn’t have to be total. But if I change my handle on the system, I have to lose some kind of reputation or some kind of context. This keeps the system functioning.
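
To make the switching penalty concrete, here’s a minimal sketch of a handle registry where reputation accrues to a persistent handle and a brand-new handle starts from zero. The class and method names are illustrative, not taken from any real system:

class HandleRegistry:
    """Toy model of handles users can invest in; names are illustrative."""

    def __init__(self):
        self.reputation = {}  # handle -> accumulated standing
        self.history = {}     # handle -> list of remembered contributions

    def register(self, handle):
        # A brand-new handle starts with no reputation and no context.
        self.reputation.setdefault(handle, 0)
        self.history.setdefault(handle, [])

    def record_contribution(self, handle, note, value=1):
        # Good works accumulate against the handle that did them.
        self.register(handle)
        self.reputation[handle] += value
        self.history[handle].append(note)

    def switch_handle(self, old, new):
        # The penalty for switching: the new handle inherits nothing,
        # so the user walks away from whatever standing the old one had.
        self.register(new)
        return self.reputation.get(old, 0)

site = HandleRegistry()
site.record_contribution("alice", "answered a newcomer's question", value=3)
site.record_contribution("alice", "flagged some spam", value=1)
forfeited = site.switch_handle("alice", "alice_2")
print(forfeited, site.reputation["alice_2"])  # 4 0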

Now, this pulls against the sense that we’ve had since the early psychological writings about the Internet. “Oh, on the Internet we’re all going to be changing identities and genders like we change our socks.” 

And you see things like the Kaycee Nicole story, where a woman in Kansas pretended to be a high school student, and then because the invented high school student’s friends got so emotionally involved, she then tried to kill the Kaycee Nicole persona off. “Oh, she’s got cancer and she’s dying and it’s all very tragic.” And of course, everyone wanted to fly to meet her. So then she sort of panicked and vanished. And a bunch of places on the Internet, particularly the MetaFilter community, rose up to find out what was going on, and uncovered the hoax. It was sort of a distributed detective movement.

Now a number of people point to this and say “See, I told you about that identity thing!” But the lesson of the Kaycee Nicole story is this: changing your identity is really weird. And when the community understands that you’ve been doing it and you’re faking, that is seen as a huge and violent transgression. And they will expend an astonishing amount of energy to find you and punish you. So identity is much less slippery than the early literature would lead us to believe.

2.) Second, you have to design a way for there to be members in good standing. You have to design some way in which good works get recognized. The minimal way is, posts appear with identity. You can do more sophisticated things like having formal karma or “member since.”

I’m on the fence about whether this is something you design for or something you accept. Because in a way I think members in good standing will rise. But more and more of the systems I’m seeing launching these days are having some kind of additional accretion so you can tell how much involvement members have with the system.

There’s an interesting pattern I’m seeing among the music-sharing group that operates between Tokyo and Hong Kong. They operate on a mailing list, which they set up for themselves. But when they’re trading music, what they’re doing is, they’re FedExing one another 180-gig hard-drives. So you’re getting .wav files and not MP3s, and you’re getting them in bulk. 

Now, you can imagine that such a system might be a target for organizations that would frown on this activity. So when you join that group, your user name is appended with the user name of the person who is your sponsor. You can’t get in without your name being linked to someone else. You can see immediately the reputational effects going on there, just from linking two handles. 

So in that system, you become a member in good standing when your sponsor link goes away and you’re there on your own report. If, on the other hand, you defect, not only are you booted, but your sponsor is booted. There are lots and lots of lightweight ways to accept and work with the idea of member in good standing. 
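
Here’s a minimal sketch of what the sponsor-link rule might look like if it were written into software. The actual group runs on a plain mailing list, so every structure and name below is hypothetical:

members = {}  # member -> sponsor (None once a member is in good standing)

def join(member, sponsor):
    # You can't get in without your name being linked to an existing member.
    if sponsor is not None and sponsor not in members:
        raise ValueError("sponsor must already be a member")
    members[member] = sponsor

def promote(member):
    # Member in good standing: the sponsor link goes away.
    members[member] = None

def defect(member):
    # A defector is booted, and so is their sponsor, if the link still exists.
    sponsor = members.pop(member, None)
    if sponsor is not None:
        members.pop(sponsor, None)

join("founder", None)
join("newcomer", "founder")
defect("newcomer")  # boots "newcomer" and, because the link was still live, "founder"
print(members)      # {}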

3.) Three, you need barriers to participation. This is one of the things that killed Usenet. You have to have some cost to either join or participate, if not at the lowest level, then at higher levels. There needs to be some kind of segmentation of capabilities. 

Now, the segmentation can be total — you’re in or you’re out, as with the music group I just listed. Or it can be partial — anyone can read Slashdot, anonymous cowards can post, non-anonymous cowards can post with a higher rating. But to moderate, you really have to have been around for a while. 

It has to be hard to do at least some things on the system for some users, or the core group will not have the tools that they need to defend themselves. 

Now, this pulls against the cardinal virtue of ease of use. But ease of use is wrong. Ease of use is the wrong way to look at the situation, because you’ve got the Necker cube flipped in the wrong direction. The user of social software is the group, not the individual.

I think we’ve all been to meetings where everyone had a really good time, we’re all talking to one another and telling jokes and laughing, and it was a great meeting, except we got nothing done. Everyone was amusing themselves so much that the group’s goal was defeated by the individual interventions. 

The user of social software is the group, and ease of use should be for the group. If the ease of use is only calculated from the individual user’s point of view, it will be difficult to defend the group from “the group is its own worst enemy” style attacks from within.

4.) And, finally, you have to find a way to spare the group from scale. Scale alone kills conversations, because conversations require dense two-way communication. In conversational contexts, Metcalfe’s law is a drag. The fact that the number of two-way connections you have to support goes up with the square of the number of users means that the density of conversation falls off very fast as the system scales even a little bit. You have to have some way to let users hang onto the “less is more” pattern, in order to keep associated with one another.
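
The arithmetic behind that claim is simple: a group of n people contains n(n-1)/2 possible two-way connections, so the conversational load grows roughly with the square of the group size. A quick illustrative calculation:

def pairwise_connections(n):
    # n people can form n * (n - 1) / 2 distinct two-way connections.
    return n * (n - 1) // 2

# Arbitrary example sizes, chosen only to show how fast the load grows.
for n in (5, 12, 50, 150, 1000):
    print(f"{n:>5} people -> {pairwise_connections(n):>7} possible two-way connections")
# 5 -> 10, 12 -> 66, 50 -> 1,225, 150 -> 11,175, 1000 -> 499,500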

This is an inverse value to scale question. Think about your Rolodex. A thousand contacts, maybe 150 people you can call friends, 30 people you can call close friends, two or three people you’d donate a kidney to. The value is inverse to the size of the group. And you have to find some way to protect the group within the context of those effects. 

Sometimes you can do soft forking. LiveJournal does the best soft forking of any software I’ve ever seen, where the concepts of “you” and “your group” are pretty much intertwingled. The average size of a LiveJournal group is about a dozen people. And the median size is around five.

But each user is a little bit connected to other such clusters, through their friends, and so while the clusters are real, they’re not completely bounded — there’s a soft overlap which means that though most users participate in small groups, most of the half-million LiveJournal users are connected to one another through some short chain. 

IRC channels and mailing lists are self-moderating with scale, because as the signal to noise ratio gets worse, people start to drop off, until it gets better, so people join, and so it gets worse. You get these sort of oscillating patterns. But it’s self-correcting.

And then my favorite pattern is from MetaFilter, which is: When we start seeing effects of scale, we shut off the new user page. “Someone mentions us in the press and how great we are? Bye!” That’s a way of raising the bar, that’s creating a threshold of participation. And anyone who bookmarks that page and says “You know, I really want to be in there; maybe I’ll go back later,” that’s the kind of user MeFi wants to have. 

You have to find some way to protect your own users from scale. This doesn’t mean the scale of the whole system can’t grow. But you can’t try to make the system large by taking individual conversations and blowing them up like a balloon; human interaction, many to many interaction, doesn’t blow up like a balloon. It either dissipates, or turns into broadcast, or collapses. So plan for dealing with scale in advance, because it’s going to happen anyway.

Conclusion

Now, those four things are of course necessary but not sufficient conditions. I propose them more as a platform for building the interesting differences on top of. There are lots and lots and lots of other effects that make different bits of software interesting enough that you would want to keep more than one kind of pattern around. But those are commonalities I’m seeing across a range of social software for large and long-lived groups.

In addition, you can do all sorts of things with explicit clustering, whether it’s guilds in massively multi-player games, or communities on LiveJournal or what have you. You can do things with conversational artifacts, where the group participation leaves behind some record. The Wikipedia right now, the group-collaborated online encyclopedia, is the most interesting conversational artifact I know of, where the product is a result of the process. Rather than “We’re specifically going to get together and create this presentation,” it’s just “What’s left is a record of what we said.”

There are all these things, and of course they differ platform to platform. But there is this, I believe, common core of things that will happen whether you plan for them or not, and things you should plan for, that I think are invariant across large communal software. 

Writing social software is hard. And, as I said, the act of writing social software is more like the work of an economist or a political scientist. And the act of hosting social software puts the host in a relationship more like that of a landlord to tenants than of an owner to boxes in a warehouse.

The people using your software, even if you own it and pay for it, have rights and will behave as if they have rights. And if you abrogate those rights, you’ll hear about it very quickly.

That’s part of the reason the John Hagel theory of community — community leads to content, which leads to commerce — never worked. Because lo and behold, no matter who came onto the Clairol chat boards, they sometimes wanted to talk about things that weren’t Clairol products.

“But we paid for this! This is the Clairol site!” Doesn’t matter. The users are there for one another. They may be there on hardware and software paid for by you, but the users are there for one another. 

The patterns here, I am suggesting, both the things to accept and the things to design for, are givens. Assume these as a kind of social platform, and then you can start going out and building on top of that the interesting stuff that I think is going to be the real result of this period of experimentation with social software. 

Thank you very much.

The FCC, Weblogs, and Inequality

First published June 3, 2003 on the “Networks, Economics, and Culture” mailing list.

Yesterday, the FCC adjusted the restrictions on media ownership, allowing newspapers to own TV stations, and raising the ownership limit on broadcast TV networks by 10 percentage points, to 45% from 35%. It’s not clear whether the effects of the ruling will be catastrophic or relatively unimportant, and there are smart people on both sides of that question. It is also unclear what effect the internet had on the FCC’s ruling, or what role it will play now.

What is clear, however, is a lesson from the weblog world: inequality is a natural component of media. For people arguing about an ideal media landscape, the tradeoffs are clear: Diverse. Free. Equal. Pick two.

The Developing Debate

The debate about media and audience size used to be focused on the low total number of outlets, mainly because there were only three national television networks. Now that more than 80% of the country gets its television from cable and satellite, the concern is concentration. In this view, there may be diverse voices available on the hundred or more TV channels the average viewer gets, but the value of that diversity is undone by the fact that large media firms enjoy the lion’s share of the audience’s cumulative attention.

A core assumption in this debate is that if media were free of manipulation, the audience would be more equally distributed, so the concentration of a large number of viewers by a small number of outlets is itself evidence of impermissible control. In this view, government intervention is required simply to restore the balance we would expect in an unmanipulated system.

For most of the 20th century, we had no way of testing this proposition. The media we had were so heavily regulated and the outlets so scarce that we had no other scenarios to examine, and the growth of cable in the last 20 years involved local monopoly of the wire into the home, so it didn’t provide a clean test of an alternative.

Weblogs As Media Experiment

In the last few years, however, we have had a clean test, and it’s weblogs. Weblogs are the freest media the world has ever known. Within the universe of internet users, the costs of setting up a weblog are minor, and perhaps more importantly, require no financial investment, only time, thus greatly weakening the “freedom of the press for those who can afford one” effect. Furthermore, there is no Weblog Central — you do not need to incorporate your weblog, you do not need to register your weblog, you do not need to clear your posts with anyone. Weblogs are the best attempt we’ve seen to date at making freedom of speech and freedom of the press the same freedom, in Mike Godwin’s famous phrase.

And in this free, decentralized, diverse, and popular medium we find astonishing inequality, inequality so extreme it makes the distribution of television ratings look positively egalitarian. In fact, a review of any of the weblog tracking initiatives such as Technorati or the blogging ecosystem project shows thousand-fold imbalances between the most popular and average weblogs. These inequalities often fall into what’s known as a power law distribution, a curve where a tiny number of sites account for a majority of the in-bound links, while the vast majority of sites have a very small number of such links. (Although the correlation between links and traffic is not perfect, links are a strong proxy for audience size.)
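
The shape of such a curve is easy to illustrate. Assuming, purely for illustration, that the kth most popular weblog attracts in-bound links in proportion to 1/k (a Zipf-style power law), the head of the curve swamps the tail; the numbers below are synthetic, not drawn from Technorati or any other tracking project:

# Synthetic Zipf-style distribution: the k-th ranked weblog gets in-bound
# links in proportion to 1 / k. These are made-up numbers that only show
# the shape of the curve, not measurements of actual weblogs.
N = 10_000
weights = [1 / k for k in range(1, N + 1)]
total = sum(weights)

print(f"top 10 sites:  {sum(weights[:10]) / total:.1%} of all in-bound links")
print(f"top 100 sites: {sum(weights[:100]) / total:.1%} of all in-bound links")
print(f"median site:   {weights[N // 2] / total:.5%} of all in-bound links")
# Roughly 30%, 53%, and 0.002% respectively: a thousands-fold imbalance
# between the head of the curve and the middle of the tail.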

The reasons for this are complex (I addressed some of them in Power Laws, Weblogs, and Inequality), but from the point of view of analyzing the FCC ruling, the lesson of weblog popularity is clear: inequality can arise in systems where users are free to make choices among a large set of options, even in the absence of central control or manipulation. Inequality is not a priori evidence of manipulation, in other words; it can also be a side effect of large systems governed by popular choice.

In the aftermath of the FCC ruling, and given what we have learned from the development of weblogs, the debate on media concentration can now be sharpened to a single question: if inequality is a fact of life, even in diverse and free systems, what should our reaction be?

‘Pick Two’ Yields Three Positions

There are three coherent positions in this debate: The first is advocacy of free and equal media, which requires strong upper limits on overall diversity. This was roughly the situation of the US broadcast television industry from 1950 to 1980. Any viewer was free to watch shows from any network, but having only three national networks kept any one of them from becoming dominant. (Funnily enough, Gunsmoke, the most popular television show in history, enjoyed a 45% audience share, the same upper limit now proposed by the FCC for overall audience size.)

Though this position is logically coherent, the unprecedented explosion of media choice makes it untenable in practice. Strong limits on the number of media outlets accessible by any given member of the public now exist in only two places: broadcast radio and newspapers, not coincidentally the two media least affected by new technologies of distribution.

The second coherent position is advocacy of diverse and equal media, which requires constraints on freedom. This view is the media equivalent of redistributive taxation, where an imbalance in audience size is seen as being so corrosive of democratic values that steps must be taken to limit the upper reach of popular media outlets, and to subsidize in some way less popular ones. In practice, this position is advocacy of diverse and less unequal media. This is the position taken by the FCC, which yesterday altered regulations rather than removing them. (There can obviously be strong disagreement within this group about the kind and degree of regulations.)

People who hold this view believe that regulation is preferable to inequality, and will advocate governmental intervention in any market where scarcity in the number of channels constrains the number of outlets locals have access to (again, radio and newspapers are the media with the most extreme current constraints.)

More problematic for people who hold this view are unequal but unconstrained media such as weblogs. As weblogs grow in importance, we can expect at least some members of the “diverse and equal” camp to advocate regulation of weblogs, on the grounds that the imbalance between Glenn Reynolds of InstaPundit.com and J. Random Blogger is no different than the imbalance between Clear Channel and WFMU. This fight will pit those who advocate government intervention only where there is scarcity (whether regulatory or real) vs. those who advocate regulation wherever there is inequality, even if it arises naturally and in an unconstrained system.

The third coherent position is advocacy of diverse and free media, which requires abandonment of equality as a goal. For this camp, the removal of regulation is desirable in and of itself, whatever the outcome. Given the evidence that diverse and free systems migrate to unequal distributions, the fact of inequality is a necessarily acceptable outcome to this group. However, in truly diverse systems, with millions of choices rather than hundreds, the imbalance between popular and average media outlets is tempered by the imbalance between the most popular outlets and the size of the system as a whole. As popular as Glenn Reynolds may be, InstaPundit is no Gunsmoke; no one weblog is going to reach 45% of the audience. In large diverse systems, freedom increases the inequality between outlets, but the overall size and growth weaken the effects of concentration.

This view is the least tested in practice. While the “diverse and equal” camp is advocating regulation and therefore an articulation of the status quo, people who believe that our goals should be diversity and freedom and damn the consequences haven’t had much effect on the traditional media landscape to date, so we have very little evidence on the practical effect of their proposals. The most obvious goal for this group is radical expansion of media choice in all dimensions, and a subsequent dropping of all mandated restrictions. For this view to come to pass, restrictions on internet broadcast of radio and TV must be dropped, web radio stations must live in the same copyright regime that broadcast stations do, much more unlicensed spectrum must be made available, and so on.

And this is the big risk. Though the FCC’s ruling is portrayed as deregulation, it is nothing of the sort. It is simply different regulation, and it adjusts percentages within a system of scarcity, rather than undoing the scarcity itself. It remains to be seen if the people supporting the FCC’s current action are willing to go all the way to the weblogization of everything, but this is what will be required to get the benefits of the free and diverse scenario. In the absence of regulation, the only defense against monopolization is to create a world where, no matter how many media outlets a single company can buy, more can appear tomorrow. The alternative — reduction of regulation without radical expansion — is potentially the worst of both worlds.

The one incoherent view is the belief that a free and diverse media will naturally tend towards equality. The development of weblogs in their first five years demonstrates that is not always true, and gives us reason to suspect it may never be true. Equality can only be guaranteed by limiting either diversity or freedom.

The best thing that could come from the lesson of weblog popularity would be an abandoning of the idea that there will ever be an unconstrained but egalitarian media utopia, a realization ideally followed by a more pragmatic discussion between the “diverse and free” and “diverse and equal” camps.

Grid Supercomputing: The Next Push

First published May 20, 2003 on the “Networks, Economics, and Culture” mailing list.

Grid Computing is, according to the Grid Information Centre, a way to “…enable the sharing, selection, and aggregation of a wide variety of geographically distributed computational resources.” It is, in other words, an attempt to make Sun’s famous pronouncement “The Network Is The Computer” an even more workable proposition. (It is also an instantiation of several of the patterns of decentralization that used to travel together under the name peer-to-peer.)

Despite the potential generality of the Grid, most of the public pronouncements are focusing on the use of Grids for supercomputing. IBM defines it more narrowly: Grid Computing is “…applying resources from many computers in a network — at the same time — to a single problem”, and the MIT Technology Review equated Grid technology with supercomputing on tap when it named Grids one of “Ten Technologies That Will Change the World.”

This view is wrong. Supercomputing on tap won’t live up to this change-the-world billing, because computation isn’t a terribly important part of what people do with computers. This is a lesson we learned with PCs, and it looks like we will be relearning it with Grids.

The Misnomer of the Personal Computer

Though most computational power lives on the world’s hundreds of millions of PCs, most PCs are not used for computation most of the time. There are two reasons for this, both of which are bad news for predictions of a supercomputing revolution. The first is simply that most people are not sitting at their computer for most hours of the day. The second is that even when users are at their computers, they are not tackling computationally hard problems, and especially not ones that require batch processing — submit question today, get answer tomorrow (or next week.) Indeed, whenever users encounter anything that feels even marginally like batch processing — a spreadsheet that takes seconds to sort, a Photoshop file that takes a minute to render — they begin hankering for a new PC, because they care about peak performance, not total number of cycles available over time. The only time the average PC performs any challenging calculations is when rendering the visuals for The Sims or WarCraft.

Therein lies the conundrum of the Grid-as-supercomputer: the oversupply of cycles the Grid relies on exists because of a lack of demand. PCs are used as many things — file cabinets and communications terminals and typewriters and photo albums and jukeboxes — before they are used as literal computers. If most users had batch applications they were willing to wait for even as long as overnight, the first place they would look for spare cycles would be on their own machines, not on some remote distributed supercomputer. Simply running their own PC round the clock would offer a 10x to 20x improvement, using hardware they already own.
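
The 10x to 20x figure follows from a simple assumption, mine rather than a measured one, that a typical PC is doing genuinely heavy work for only an hour or two a day:

# Back-of-envelope for the "10x to 20x" claim. The one-to-two hours of
# heavy daily use is an illustrative assumption, not a figure from the essay.
hours_in_a_day = 24
for busy_hours in (1.2, 2.4):
    gain = hours_in_a_day / busy_hours
    print(f"{busy_hours} busy hours/day -> about {gain:.0f}x more cycles running flat out")
# 1.2 hours -> about 20x, 2.4 hours -> about 10x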

If users needed Grid-like power, the Grid itself wouldn’t work, because the unused cycles the Grid is going to aggregate wouldn’t exist. Of all the patterns supported by decentralization, from file-sharing to real-time collaboration to supercomputing, supercomputing is the least general.

The Parallel with Push 

There is a parallel between Grids and Push technology, that glamorous flameout of the mid-90s. The idea behind Push, exemplified by the data-displaying screensaver PointCast, was that because users suffered from limited bandwidth and periodic disconnection (e.g. laptops on airplanes), they would sign up to have data pushed to them, which they could then experience at their leisure. This, we were told, would create a revolution in the way people use the internet. (This notion reached its apotheosis in a Wired magazine cover story, “Push!”, whose subtitle read “Kiss your browser goodbye: The radical future of media beyond the Web”.)

As it turned out, users’ response to poor connectivity was to agitate for better connectivity, because, as with CPU cycles, users want bandwidth that provides good peak performance, even if that means most of it gets “wasted.” Shortly after the Wired cover, it was PointCast we kissed goodbye.

Push’s collapse was made all the more spectacular because of its name. The label Push seemed to suggest a sweeping new pattern of great importance. Had the technology been given a duller but more descriptive name, like “forward caching,” it would have generated much less interest in the beginning, but might also not have been so prematurely consigned to the list of failed technologies.

Forward caching is in fact a key part of some applications. In particular, companies building decentralized groupware, like Groove, Kubi Software, and Shinkuro, all use forward caching of shared files to overcome the difficulties caused by limited bandwidth and partially disconnected nodes, just the issues Push was supposed to address. By pushing the name Push, the PointCasts of the world made it harder to see that though forward caching was not universally important, it was still valuable in some areas.

Distributed Batch Processing

So it is with Grids. The evocative name suggests that computation is so critical that we must have a global infrastructure to provide all those cycles we’ll be needing next time our boss asks us to model an earthquake, or we have to help our parents crack a cryptographic key. The broadness of the term masks the specialized nature of the technology, which should probably be called “distributed batch processing.”

Like forward caching, distributed batch processing is useful in a handful of areas. The SETI@Home project runs on distributed batch processing, as does the distributed.net cryptographic key-breaking tool. The sequencing of the SARS virus happened using distributed batch processing. Distributed batch processing could be useful in fields like game theory, where scenarios could be exhaustively tested on the cheap, or animated film, where small studios or even individuals could afford access to Pixar-like render farms.

Distributed batch processing is real progress for people who need supercomputing power, but having supercomputing on tap doesn’t make you a researcher any more than having surfboard wax on tap would make you a surfer. Indeed, to the consternation of chip manufacturers (and the delight of researchers who want cheap cycles), people don’t even have much real use for the computational power on the machines they buy today.

History has not been kind to business predictions based on an undersupply of cycles, and the business case for selling access to supercomputing on tap is grim. Assuming that a $750 machine with a 2 gigahertz chip can be used for 3 years, commodity compute time now costs roughly a penny a gigahertz-hour. If Grid access costs more than a penny a gigahertz-hour, building a dedicated supercomputer starts to be an economical proposition, relative to buying cycles from a Grid. (And of course Moore’s Law sees to it that these economics get more adverse every year.)
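
As a check on that figure, here is the back-of-envelope calculation using the numbers in the paragraph above:

# Commodity cost of compute, using the figures in the paragraph above:
# a $750 machine with a 2 GHz chip, amortized over 3 years of constant use.
price_usd = 750
clock_ghz = 2
lifetime_hours = 3 * 365 * 24                     # about 26,280 hours in three years

capacity_ghz_hours = clock_ghz * lifetime_hours   # about 52,560 GHz-hours
cost_per_ghz_hour = price_usd / capacity_ghz_hours

print(f"${cost_per_ghz_hour:.4f} per GHz-hour")   # about $0.014, i.e. on the order of a penny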

Most of the for-profit work on supercomputing Grids will be in helping businesses harness their employees’ PCs so that the CFO can close the books quickly — cheap, one-shot contracts, in other words, that mostly displace money from the purchase of new servers. The cost savings for the average business will be nice of course, but saving money by deferring server purchases is hardly a revolution.

People Matter More Than Machines

We have historically overestimated the value of connecting machines to one another, and underestimated the value of connecting people, and by emphasizing supercomputing on tap, the proponents of Grids are making that classic mistake anew. During the last great age of batch processing, the ARPAnet’s designers imagined that the nascent network would be useful as a way of providing researchers access to batch processing at remote locations. This was wrong, for two reasons: first, it turned out researchers were far more interested in getting their own institutions to buy computers they could use locally than in using remote batch processing, and Moore’s Law made that possible as time passed. Next, once email was ported to the network, it became a far more important part of the ARPAnet backbone than batch processing was. Then as now, access to computing power mattered less to the average network user than access to one another.

Though Sun was incredibly prescient in declaring “The Network is the Computer” at a time when PCs didn’t even ship with built-in modems, the phrase is false in some important ways — a network is a different kind of thing than a computer. As long ago as 1968, J.C.R. Licklider predicted that computers would one day be more important as devices of communication than of computation, a prediction that came true when email overtook the spreadsheet as the core application driving PC purchases.

What was true of the individual PC is true of the network as well — changes in computational power are nice, but changes in communications power are profound. As we learned with Push, an intriguing name is no substitute for general usefulness. Networks are most important as ways of linking unevenly distributed resources — I know something you don’t know; you have something I don’t have — and Grid technology will achieve general importance to the degree that it supports those kinds of patterns. The network applications that let us communicate and share in heterogeneous environments, from email to Kazaa, are far more important uses of the network than making all the underlying computers behave as a single supercomputer.

Permanet, Nearlynet, and Wireless Data

First published March 28, 2003 on the “Networks, Economics, and Culture” mailing list. 

“The future always comes too fast and in the wrong order.” — Alvin Toffler

For most of the past year, on many US airlines, those phones inserted into the middle seat have borne a label reading “Service Disconnected.” Those labels tell a simple story — people don’t like to make $40 phone calls. They tell a more complicated one as well, about the economics of connectivity and about two competing visions for access to our various networks. One of these visions is the one everyone wants — ubiquitous and convenient — and the other vision is the one we get — spotty and cobbled together. 

Call the first network “perma-net,” a world where connectivity is like air, where anyone can send or receive data anytime anywhere. Call the second network “nearly-net”, an archipelago of connectivity in an ocean of disconnection. Everyone wants permanet — the providers want to provide it, the customers want to use it, and every few years, someone announces that they are going to build some version of it. The lesson of in-flight phones is that nearlynet is better aligned with the technological, economic, and social forces that help networks actually get built. The most illustrative failure of permanet is the airphone. The most spectacular was Iridium. The most expensive will be 3G. 

“I’m (Not) Calling From 35,000 Feet”

The airphone business model was obvious — the business traveler needs to stay in contact with the home office, with the next meeting, with the potential customer. When 5 hours of the day disappears on a flight, value is lost, and business customers, the airlines reasoned, would pay a premium to recapture that value.

The airlines knew, of course, that the required investment would make in-flight calls expensive at first, but they had two forces on their side. The first was a captive audience — when a plane was in the air, they had a monopoly on communication with the outside world. The second was that, as use increased, they would pay off the initial investment, and could start lowering the cost of making a call, further increasing use.

What they hadn’t factored in was the zone of connectivity between the runway and the gate, where potential airphone users were physically captive, but where their cell phones still worked. The time spent between the gate and the runway can account for a fifth of even long domestic flights, and since that is when flight delays tend to appear, it is a disproportionately valuable time in which to make calls.

This was their first miscalculation. The other was that they didn’t know that competitive pressures in the cell phone market would drive the price of cellular service down so fast that the airphone would become more expensive, in relative terms, after it launched. 

The negative feedback loop created by this pair of miscalculations marginalized the airphone business. Since high prices displace usage, every increase in the availability of cell phones or reduction in the cost of a cellular call meant that some potential users of the airphone would opt out. As users opted out, the projected revenues shrank. This in turn postponed the date at which the original investment in the airphone system could be paid back. The delay in paying back the investment delayed the date at which the cost of a call could be reduced, making the airphone an even less attractive offer as the number of cell phones increased and prices shrank still further.

66 Tears

This is the general pattern of the defeat of permanet by nearlynet. In the context of any given system, permanet is the pattern that makes communication ubiquitous. For a plane ride, the airphone is permanet, always available but always expensive, while the cell phone is nearlynet, only intermittently connected but cheap and under the user’s control. 

The characteristics of the permanet scenario — big upfront investment by few enough companies that they get something like monopoly pricing power — are usually justified by the assumption that users will accept nothing less than total connectivity, and will pay a significant premium to get it. This may be true in scenarios where there is no alternative, but in scenarios where users can displace even some use from high- to low-priced communications tools, they will.

This marginal displacement matters because a permanet network doesn’t have to be unused to fail. It simply has to be underused enough to be unprofitable. Builders of large networks typically underestimate both the degree to which high cost deflects use and the number of alternatives users have in the ways they communicate. And over the long haul, the inability to pay off the initial investment in a timely fashion stifles later investment in upgrading the network.
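
A minimal break-even sketch, with every figure invented, shows how a network can carry real traffic and still fail: it only has to fall short of the usage level that covers its costs.

```python
# All figures are invented for illustration.
annual_cost = 10_000_000      # debt service plus operations, $/year (assumed)
revenue_per_user = 400        # $/year per subscriber (assumed)
projected_users = 30_000      # what the business plan counted on (assumed)

breakeven_users = annual_cost / revenue_per_user   # 25,000 subscribers

for displaced in (0.0, 0.1, 0.2, 0.3):
    users = projected_users * (1 - displaced)
    profit = users * revenue_per_user - annual_cost
    print(f"{displaced:.0%} of use displaced: {users:,.0f} users, profit ${profit:,.0f}")
# Even 20% displacement leaves 24,000 subscribers -- a busy network, but one
# running below the 25,000-subscriber break-even point.
```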

This was the pattern of Iridium, Motorola’s famously disastrous network of 66 satellites that would allow the owner of an Iridium phone to make a phone call from literally anywhere in the world. This was permanet on a global scale. Building and launching the satellites cost billions of dollars, the handsets cost thousands, the service cost dollars a minute, all so the busy executive could make a call from the veldt.

Unfortunately, busy executives don’t work in the veldt. They work in Pasadena, or Manchester, or Caracas. This is the SUV pattern — most SUV ads feature empty mountain roads, but most actual SUVs are stuck in traffic. Iridium was a bet on a single phone that could be used anywhere, but its high cost eroded any reason to use an Iridium phone in most of the perfectly prosaic places phone calls actually get made.

3G: Going, Going, Gone

The biggest and most expensive permanet effort right now is wireless data services, principally 3G, the so-called third-generation wireless service, and GPRS, the General Packet Radio Service (though the two services are frequently lumped together under the 3G label.) 3G data services provide always-on connections and much higher data rates to mobile devices than the widely deployed GSM networks do, and the wireless carriers have spent tens of billions of dollars worldwide to own and operate such services. Because 3G requires licensed spectrum, the artificial scarcity created by treating the airwaves like physical property guarantees limited competition among 3G providers.

The idea here is that users want to be able to access data any time, anywhere. This is of course true in the abstract, but there are two caveats: the first is that they do not want it at any cost, and the second and more worrying one is that they won’t use 3G in environments where they have other ways of connecting more cheaply.

The nearlynet to 3G’s permanet is Wifi (and, to a lesser extent, flat-rate services like email on the BlackBerry.) 3G partisans will tell you that there is no competition between 3G and Wifi, because the services do different things, but of course that is exactly the problem. If they did the same thing, the costs and use patterns would also be similar. It’s precisely the ways in which Wifi differs from 3G that make it so damaging.

The 3G model is based on two permanetish assumptions — one, that users have an unlimited demand for data while traveling, and two, that once they get used to using data on their phone, they will use it everywhere. Both assumptions are wrong.

First, users don’t have an unlimited demand for data while traveling, just as they didn’t have an unlimited demand for talking on the phone while flying. While the mobile industry has been telling us for years that internet-accessible cellphones will soon outnumber PCs, it fails to note that for internet use, measured in either hours or megabytes, the PC dwarfs the phone as a tool. Furthermore, in the cases where users do demonstrate high demand for mobile data services by getting 3G cards for their laptops, the network operators have been forced to raise their prices, the opposite of the strategy that would drive use. Charging more for laptop use makes 3G worse relative to Wifi, whose prices are constantly falling (access points and Wifi cards are now both around $60.)

The second problem is that 3G services don’t just have the wrong prices, they have the wrong kind of prices — metered — while Wifi is flat-rate. Metered data gives the user an incentive to wait out the cab ride or commute and save their data-intensive applications for home or office, where sending or receiving large files creates no additional cost. The more data-intensive a user’s needs are, the greater the price advantage of Wifi, and the greater their incentive to buy Wifi equipment. At current prices, a user can buy a Wifi access point for the cost of receiving a few PDF files over a 3G network, and the access point, once paid for, will allow for unlimited use at much higher speeds.
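
To make that comparison concrete, here is the arithmetic with stand-in numbers; the per-megabyte rate and the file size are assumptions for illustration, and only the roughly $60 access point figure comes from the prices above:

```python
# Assumed figures for illustration; only the ~$60 access point price is from the text.
price_per_mb_3g = 5.00    # metered 3G data, $/MB (assumed)
pdf_size_mb = 4           # a typical emailed PDF, in MB (assumed)
access_point = 60.00      # flat-rate Wifi hardware, one-time cost

cost_per_pdf_3g = price_per_mb_3g * pdf_size_mb
pdfs_to_break_even = access_point / cost_per_pdf_3g

print(f"Receiving one PDF over 3G: ${cost_per_pdf_3g:.2f}")
print(f"PDFs that pay for a Wifi access point: {pdfs_to_break_even:.0f}")
# Under these assumptions, three such files pay for the access point,
# and every file after that travels at no incremental cost.
```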

The Vicious Circle 

In airline terms, 3G is like the airphone, an expensive bet that users in transit, captive to their 3G provider, will be happy to pay a premium for data communications. Wifi is like the cell phone, only useful at either end of travel, but providing better connectivity at a fraction of the price. This matches the airphone’s vicious circle — the cheaper Wifi gets, both in real dollars and in comparison to 3G, the greater the displacement away from 3G, the longer it will take to pay back the hardware investment (and, in countries that auctioned 3G licenses, the stupefying purchase price), and the later the day the operators can lower their prices.

More worryingly for the operators, the hardware manufacturers are only now starting to toy with Wifi in mobile devices. While the picture phone is a huge success as a data capture device, the most common use is “Take picture. Show friends. Delete.” Only a fraction of the photos that are taken are sent over 3G now, and if the device manufacturers start making either digital cameras or picture phones with Wifi, the willingness to save a picture for free upload later will increase. 

Not all permanets end in total failure, of course. Unlike Iridium, 3G is seeing some use, and that use will grow. The displacement of use to cheaper means of connecting, however, means that 3G will not grow as fast as predicted, raising the risk of being too little used to be profitable.

Partial Results from Partial Implementation

In any given situation, the builders of permanet and nearlynet both intend to give the customers what they want, but since what customers want is good cheap service, it is usually impossible to get there right away. Permanet and nearlynet are alternate strategies for evolving over time.

The permanet strategy is to start with a service that is good but expensive, and to make it cheaper. The nearlynet strategy is to start with a service that is lousy but cheap, and to make it better. The permanet strategy assumes that quality is the key driver of a new service, and permanet has the advantage of being good at every iteration. Nearlynet assumes that cheapness is the essential characteristic, and that users will forgo quality for a sufficient break in price.

What the permanet people have going for them is that good vs. lousy is not a hard choice to make, and if things stayed that way, permanet would win every time. What they have going against them, however, is incentive. The operator of a cheap but lousy service has more incentive to improve quality than the operator of a good but expensive service does to cut prices. And incremental improvements to quality can produce disproportionate returns on investment when a cheap but lousy service becomes cheap but adequate. The good enough is the enemy of the good, giving an edge over time to systems that produce partial results when partially implemented. 

Permanet is as Permanet Does

The reason the nearlynet strategy is so effective is that the cost of coverage is often an exponential curve — as the coverage you want rises, the cost rises far faster. It’s easier to connect homes and offices than roads and streets, easier to connect cities than suburbs, suburbs than rural areas, and so forth. Thus permanet as a technological condition is tough to get to, since it involves biting off a whole problem at once. Permanet as a personal condition, however, is a different story. From the user’s point of view, a kind of permanet exists when they can get to the internet whenever they like.
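
One stylized way to see that curve (the 4% growth factor below is an arbitrary assumption, not deployment data) is to let each additional percentage point of coverage cost a bit more than the last:

```python
# Stylized model: each additional point of coverage costs more than the last
# (homes and offices < cities < suburbs < rural < everywhere).
# The 4% growth factor and unit costs are arbitrary assumptions.
def buildout_cost(coverage_pct, first_point_cost=1.0, growth=1.04):
    return sum(first_point_cost * growth ** p for p in range(coverage_pct))

for pct in (50, 80, 95, 100):
    print(f"{pct:3d}% coverage costs about {buildout_cost(pct):5.0f} units")
# Roughly: 50% costs ~153 units, 95% costs ~1013, and 100% costs ~1238,
# so the final 5% costs more than the first 50% -- which is why nearlynets
# stop well short of everywhere.
```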

For many people in the laptop tribe, permanet is almost a reality now, with home and office wired, and any hotel or conference they attend Wifi- or ethernet-enabled, at speeds that far outstrip 3G. And since these are the people who reliably adopt new technology first, their ability to send a spreadsheet or receive a web page faster and at no incremental cost erodes the early use the 3G operators imagined building their data services on. 

In fact, for many business people who are the logical customers for 3G data services, there is only one environment where there is significant long-term disconnection from the network: on an airplane. As with the airphone itself, the sky may be a connection-poor environment for some time to come, not because it isn’t possible to connect it, but because the environment on the plane isn’t nearly nearlynet enough, which is to say it is not amenable to inexpensive and partial solutions. The lesson of nearlynet is that connectivity is rarely an all or nothing proposition, much as would-be monopolists might like it to be. Instead, small improvements in connectivity can generally be accomplished at much less cost than large improvements, and so we continue growing towards permanet one nearlynet at a time.