ideas for developing a semantic web for the people and by the people
The principles of emergence and collective intelligence have amply proven, and even with distinction, their capacity to bear interesting fruit on the Web (Wikipedia, del.icio.us, Digg, Flickr and many others). This may already have been suggested elsewhere, the idea being something you can almost feel in the air, but this approach seems to me to be perfectly suited to the semantic Web.
Although the idea of a semantic web has been kicking around for some time, we have not yet managed to apply it in our everyday practices. Tim Berners-Lee popularized the term in 2001, but the idea had already been in the public sphere for a long time, for example in The Hitchhiker’s Guide to the Galaxy, by Douglas Adams. I personally believe that the semantic web must be emergent and must be a collective enterprise if it is to make its way into our communication practices. Otherwise we would only be dealing with “services talking amongst themselves”.
It is becoming apparent that microsyntax does quite well on Twitter and other social media. Microsyntax is undeniably semantics. Many users are ripe for it. What am I saying? Even without referring to the use of useful little symbols such as slashtags and hashtags on social media, all beings who communicate in good faith are using semantics, or trying to.
Now, it seems to me that the semantic Web could very well emerge from our very communication practices. Remember the fairy tale in which a genie comes out of a ring when you repeat his name three times? Here we’d be seeing a similar phenomenon…
Imagine that you wished to…let’s say…find a wrought-iron table for your garden, meet your soul mate, let potentially interested people know that you will be driving from Montreal to Quebec on Thursday and that you still have room for two people…
your personal communication dashboard
To begin with, in this age of intelligent telecommunications, I think it is a good idea to equip oneself with a sort of personal dashboard. It could be both the starting point and the ending point of everything. On that dashboard you could make, following the current example, separate entries for each of your wishes, each entry being formulated in at least two different ways (for example in different languages), and in more than two different ways if possible, in order to help the emergence phenomenon along. (Do you understand yet? No? Keep reading…)
To further help the system, we could reduce the number of possible formulations to a practical sub-set by assuming that each entry begins with “I wish to” (or its equivalent), and then continues with a verb, possibly a pronominal one.
The following is an example of what I mean:
- acquire a garden table made of wrought-iron // have a wrought-iron table for my garden // obtenir une table en fer forgé pour mon jardin // get a wrought-iron table; it’s for my garden
Notes: “I wish to acquire” and “I wish to have” are not, strictly speaking, equivalent—but, communicationally speaking, they both point to a request. The double slash obviously operates as a delimiter. In order to keep things simple, periods at the ends of sentences are dropped in favour of semicolons, which divide the wish formulations into several sentences. The process of emergence is also enhanced when the pieces of each given formulation are in one-to-one correspondence with the pieces of each of the other equivalent formulations. The pieces of a given formulation may be composed of non-contiguous parts of that formulation.
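To make the format concrete, here is a minimal sketch in Python of how a dashboard might read one such entry. It assumes only what the notes above state: the double slash separates formulations and semicolons separate sentences. The function name parse_wish and the handling of the leading dash are my own assumptions for illustration, not part of any existing tool.

```python
def parse_wish(entry):
    """Split one dashboard entry into formulations, then into sentences."""
    text = entry.strip()
    # Drop the leading list dash, if the entry was copied from a list.
    if text.startswith("-"):
        text = text[1:].strip()
    formulations = []
    # The double slash delimits equivalent formulations of the same wish.
    for raw in text.split("//"):
        # Semicolons stand in for periods and divide a formulation into sentences.
        sentences = [s.strip() for s in raw.split(";") if s.strip()]
        if sentences:
            formulations.append(sentences)
    return formulations


entry = ("- acquire a garden table made of wrought-iron // "
         "have a wrought-iron table for my garden // "
         "obtenir une table en fer forgé pour mon jardin // "
         "get a wrought-iron table; it's for my garden")
wish = parse_wish(entry)
```

Each formulation comes back as a list of sentences, so the last one above yields two pieces while the others yield one. The implicit "I wish to" prefix is left out, as in the example entry.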
One of the dashboard’s practical uses is that it can display all your wishes at a glance, so you can make sure they’re up to date. Thus, by adding and removing wishes, you can make your way with assurance… communicationally speaking, of course.
A widespread use of this simple format would permit, thanks to algorithms I like to call emergeware, the emergence of a semantic repertory made up of more and more of our formulations—and meanings thereof.
A good approach to universality might be a good approach to the universe. There it is, hidden beneath the rivulets of our voices and our signals. It scintillates and glitters—can you see it?
an emergent repertory of semantics
Our algorithms, our emergeware, would sometimes have to be very ingenious to accurately guess which parts of the given formulations correspond to one another, but a whole bunch of such correspondences could be guessed at by algorithms that are not all that complex. We can gradually go from simple formulations to subtler ones. It would be a work in progress, constantly improving and changing. Nothing gained would ever be lost.
So, insofar as the formulations given as equivalent really are equivalent (which can be verified by asking users to confirm whether a given formulation really matches what they meant), relatively simple algorithms could identify co-occurrences and infer from them which part of one formulation corresponds to which part of another, somewhat as is done in the game Mastermind.
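One of the "not all that complex" algorithms might look like the following sketch. It counts how often a word from one formulation appears alongside a word from its declared equivalent, then scores each candidate pairing with a Dice coefficient (twice the shared count divided by the sum of the two word frequencies). The choice of Dice, the word-level splitting and the toy data are assumptions made for illustration; real emergeware would need to handle multi-word and non-contiguous pieces.

```python
from collections import Counter
from itertools import product


def guess_correspondences(pairs):
    """Guess, for each left-hand word, its most likely right-hand counterpart."""
    left_freq, right_freq, together = Counter(), Counter(), Counter()
    for left, right in pairs:
        left_words, right_words = set(left.split()), set(right.split())
        left_freq.update(left_words)
        right_freq.update(right_words)
        # Every left word co-occurs once with every right word of the same pair.
        together.update(product(left_words, right_words))
    best = {}
    for (lw, rw), n in together.items():
        # Dice coefficient: 1.0 when the two words always appear together.
        score = 2 * n / (left_freq[lw] + right_freq[rw])
        if lw not in best or score > best[lw][1]:
            best[lw] = (rw, score)
    return {lw: rw for lw, (rw, _) in best.items()}


# Hypothetical user-validated equivalent formulations (English, French).
pairs = [
    ("acquire a chair", "obtenir une chaise"),
    ("acquire a car", "obtenir une voiture"),
    ("sell a chair", "vendre une chaise"),
    ("sell a car", "vendre une voiture"),
]
guesses = guess_correspondences(pairs)
```

With only four pairs the guesses are already unambiguous, because each word varies independently of the others, much as each Mastermind guess narrows down the hidden code.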
Those elements of formulation identified as communicationally equivalent would arrange themselves into chains, the various nodes of which would eventually also be confirmed as equivalent. Those chains would thus crystallize into tight, highly interlinked little networks, each of them being like a constellation that is clearly distinct from all the other constellations. Each constellation would correspond to a concept, such as “sharing” or “sheep”. A vast catalogue of communicational elements could, in this way, emerge from our communications.
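As a rough sketch of how chains could settle into constellations, the code below treats each confirmed equivalence as a link and groups everything reachable through links into one set, using a small union-find structure. This is a simplification: it only requires that elements be connected, not "highly interlinked", and the function name and sample links are hypothetical.

```python
def constellations(links):
    """Group elements into sets connected by confirmed equivalence links."""
    parent = {}

    def find(x):
        # Follow parent pointers to the representative, compressing the path.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two chains

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


# Hypothetical confirmed equivalences between elements of formulation.
links = [("sheep", "mouton"), ("mouton", "oveja"), ("sharing", "partage")]
concepts = constellations(links)
```

Here "sheep" and "oveja" end up in the same constellation even though nobody linked them directly, which is the chaining effect described above.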
Each concept would have a separate entry in this catalogue, as in a dictionary or in an encyclopedia. For example, there would be the concept of tree, of fruit tree, of apple tree, of travel, of sheep, etc. The catalogue would eventually resemble a multilingual encyclopedia (eventually a multimedia one) that wouldn’t even need definitions (though definitions are also ways of expressing concepts…). Then, in the course of time, for each concept there would be images, video clips, audio recordings, and of course the many ways of saying things, writing things, as well as all the ways of expressing oneself that people use.
Such a catalogue would make possible the automatic translation of a great many messages in a great many languages. Such an undertaking might also elicit new forms of writing, even new forms of language, indeed, new modes of communication altogether!
a collective affair
Oh, and one last detail is missing from this system. The use of free-tagging, or folksonomy, has made it clear that categorizing is a collective affair. The categorization of our elements of communication will depend on a pretty similar kind of volunteering. “Apple tree” will thus be said to belong to the category “fruit tree”; they will both be said to belong to the category “tree”, and so on. “Free generalizations”, as in the above example of “to acquire” and “to have”, would then be timely.
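Volunteer categorization of this kind could be stored very simply. The sketch below assumes a plain mapping from each element to the categories people have assigned it, and walks that mapping upward to collect every category an element belongs to, directly or indirectly. The names belongs_to and categories_of are illustrative assumptions.

```python
def categories_of(concept, belongs_to):
    """Return every category reachable from a concept, directly or not."""
    seen, stack = set(), [concept]
    while stack:
        for parent in belongs_to.get(stack.pop(), ()):
            if parent not in seen:  # also guards against accidental cycles
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical free-tagging data: element -> categories assigned by volunteers.
belongs_to = {
    "apple tree": {"fruit tree"},
    "fruit tree": {"tree"},
}
apple_categories = categories_of("apple tree", belongs_to)
```

Because the walk is transitive, tagging "apple tree" as a "fruit tree" is enough for it to count as a "tree" too, without anyone having to say so explicitly.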
Just as an element of communication can categorize another, a category of elements of communication can be an adequate answer to another category of elements of communication. For example, the category of “supply” (offers, availability, talents, etc.) can be connected to that of “demands” (needs, wishes, conditions). This type of information, provided by volunteer semanticians, or free-ontologizers, would be the basis for the automation of innumerable relevant communications.
The categories most suited to these kinds of dialogue would be those such as demands, offers, know-how, talents, interests, contact hours, locations, itineraries, geographical ranges, favourite meeting places, along with many other types of information.
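To illustrate how one category can answer another, here is a minimal sketch in which each dashboard entry carries a category and a concept, and a small table says which category answers which. The Entry fields, the ANSWERS table and the sample data are all assumptions made for this example; in the scheme described here that table would come from the free-ontologizers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    user: str
    category: str  # e.g. "demand" or "supply"
    concept: str   # the concept the entry is about


# Hypothetical volunteer-provided knowledge: which category answers which.
ANSWERS = {"demand": "supply", "supply": "demand"}


def find_answers(mine, others):
    """Return other users' entries that adequately answer the given entry."""
    wanted = ANSWERS.get(mine.category)
    return [
        e for e in others
        if e.category == wanted and e.concept == mine.concept and e.user != mine.user
    ]


my_wish = Entry("ana", "demand", "wrought-iron table")
others = [
    Entry("ben", "supply", "wrought-iron table"),
    Entry("chloe", "supply", "ride Montreal-Quebec"),
    Entry("dev", "demand", "wrought-iron table"),
]
matches = find_answers(my_wish, others)
```

Only the supply entry about the same concept comes back. A second demand for the same table is not an answer, although it could interest a seller searching from the other side.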
openness, decentralization, universality
As long as the information involved is open (as in “open source”), the creation of such an emergent semantic Web need not deal with the thorny issue of privacy. There is already so much that can be said in the open (not to mention what ought to be said in the open). Later on, privacy options could be arrived at by consensus, and therefore also by emergence. However, if privacy options have been well thought out and implemented from the start, why not go ahead with them? The Diaspora Project, currently being developed and due to be released in September, 2010, seems promising to me. One way or another, a universal communication system will bring about an appreciable change in the state of openness of this world.
Information would thus be decentralized and present everywhere, disseminated on our dashboards, whilst the semantic catalogue would emerge from the harvesting of this information and from our free ontologization of that harvest.
The search engines could also be decentralized. Each dashboard could exchange information with neighbouring dashboards within a certain range. These search engines would fetch information from an emergent catalogue of semantics like the one described here (as well as from other dictionaries and ontologies) in order to determine which elements are an adequate answer to which other ones. The results, that is, other users’ relevant entries, would also appear on your dashboard, of course.
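A decentralized search of this sort can be sketched as a breadth-first walk over neighbouring dashboards, stopping after a fixed number of hops. Everything here is an assumption for illustration: "range" is interpreted as hop count, the network is a plain in-memory dictionary rather than real peers, and the matches function stands in for a lookup in the semantic catalogue.

```python
from collections import deque


def search(start, neighbours, entries, matches, max_hops=2):
    """Collect matching entries from dashboards within max_hops of start."""
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    while queue:
        node, hops = queue.popleft()
        if node != start:
            results.extend((node, e) for e in entries.get(node, []) if matches(e))
        if hops < max_hops:  # do not expand beyond the allowed range
            for nxt in neighbours.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return results


# Hypothetical network: each dashboard knows only its direct neighbours.
neighbours = {
    "ana": ["ben"],
    "ben": ["ana", "chloe"],
    "chloe": ["ben", "dev"],
    "dev": ["chloe"],
}
entries = {
    "ben": ["offer: wrought-iron table"],
    "chloe": ["offer: ride to Quebec"],
    "dev": ["offer: wrought-iron table"],
}
found = search("ana", neighbours, entries, lambda e: "table" in e, max_hops=2)
```

With a range of two hops, only the nearer offer is found. Widening the range to three hops would reach the second one as well, at the cost of contacting more dashboards.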
This emergent communicational catalogue—of semantics, of elements…? I’m not sure what to call it, but we sure will have to find a name for it, because this catalogue may one day be the most universal thing there is, second only to the universe itself.
building our world more fluidly
Such a tool, built by the people and for the people (brick by brick, so to speak), a tool we would understand and that would understand us, a tool that truly gives priority to our well-being and to the realization of our dreams, could make a huge difference in our world. Its mere existence would give rise to greater opportunities for living and creating, in radically emergent and fluid ways.
Our social structures could then be in our very image, a better, more logical, more pluralistic reflection of us. They could certainly be less cumbersome! They could come into being by free association through the convergence of our wishes… and disappear when no longer wished for.
Am I advocating an all-out individualism? No, for although I value individuals over cultures, I believe that solidarity should be encouraged as much as possible. (As a matter of fact, we will have to consider ways of helping people with communication disabilities to formulate their wishes.) Our desire to give to the community is all the greater when we feel we have a part to play in it, when we are free to navigate within it, and when we are able to have a real influence on it. And because true solidarity is always a voluntary affair, we need to begin by building our world according to everyone’s wishes, and not just according to some pre-established order. I shall continue to develop my thoughts on this matter in a follow-up article, dealing this time with a model for a wholesome economic system.
Our problems would not all magically go away with a people’s semantic Web. However, the magic of communication would at last be at the people’s command and could—or so I wish—bring about such clearness and fluidity that it would become the instrument of choice for participating in the making of our world.