ideas for developing a semantic web for the people and by the people


The principles of emergence and collective intelligence have amply proven, and with distinction, their capacity to bear interesting fruit on the Web (Wikipedia, del.icio.us, Digg, Flickr and many others). This may already have been suggested elsewhere (the idea is something you can almost feel in the air), but the approach seems to me perfectly suited to the semantic Web.

Although the idea of a semantic Web has been kicking around for some time, we have not yet managed to apply it in our everyday practices. Tim Berners-Lee popularized the term in 2001, but the idea had already been in the public sphere for a long time, for example in The Hitchhiker’s Guide to the Galaxy, by Douglas Adams. I personally believe that the semantic Web must be emergent and must be a collective enterprise if it is to make its way into our communication practices. Otherwise we would only be dealing with “services talking amongst themselves”.

It is becoming apparent that microsyntax does quite well on Twitter and other social media, and microsyntax is undeniably semantics. Many users are ready for it. What am I saying? Even setting aside useful little symbols such as slashtags and hashtags, all beings who communicate in good faith are using semantics, or trying to.

Now, it seems to me that the semantic Web could very well emerge from our very communication practices. Remember the fairy tale in which a genie comes out of a ring when you repeat his name three times? Here we’d be seeing a similar phenomenon…

Imagine that you wished to…let’s say…find a wrought-iron table for your garden, meet your soul mate, let potentially interested people know that you will be driving from Montreal to Quebec on Thursday and that you still have room for two people…

your personal communication dashboard

To begin with, in this age of intelligent telecommunications, I think it is a good idea to equip oneself with a sort of personal dashboard. It could be both the starting point and the ending point of everything. On that dashboard you could make, following the current example, separate entries for each of your wishes, each entry being formulated in at least two different ways (for example in different languages), and in more than two different ways if possible, in order to help the emergence phenomenon along. (Do you understand yet? No? Keep reading…)

To further help the system, we could reduce the number of possible formulations to a practical sub-set by assuming that each entry begins with “I wish to” (or its equivalent), and then continues with a verb, possibly a pronominal one.

The following is an example of what I mean:

  • acquire a garden table made of wrought-iron // have a wrought-iron table for my garden // obtenir une table en fer forgé pour mon jardin // get a wrought-iron table; it’s for my garden

Notes: “I wish to acquire” and “I wish to have” are not, strictly speaking, equivalent—but, communicationally speaking, they both point to a request.  The double slash obviously operates as a delimiter. In order to keep things simple, periods at the ends of sentences are dropped in favour of semicolons, which divide the wish formulations into several sentences. The process of emergence is also enhanced when the pieces of each given formulation are in one-to-one correspondence with the pieces of each of the other equivalent formulations. The pieces of a given formulation may be composed of non-contiguous parts of that formulation.
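To make the format concrete, here is a minimal sketch of how a dashboard might read one such entry. It only assumes what the notes above state: `//` separates equivalent formulations and `;` separates the sentences inside a formulation. The names `Wish` and `parse_wish` are hypothetical, invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Wish:
    """One dashboard entry: several formulations of the same wish."""
    formulations: list[list[str]]  # each formulation is a list of sentences


def parse_wish(entry: str) -> Wish:
    """Split an entry on '//' into formulations, then on ';' into sentences."""
    formulations = []
    for raw in entry.split("//"):
        sentences = [s.strip() for s in raw.split(";") if s.strip()]
        if sentences:  # ignore empty formulations such as a trailing '//'
            formulations.append(sentences)
    return Wish(formulations)


entry = (
    "acquire a garden table made of wrought-iron // "
    "have a wrought-iron table for my garden // "
    "obtenir une table en fer forgé pour mon jardin // "
    "get a wrought-iron table; it's for my garden"
)
wish = parse_wish(entry)
print(len(wish.formulations))  # 4
print(wish.formulations[3])    # ['get a wrought-iron table', "it's for my garden"]
```

The implicit “I wish to” prefix is left out of the stored text here; a real dashboard could just as well store it explicitly.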

One of the dashboard’s practical uses is that it can display all your wishes at a glance, so you can make sure they’re up to date. Thus, by adding and removing wishes, you can make your way with assurance… communicationally speaking, of course.

A widespread use of this simple format would permit, thanks to algorithms I like to call emergeware, the emergence of a semantic repertory made up of more and more of our formulations—and meanings thereof.

A good approach to universality might be a good approach to the universe.  There it is, hidden beneath the rivulets of our voices and our signals. It scintillates and glitters—can you see it?

an emergent repertory of semantics

Our algorithms, our emergeware, would sometimes have to be very ingenious to accurately guess which parts of the given formulations correspond to one another, but a whole bunch of such correspondences could be guessed at by algorithms that are not all that complex. We can gradually go from simple formulations to subtler ones. It would be a work in progress, constantly improving and changing. Nothing gained would ever be lost.

So, in so far as the formulations given as equivalent really are equivalent (and that can be verified by asking users to validate or invalidate whether this or that formulation really equates with what they meant), relatively simple algorithms could identify co-occurrences and infer from them what, in a given formulation, corresponds to what, in another one, somewhat as is done in the game Master Mind.
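As a rough illustration of what “relatively simple” could mean, the sketch below counts how often a word in one formulation appears alongside a word in the equivalent formulation, then scores each pair with the Dice coefficient. The Dice score is my choice for the example, not something the proposal prescribes, and the function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(pairs):
    """Score word pairs across formulations that users declared equivalent.

    pairs: list of (formulation_a, formulation_b) strings.
    Returns {(word_a, word_b): Dice score between 0 and 1}.
    """
    count_a, count_b, count_ab = Counter(), Counter(), Counter()
    for a, b in pairs:
        words_a, words_b = set(a.lower().split()), set(b.lower().split())
        count_a.update(words_a)
        count_b.update(words_b)
        count_ab.update(product(words_a, words_b))
    return {
        (wa, wb): 2 * n / (count_a[wa] + count_b[wb])
        for (wa, wb), n in count_ab.items()
    }


def best_match(word, scores):
    """Return the word on the other side that scores highest with `word`."""
    candidates = {wb: s for (wa, wb), s in scores.items() if wa == word}
    return max(candidates, key=candidates.get)


# Toy data: four pairs of formulations given as equivalent.
pairs = [
    ("have a table", "avoir une table"),
    ("have a garden", "avoir un jardin"),
    ("find a garden", "trouver un jardin"),
    ("find a dog", "trouver un chien"),
]
scores = cooccurrence_scores(pairs)
print(best_match("garden", scores))  # jardin
print(best_match("have", scores))    # avoir
```

With so little data some words stay ambiguous (here “table” scores equally with “une” and “table”), which is exactly where more formulations, or a user's validation, would settle the matter.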

Those elements of formulation identified as communicationally equivalent would arrange themselves into chains, the various nodes of which would eventually also be confirmed as equivalent. Those chains would thus crystallize into tight, highly interlinked little networks, each of them being like a constellation that is clearly distinct from all the other constellations. Each constellation would correspond to a concept, such as “sharing” or “sheep”. A vast catalogue of communicational elements could, in this way, emerge from our communications.
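One simple way to picture these constellations is as the connected groups of a graph whose links are confirmed equivalences. The sketch below uses a small union-find structure to collect them; this is an assumption made for illustration (a real system would probably demand denser interlinking than mere connectedness), and `constellations` is a hypothetical name.

```python
from collections import defaultdict


def constellations(equivalences):
    """Group elements linked, directly or through a chain, by equivalences.

    equivalences: iterable of (element_a, element_b) confirmed as equivalent.
    Returns a list of sets, one per constellation.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in equivalences:
        parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(set)
    for element in list(parent):
        groups[find(element)].add(element)
    return list(groups.values())


links = [("sheep", "mouton"), ("mouton", "oveja"), ("sharing", "partage")]
for group in constellations(links):
    print(sorted(group))
```

Here “sheep”, “mouton” and “oveja” end up in one constellation even though “sheep” and “oveja” were never linked directly, while “sharing” and “partage” form a separate one.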

Each concept would have a separate entry in this catalogue, as in a dictionary or an encyclopedia. For example, there would be the concepts of tree, fruit tree, apple tree, travel, sheep, etc. The catalogue would come to resemble a multilingual encyclopedia (and, in time, a multimedia one) that wouldn’t even need definitions (though definitions are also ways of expressing concepts…). Then, in the course of time, each concept would gather images, video clips, audio recordings and, of course, the many ways people have of saying, writing and otherwise expressing it.

Such a catalogue would make possible the automatic translation of a great many messages in a great many languages. Such an undertaking might also elicit new forms of writing, even new forms of language, indeed, new modes of communication altogether!

a collective affair

Oh, and one last detail is missing from this system. The use of free-tagging, or folksonomy, has made it clear that categorizing is a collective affair. The categorization of our elements of communication will depend on a very similar kind of volunteering. “Apple tree” will thus be said to belong to the category “fruit tree”; they will both be said to belong to the category “tree”, and so on. “Free generalizations”, as in the above example of “to acquire” and “to have”, would then be welcome.
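The “apple tree” example can be sketched as a small table of volunteered category links, from which broader categories follow by walking upward. The data structure and the name `ancestors` are assumptions made for this illustration only.

```python
def ancestors(concept, parents):
    """Return every category a concept belongs to, directly or indirectly.

    parents: {concept: [categories it was tagged as belonging to]}.
    """
    seen, stack = set(), [concept]
    while stack:
        for category in parents.get(stack.pop(), ()):
            if category not in seen:
                seen.add(category)
                stack.append(category)
    return seen


# Links as volunteers might have tagged them.
parents = {
    "apple tree": ["fruit tree"],
    "fruit tree": ["tree"],
}
print(sorted(ancestors("apple tree", parents)))  # ['fruit tree', 'tree']
```

Because each concept can list several categories, the same walk would also cover looser “free generalizations”, not only strict tree-shaped taxonomies.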

Just as an element of communication can categorize another, a category of elements of communication can be an adequate answer to another category of elements of communication. For example, the category of “supply” (offers, availability, talents, etc.) can be connected to that of “demands” (needs, wishes, conditions). This type of information, provided by volunteer semanticists, or free-ontologizers, would be the basis for the automation of innumerable relevant communications.

The categories most suited to these kinds of dialogue would be those such as demands, offers, know-how, talents, interests, contact hours, locations, itineraries, geographical ranges, favourite meeting places, along with many other types of information.
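A minimal sketch of such a dialogue between categories, assuming each dashboard entry has already been reduced to a concept and tagged as a demand or a supply. The `Entry` class, the `ANSWERS` table and the sample users are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    user: str
    kind: str      # "demand" or "supply"
    concept: str   # the concept the formulation was resolved to


# Which category is an adequate answer to which (volunteered information).
ANSWERS = {"demand": "supply", "supply": "demand"}


def matches(entry, others):
    """Return other users' entries that answer the given entry."""
    wanted = ANSWERS[entry.kind]
    return [
        o for o in others
        if o.kind == wanted and o.concept == entry.concept and o.user != entry.user
    ]


entries = [
    Entry("fred", "demand", "wrought-iron table"),
    Entry("lea", "supply", "wrought-iron table"),
    Entry("sam", "supply", "ride Montreal-Quebec"),
    Entry("zoe", "demand", "ride Montreal-Quebec"),
]
print([m.user for m in matches(entries[0], entries)])  # ['lea']
```

Matching on an exact concept is the simplest case; locations, itineraries or contact hours would need their own notion of “close enough”.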

openness, decentralization, universality

As long as the information involved is open (as in “open source”), the creation of such an emergent semantic Web need not deal with the thorny issue of privacy. There is already so much that can be said in the open (not to mention what ought to be said in the open). Later on, privacy options could be arrived at by consensus, and therefore also by emergence. However, if privacy options have been well thought out and implemented from the start, why not go ahead with them? The Diaspora Project, currently being developed and due to be released in September, 2010, seems promising to me. One way or another, a universal communication system will bring about an appreciable change in the state of openness of this world.

Information would thus be decentralized and present everywhere, disseminated on our dashboards, whilst the semantic catalogue would emerge from the harvesting of this information and from our free ontologization of that harvest.

The search engines could also be decentralized. Each dashboard could exchange information with neighbouring dashboards within a certain range. These search engines would fetch information from an emergent catalogue of semantics like the one described here (as well as from other dictionaries and ontologies) in order to determine which elements are an adequate answer to which other ones. The results, that is, other users’ relevant entries, would also appear on your dashboard, of course.
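As one possible reading of “within a certain range”, the sketch below treats range as a number of hops between neighbouring dashboards and walks outward breadth-first. Both that interpretation and the names used are assumptions; range could equally be geographical.

```python
from collections import deque


def dashboards_in_range(start, neighbours, max_hops):
    """List the dashboards reachable from `start` in at most `max_hops` hops.

    neighbours: {dashboard: [dashboards it exchanges information with]}.
    """
    seen = {start}
    queue = deque([(start, 0)])
    reached = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not look beyond the allowed range
        for other in neighbours.get(node, ()):
            if other not in seen:
                seen.add(other)
                reached.append(other)
                queue.append((other, hops + 1))
    return reached


neighbours = {
    "ann": ["bob"],
    "bob": ["ann", "cy"],
    "cy": ["bob", "dee"],
    "dee": ["cy"],
}
print(dashboards_in_range("ann", neighbours, 2))  # ['bob', 'cy']
```

Each dashboard reached this way could then be asked for its entries, with the semantic catalogue deciding which of them count as relevant answers.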

This emergent communicational catalogue—of semantics, of elements…? I’m not sure what to call it, but we sure will have to find a name for it, because this catalogue may one day be the most universal thing there is, second only to the universe itself.

building our world more fluidly

Such a tool, built by the people and for the people (brick by brick, so to speak), a tool we would understand and that would understand us, a tool that truly gives priority to our well-being and to the realization of our dreams, such a tool could make a huge difference in our world. Its mere existence would give rise to greater opportunities for living and creating, in radically emergent and fluid ways.

Our social structures could then be made in our very image: a better, more logical, more pluralistic reflection of us. They could sure be less cumbersome! They could come into being by free association through the convergence of our wishes… and disappear when no longer wished for.

Am I advocating an all-out individualism? No, for although I value individuals over cultures, I believe that solidarity should be encouraged as much as possible. (As a matter of fact, we will have to consider ways of helping people with communication disabilities to formulate their wishes.) Our desire to give to the community is all the greater when we feel we have a part to play in it, when we are free to navigate within it, and also to the extent that we are able to have a real influence on it. And because true solidarity is always a voluntary affair, we need to begin by building our world according to everyone’s wishes, and not just some pre-established order. I shall continue to develop my thoughts on this matter in a follow-up article, dealing this time with a model for a wholesome economic system.

Our problems would not all magically go away with a people’s semantic Web. However, the magic of communication would at last be at the people’s command and could, or so I wish, bring about such clarity and fluidity that it would become the instrument of choice for participating in the making of our world.


5 comments so far

  1. fredofromstart on

    I would appreciate reading your criticisms, comments and questions, set down in black and white, so that I can answer them and, eventually, improve my proposal.

  2. spencerhold on

    hey fred, thanks, I think your vision is terrific, and inevitable.
    I specifically like the idea of using web identifiers for unambiguous communication. I say ‘i think the tuba concert was awesome’ and
    a) it knows of the concert, and
    b) logs and understands my evaluation,
    c) responds in a intelligent way (finds agreements/disagreements, or whatever)

    I don’t wanna cheerlead too much for freebase, as it is a for-profit company, but it’s got a) pretty well covered. I’m working on a bookmarklet that lets people add a topic to freebase quickly, so if you’re on the website for Montreal Festival de Jazz 2010, you can tell freebase its a music concert, an event in July in montreal etc… one click.

    natural language processing is tough but not impossible, and all the fancy nlp demos that have been made since we were toddlers are perfectly able to handle a regular query like ‘i want an iron table’. wordnet has been connected to dbpedia, and is (as we speak) being connected to freebase http://wordhunger.freebaseapps.com/
    ya, and hashtags show that atleast savy twitter-people are willing to use semi-natural language, in order to be understood programmatically. personally i don’t see how useful this sort of user markup is unless theres a two-way between user and computer, i really like this style of nl system – http://cubed.freebaseapps.com
    where its like – ‘so you mean this right?’ in a not-too-troublesome way.

    as for a dashboard, a general semantic internet interaction system, writing software for it will be super fun. i hope the internet gets its act together soon, so i can finally find a non-snoring, large-breasted french girl that laughs at my jokes
    cheers

  3. Felix Pleșoianu on

    I’m surprised you didn’t mention microformats (or RDFa, since you seem to prefer a more formal approach). They are already in widespread use, are inherently distributed, lightweight, open… pretty much all the qualities you want. There is even one for the specific use case in your example: http://microformats.org/wiki/hListing — how about that?

  4. fredofromstart on

    Thanks to Sébastien Paquet, who mentioned my article in his Blog http://emergentcities.sebpaquet.net/ (see : Blueprints for Networked Cocreation: 1. Intentcasting). It is a very interesting series on the emergence of new ways of doing things together. Big things, like enterprises, or societies. I find it very valuable, especially in these times of all-out agonizing conflagration…

    I recommend reading it from the bottom up, in the order it has been written.

