Q&A: Chris Messina on microformats and the semantic and social webs


      Chris Messina advocates for an open Web. He works on the DiSo Project, is on the board of the OpenID Foundation, and helped create BarCamp. Last month, he joined the advisory board of Microsyntax.org.

      On Friday (June 12), Messina will deliver a presentation about openness on the Social Web at Open Web Vancouver 2009. The two-day conference at the Vancouver Convention Centre begins on Thursday (June 11).

      In a telephone interview from his San Francisco home, Messina spoke to the Georgia Straight about microformats, microsyntax, open government, the Semantic Web, and the open Social Web.

      What was your role in the creation of microformats?

      Well, I was basically a community member from very, very early on. I was involved in starting this event called BarCamp. One of the cofounders of the event, Tantek Çelik, was one of the early proponents of microformats. I worked with him sort of on the community and stuff like that. So, he kind of got me into microformats early on because I had been working with a lot of the Mozilla people, working on Firefox, and then helped to start this browser project called Flock.

      To me, it seemed like Flock and microformats were like the perfect opportunity to merge very-easy-to-implement standards with sort of a browser that would actually be able to understand and make use of those things for social purposes. So, the original idea for me with microformats was to make it possible to both publish and consume microformats with Flock. So, Flock was going to be one of the first sort of read-write browsers that actually had the concept of publishing content directly built into the interface, and it would automate and make it very easy for you to publish microformatted content. But I sort of left before that vision was ever realized.

      What do you think of RDFa?

      In general, I’ve been historically somewhat skeptical of the Semantic Web, both little-S and big-S, largely because I just haven’t seen a lot of real-world applications developed using the technology. There’s stuff that happens kind of in the medical realm and there’s very specific applications in places like that. But it just feels like you have to do a whole lot more work to get the same level of benefits that you’d get out of just using, you know, microformats on the one hand.

      The other thing is that, from a Web developer perspective, I think that any time you add complexity that favours computers to a publishing format, you’re liable to really complicate things and impede adoption. We’ve seen that it’s taken four years for microformats to actually get off the ground and get adoption with Yahoo and Google. I can only imagine how much longer it’ll take really to get some of the more esoteric RDFa formats to really see widespread adoption.

      That said, I’m very interested in general in ways of adding semantics to data on the Web. So, I’m not sort of religious in the sense that, you know, it’s microformats or nothing. But, from a design perspective, which is actually my background, I’m more interested in figuring out ways to make formats easier to publish and to, ideally, consume without a lot of new learning that needs to happen.

      So, to you, microformats aren’t necessarily part of the Semantic Web?

      Oh, they are. I mean, they definitely add semantics to the Web. But they’re not part of the Tim Berners-Lee-glorified view of kind of linked data, where everything is a tuple, which sounds really nice from an academic perspective. But in the wild, I think, things are just a lot more brittle than that. I’ve just not seen a lot of people really excited to work with them, except to play with formats but not build real applications.

      Do you think that World Wide Web Consortium vision of a Web of linked data will ever happen?

      I think there is definitely room for something like a Web of linked data. But we kind of already have it. The Web of linked data that Tim Berners-Lee talks about might make more sense from a backend or single-service-provider perspective, where you want to do stuff on the server side. But, from like the public Web, I think that it’s just, you know, as it is, it’s really hard to get people to publish well-formed HTML. I mean, the kind of stuff that browsers deal with is disgusting. So, if you actually think that people are going to publish well-formed RDF, I think that you’re kind of naïve to the way that stuff actually gets out there.

      So, it’s even a big sort of wait-and-see with regards to microformats today. But at least we can build it into the publishing tools, make it easier in that way. I guess that is one of the areas where RDFa can make some sense—if Drupal adds support for it and stuff like that. But it’s really just data, and adding the semantics really shouldn’t be that much of a big problem. So, to your question about linked data and the Semantic Web Tim Berners-Lee envisions, we have approximations of it, and it’s not nearly as flexible as the vision that he has for it. But the problem is that we’re still looking for use cases.

      So, I actually spent some time with Tim at a Foo Camp in Boston a couple months ago. I had a session on what are called activity streams. So, this is a format that I’m sort of working on that is an extension to the Atom feed format. The idea is to express actor, verb, and object, which map really well to the RDF triple idea. We could express this kind of information in RDF, but the problem is that no one is actually publishing feeds from Web services and social networks in RDF. So, not only would people have to figure out and wrap their heads around RDF, but now they’d have to use our ontology, which would involve a whole lot more investment that people just aren’t willing to spend right now. So, anyways, we went with sort of a known quantity, as opposed to going the route of RDF.

      After my session, he came up to me and said, “Oh, you know, this is really interesting. But you’re doing it all wrong. You should really be doing this in RDF, and here’s why.” I’m like, “Well, no one’s ever built a social network with RDF.” He was like, “Oh, let me show you. Let me show you. I hacked this together.” So, he opens up his laptop and basically pulls up this Web page, I guess, of his homepage with all these different links with FOAF on the backend and all these different link structures. He’s showing me how he’s clicking through from one person’s page to another, and he’s got a list of his friends and he’s got this and he’s got that.

      It occurred to me that what he was showing me was very similar to Facebook but kind of like from a 1993 perspective, you know when the blink tag was, like, hot. I’m looking at it, and I’m like, “This is really great that you can do this stuff. But you haven’t actually provided me with a more compelling user experience that’s going to help me sell this stuff to anybody but you.”

      So, with these types of technologies and solutions, you’ve really got to start with, I think, the user experience and then work backward and develop technologies from those really, really good, compelling experiences. RDF was developed with the technologies and the formats in mind first—so as the first order of priority—and as a result the technology’s just going to have a really, really hard time getting adoption outside of academic circles.
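
      As an editorial aside, the actor, verb, and object structure Messina describes for activity streams, and his remark that it maps well to the RDF triple idea, can be illustrated with a minimal sketch. This is not the Activity Streams format itself (which he describes as an extension to Atom); the class name, function name, and all example.com URIs below are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Activity(NamedTuple):
    """One activity: who (actor) did what (verb) to which thing (obj)."""
    actor: str
    verb: str
    obj: str


def to_ntriples(activity: Activity) -> str:
    """Render the activity as one subject-predicate-object line.

    The output uses the N-Triples convention of IRIs in angle
    brackets followed by a terminating dot.
    """
    return f"<{activity.actor}> <{activity.verb}> <{activity.obj}> ."


# Hypothetical identifiers, not taken from any real service.
activity = Activity(
    actor="http://example.com/people/alice",
    verb="http://example.com/verbs/post",
    obj="http://example.com/photos/1",
)
line = to_ntriples(activity)
```

      The point of the sketch is only that the two shapes line up one-to-one; the adoption cost Messina raises (learning RDF and agreeing on an ontology) is not something code like this addresses.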

      In a nutshell, what does an open Social Web look like to you?

      I think it really comes down to, on the one hand, sort of getting to the point where real identity is the way we refer to and interact with people on the Web. I think the other way that we begin with it is where having a social network or operating a social network becomes somewhat invisible. Charlene Li has said, you know, social networks are going to become like air and they’ll be everywhere, and I agree with that. But that doesn’t really provide us with a clear picture of what that means.

      So, the way that I think of it is we will have some sort of identity or way of identifying ourselves on the Web that we can hand out to people, which I think would be an OpenID. Whether that’s an e-mail address or a phone number, it doesn’t matter to me—the representation of the OpenID. It’s just that this is your identity. There are a couple things that will happen from that. First, I think that as you interact with things in the world, if you want to syndicate the activity that you’re participating in back to your identity hub, which will then syndicate it out to your friends so your friends know what you’re doing, that should become more and more possible.

      We see this a little bit today with Facebook. Facebook is very awkward in the way that they’ve sort of implemented it, but it’s kind of getting there. When I leave a comment on various blogs or I Digg something on Digg, it’ll ask me, “Hey, do you want to post this back to your Facebook profile?” For the most part, I don’t. But the difference that I think hopefully will happen is that I will become the central hub, the central nexus, and authority for information about me and about the activities that I’m participating in, and that I will be able to manage my relationships and who I perform what activities to. So, Facebook is actually starting to move in this direction. I think Facebook is doing a lot to really innovate in this space—allowing you to have different friend lists and things like that. But we haven’t yet really jumped off of the computer into the real world.

      One application that’s doing this in a way that I think is actually pretty neat is something called comiXology. I’m a big comic-book nerd, and I’m into comics and stuff like that, and I have been for a while. I found this application actually through the comic-book store that I go to in Hayes Valley. What’s really interesting about it is that they have an iPhone application, they have a Web site—great, awesome—but what’s really clever is that they allow me, as a comic-book buyer, to connect my account, my on-line digital account, with a physical retail store. So, when I add comics to the list of ones that I want to buy, which is called the pull list, or when I subscribe to an ongoing series of comics, the store essentially gets notified of that and they’ll set aside the books that I’ve requested.

      This is a type of connection with the real world that I think social networks have yet to really express that I think could actually have a great deal of promise. So, we think about Facebook Connect today and connecting different Web sites and so on. But what I’m interested in, I think, in the next step, the sort of open Social Web is one where we have these connected devices and we’re actually connecting them to the real world and bringing some of the real world back to the Web and having the Web actually become more distributed into reality. I think location also is going to play a big role in that.

      What role would microformats play in this?

      I think microformats are a way of bootstrapping the Semantic Web. What I’d like to see happen, and what’s more important, at least to me, is that we have good, expressive schemas for representing data. So, one of the things that I think is very valuable about microformats is that, rather than inventing their own ontologies and schemas, they just went back in time and looked at formats that are already widely deployed and just reused those. One of the other real problems that I’ve seen with a lot of Semantic Web folks is that they love to invent their own ontologies and say, “Oh, we could just map it.” But, every time you say that, there’s extra work that’s involved, with increasingly minimal value being delivered for doing that kind of mapping.

      Anyways, what I like about microformats is let’s use what’s already out there, let’s reuse that, and let’s take advantage of that, and let’s not reinvent things even if we think we can do a better job. So, that’s one part of it. The other part of it is it allows for anybody who has a Web page to instantly have an API, without having to know anything about JSON or about XML formats or things like that. It means that, if you’ve got an events Web site, you can just mark it up in a much more easy way. There are a number of different providers out there—Google and Yahoo today, but I think browsers will get in the game, and we’ve seen this with IE8 supporting hSlice—that will be able to detect this information.

      Even if it doesn’t provide a human-friendly interface on top of the actual microformat data, the way that I was thinking about microformats being consumed by Flock was that, as you just browse the Web, you would see data, and the browser in the background would sort of be like, “Oh, there’s an event. Oh, there’s a person. Oh, there’s a review.” And like that. And sometime later, you would go and you would do a search, and you’d get sort of a Spotlight-style experience where it would actually show you just the reviews or just the places. You know, it’d start to learn about the types of content that are available on the Web and would break those down in terms of your search results. So, it’d be a more passive way of consuming and interacting with microformats. I think that type of stuff becomes a lot more valuable over time.
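
      To make the "any Web page instantly has an API" idea from this answer concrete, here is a rough sketch of a consumer picking an event out of ordinary HTML. It assumes hCalendar-style class names (vevent, summary, dtstart, location) and handles only the simplest cases, including a title attribute on an abbr element carrying the machine-readable date. It is not a conforming microformats parser, the class HCalendarSketch is a made-up name, and the sample markup was written for this example using the conference details from the introduction.

```python
from html.parser import HTMLParser

PROPERTIES = ("summary", "dtstart", "location")
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # never closed


class HCalendarSketch(HTMLParser):
    """Collect a few hCalendar-style properties from well-nested HTML."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._open = []       # per open element: "vevent", a property, or None
        self._current = None  # the event dict being filled, if any

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        label = None
        if "vevent" in classes and self._current is None:
            self._current = {}
            label = "vevent"
        elif self._current is not None:
            for name in PROPERTIES:
                if name in classes:
                    if tag == "abbr" and attrs.get("title"):
                        # Take the title value; label stays None so the
                        # human-readable text inside is not captured.
                        self._current[name] = attrs["title"]
                    else:
                        label = name
                    break
        self._open.append(label)

    def handle_data(self, data):
        if self._current is None:
            return
        # Attach text to the nearest enclosing property element, if any.
        for label in reversed(self._open):
            if label == "vevent":
                return
            if label is not None:
                self._current[label] = self._current.get(label, "") + data
                return

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        if self._open.pop() == "vevent":
            # Normalize whitespace and store the finished event.
            self.events.append(
                {k: " ".join(v.split()) for k, v in self._current.items()}
            )
            self._current = None


PAGE = """
<div class="vevent">
  <span class="summary">Open Web Vancouver 2009</span>,
  <abbr class="dtstart" title="2009-06-11">June 11</abbr> at the
  <span class="location">Vancouver Convention Centre</span>
</div>
"""

parser = HCalendarSketch()
parser.feed(PAGE)
events = parser.events
```

      The sketch assumes every non-void element is properly closed, which, as Messina notes elsewhere in the interview, real pages often fail to do; a production consumer would need far more tolerant parsing.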

      On May 21, the City of Vancouver decided to support open-source, open-standards, open-data principles. Any thoughts on how the city should make data available on the Web?

      Well, first of all, I think that’s really great. I’d love to know what those principles are that are guiding their approach. One of the things again from my talk today is what open actually means. An illustrative example: at this event I was at, this Foo Camp in Boston, there were some folks there from the U.S. government. I had a session on what open means, and this woman said, “You know, this is really great that we’re having this discussion and you’re bringing this issue up, but unless we come up with some sort of testable definition this is useless.” Because she has basically fought in government for 10 years to get this statement into their procurement document that says, “You will first look for open-source software to solve this problem using non-proprietary formats before you go and buy some proprietary system.” But, of course, the people who are buying the stuff in the procurement offices have no idea what open means, and so they don’t have a guideline that says what it is, and so they just gloss over that and just go buy the Microsoft thing.

      So, it’s not enough to say we need to have open source in government, especially if the data are in formats that are not portable. We need to think about it from a more critical perspective and a more pragmatic perspective to say, well, what we really want to ensure is that there is freedom and competition in the marketplace. So if, two or three years down the road, we find that our vendor is screwing us and charging us all this money and we can’t move our data to another system, because it would cost us X-times the amount of money that we’re already paying them on a monthly basis, that’s a system that works against people, I think. It works against competitive machinations that actually increase improvements in design. So, anyways, I think that’s one part of it.

      You sort of want to know more specifically how to go about maybe either implementing or adopting open technologies, or where to start, I guess, with open government data. I think it starts with looking at real-world problems that exist because there’s a lack of good information, and not just good information but information that’s been vetted, that’s gone through sort of government scrutiny, and moreover that taxpayers have paid for. So, there’s a lot of information, I think, in the geographic-information-systems area that needs to be made available. EveryBlock is a really good example of an application applying government data, government-sponsored data, in a way that people can actually just go through and say, “Hey, what’s going on in my neighbourhood? Or, historically, what’s been the situation here in the last 10 years? What are things that happened here?”

      I think also another way to think about doing the approach to open data from a government perspective is again to try and look at other organizations that are doing similar types of things and adopting APIs and API design patterns that already exist. Having your own version of a geodata API—well, it’s like Google already has one or let’s say the City of New York already has one—really doesn’t add a whole lot more value and actually increases the cost to people using your specific set of data.

      So, the pragmatic example of this is there’s a service that used to exist called Ma.gnolia. It was a social-bookmarking service. They supported microformats and all that, which is awesome. But they were a client of mine, and it was two-person start-up—half of it was actually in Vancouver, which might be interesting to you. But one of the first things that I did, when I got to them, was I looked at their API, which was very rich and very expressive and did all these great things, and I said, “That’s great, but no one’s using it. What you need to do is you need to go and mirror the Delicious API, so that someone could just take their application that works against Delicious and replace Delicious.com now with Ma.gnolia.com and have the application just work as though it’s completely agnostic to the vendor.” That actually was the start when Ma.gnolia, I think, really started to see some momentum in its API use, because it really lowered the barrier and the cost to developing against it.

      So, even though it was more powerful, that didn’t matter. What mattered was that it was easy and efficient for developers to take the work they had already done and make those projects more valuable. So, in a similar way, I think it’s imperative for open-government initiatives to really look at the work that’s already happened, even if it’s not as good as what they might do, and consider how they can align themselves with it and then work with the folks that have already done that work to then make more improvements.
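
      The mirroring strategy Messina describes for Ma.gnolia, where a client written against one service works against another by swapping the host, can be sketched as follows. Everything specific here is an assumption made for illustration: the two .example hosts, the /v1 prefix, the posts/add path, and the parameter names are placeholders, not the documented Delicious or Ma.gnolia endpoints. The code only builds request URLs and makes no network calls.

```python
from urllib.parse import urlencode

# Hypothetical base URLs standing in for two competing bookmarking services.
SERVICE_A = "https://api.bookmarks-a.example/v1"
SERVICE_B = "https://api.bookmarks-b.example/v1"


def add_bookmark_url(base_url: str, url: str, description: str) -> str:
    """Build the request URL for saving a bookmark.

    Because both services are assumed to expose the same path and
    parameter names, the base URL is the only vendor-specific input.
    """
    query = urlencode({"url": url, "description": description})
    return f"{base_url.rstrip('/')}/posts/add?{query}"


request_a = add_bookmark_url(SERVICE_A, "http://example.org/", "Example page")
request_b = add_bookmark_url(SERVICE_B, "http://example.org/", "Example page")
```

      In this arrangement an existing application needs a one-line configuration change to switch providers, which is the lowered barrier Messina credits for the uptick in API use.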

      You invented hashtags in 2007, and you just joined Microsyntax.org as an advisory-board member. So, what do you think is the potential of microsyntax? Is that something that you think people actually will be doing themselves or will computers do it for them?

      That’s a really good question. My interest in the Microsyntax Web site and project has a lot more to do with, I think, wanting to create a space (and I kind of wrote this in my blog post) in which conversations around these syntaxes can actually happen. I think that there probably is a very subtle recognition that every new SMS service that comes out shouldn’t be inventing its own syntax, because frankly people are going to learn one and then get confused and use someone else’s syntax on someone else’s service and probably get embarrassed. They’ll do like “DM” instead of, like, I don’t know, Brightkite had a different way of sending messages. They used “MSG”. So, if you did “MSG username” you could send a direct message to somebody on Brightkite, whereas of course Twitter is “D”, and that caused confusion. So, it’s like, why are you inventing your own syntax? I understand that there are different needs for different services. But insomuch as we can reduce the proliferation of syntaxes by creating this space, I think there’s an opportunity here.

      Really, it goes back to looking at what the microformats have done in a similar way with HTML. So, of course, as with RDF, you can have any number of different types and ways of expressing the semantics of information. But the value really is in actually getting consensus and getting more people to adopt one way of doing things, even if it’s not the best thing. That’s not what’s most important. So, in a similar way, it’s like let’s take a look at and document the syntaxes that are already out there and let’s take a look at what people are already trying to express through these systems and then codify that, so we can actually reduce confusion in the marketplace.

      Secondarily, and I think this is what’s critical about Microsyntax and why it’s not called TwitterSyntax.org, even if we’re not talking about Twitter, there’ll be other places where you might want to do these direct messages to a service and express something that is very contextual. So, it might not even be in a public message. So, I’m not really that interested in all the different ways of expressing different things in Twitter messages that are public. I think the hashtag is very useful, largely because it solved a problem that I had. But these location things and stuff like that, I’m a little more dubious about.

      However, the other day—I sort of love this anecdote and I wrote about it in my blog post as well—I tweeted sort of bitching about how the Alaska Air Web site didn’t load and didn’t work on my iPhone. I was trying to check in, I was on the way to the airport, and I was freaking out because I was late. I was like, “Oh, I’m just trying to check in here, I need to make sure I can get this flight.” So, I Twittered about it, and, lo and behold, five minutes later Alaska Air on Twitter sends me a direct message and says, “Oh, it looks like you’re having a problem with our Web site. Send us your confirmation number and the number of bags you want to check and we’ll check you in.” I was like, “No way.” So, they totally checked me in via Twitter, which I thought was incredible.

      But, moreover, you can imagine a service like TripIt or Dopplr taking care of that service by offering a syntax that would just allow me to, say, check in with an airport code, and they would just do the work for me. And that would work across multiple services, and I wouldn’t have to learn a new syntax every time I got to a new airline or what have you. So, it’s things like that that are a real opportunity for us with Microsyntax to just create that convening space—that conversation space—in which we can actually talk about these kinds of things that people want to express.
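
      The confusion Messina describes in this answer, “D” on one service versus “MSG” on another for the same action, suggests what a shared syntax would buy. As a rough sketch, a receiving service could treat both prefixes as aliases for one direct-message command and pull out hashtags at the same time. The prefix list comes only from the examples in this interview; the function name, the returned fields, and the sample hashtag are hypothetical, and neither service's real message grammar is reproduced here.

```python
import re

# Prefixes mentioned in the interview, treated here as aliases.
DIRECT_MESSAGE_PREFIXES = {"d", "msg"}

# A "#" not preceded by a word character, followed by word characters.
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def parse_message(text: str) -> dict:
    """Classify a short message and extract any hashtags.

    A message of the form "<prefix> <username> <body>" with a known
    prefix is treated as direct; anything else is treated as public.
    """
    parts = text.split(maxsplit=2)
    if len(parts) == 3 and parts[0].lower() in DIRECT_MESSAGE_PREFIXES:
        return {
            "kind": "direct",
            "to": parts[1].lstrip("@"),
            "body": parts[2],
            "tags": HASHTAG.findall(parts[2]),
        }
    return {"kind": "public", "to": None, "body": text,
            "tags": HASHTAG.findall(text)}


twitter_style = parse_message("D alice see you at #owv2009")
brightkite_style = parse_message("MSG alice see you at #owv2009")
public = parse_message("Heading to the conference #owv2009")
```

      Accepting aliases papers over the difference on the receiving side; the goal Messina describes is the reverse, agreeing on one documented syntax so that users never have to remember which prefix belongs to which service.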

      Anything you want to add about microformats or the Social Web?

      Well, I start a lot of my talks out recently talking about how Web 2.0 is not over yet. On the one hand, it’s really important to sort of point out that the 2.0 in Web 2.0 is not actually a version number. It was sort of more intended as, “Hey, look, we’re going to start over. We’re sort of level-setting at this new place.” So, what’s important is to look at what Tim O'Reilly was talking about, which is really just that the network is the new platform and that applications get better the more that people use them. Those are sort of the hallmarks of Web 2.0. We haven’t actually gotten to that point yet where every Web site is getting better with more people using it, nor are we even to the point where most Web sites allow people to connect and use them in a way that’s social.

      So, I guess from that perspective there’s still a lot more work to be done. The rise of the real-time Web is changing things. It’s definitely going to create new opportunities but also new challenges. The ubiquity of what are going to become very high-powered devices with very capable browsers is going to become a very interesting opportunity, and a lot of this stuff we just don’t know how it’s going to play out. But I would say that we’re actually in sort of the first third of what would overall be the Web 2.0 era or generation and that calls for its demise are much overstated at this point.

      You can follow Stephen Hui on Twitter at twitter.com/stephenhui.



      Michael Hausenblas

      Jun 6, 2009 at 10:38pm

      Just want to clarify one thing: Stephen's question referring to 'a World Wide Web Consortium vision of a Web of linked data' is incorrect and misleading. As a matter of fact, TimBL developed the principles and the LOD initiative started as a W3C SWEO IG activity. However, I want to stress the point that the current linked data community, though partially benefiting from W3C infrastructure (mailing list), is a grassroots movement with many members, independent of any W3C process, and a self-organizing entity. I guess you could have also asked about 'Microsoft's vision of a Web of microformats', just because one of the founders used to work for them.