Saturday, November 25, 2006

 

Torching Doctorow: Part 1

Cory Doctorow wrote an article in 2001 called "Metacrap" in which he purports to expose seven fatal flaws in the idea of reaching a "metatopia", a "world of exhaustive, reliable metadata."

Before we dig into his seven "strawmen", let's examine a little metadata about Cory Doctorow to attempt to determine his qualifications for making such assertions. First, it is interesting that he does not post any qualifications or any references to back up any of his assertions in the article. We do not know how much, if any, metadata he has actually created. Second, Wikipedia claims that he only has a high school degree. Third, he is certainly not a practicing IT professional, per his own website. Given these three points, we must take his assertions with a big grain of salt. However, given that there are a number of blog entries and links to his article, it is worthwhile to at least examine his arguments. Of course, publicity is not an indicator of truth, and I surmise that most links to his article are merely people commiserating over the fact that metadata is hard to do right. Fortunately, there are those of us who still believe that "hard" does not equate to "wrong" and that this is a temporary state due to lack of expertise. More on that later ... let's get back to Doctorow.

Strawman #1. People Lie. Doctorow uses this to attack the reliability of metadata. His argument is that because metadata "lives in a competitive world", people will lie to gain advantage. Frankly, this is a ridiculous statement because not all metadata lives in a competitive world. In fact, the most important metadata, enterprise metadata, will not live on the "wild, uncontrolled internet" but on the controlled corporate intranet.
So, let's debunk this in a number of ways:
a. People without access to my metadata can lie all they want and it won't affect me.
b. People lie more when they can lie without attribution. That is why Wikipedia has so many problems (lack of attribution).
c. People lie about both data and metadata, but Doctorow is not saying we should distrust the entire internet.
So, the real point here is that non-attributed data and metadata, of any type, are untrustworthy. Fortunately, every good metadata development process includes attribution via governance, so this is truly a tangential argument (at best).
d. Metadata is not the victim of people lying but a cure for that problem. For example, a metadata attribute of "reliability", which is used in a number of very credible organizations, is quite effective in measuring the trustworthiness of the source of information. Of course, in some cases, capturing lineage can replace additional metadata attributes for judging reliability.
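
To make points (b) through (d) concrete, here is a rough sketch in Python of a metadata record that carries attribution, a "reliability" attribute, and lineage. Everything in it is my own assumption for illustration: the `MetadataRecord` class, its field names, the 0.0 to 1.0 reliability scale, and the 0.7 threshold do not come from Doctorow's article or from any particular standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Hypothetical metadata record; all field names are illustrative."""
    subject: str                         # the resource being described
    value: str                           # the metadata statement itself
    author: Optional[str] = None         # attribution: who asserted it
    reliability: Optional[float] = None  # assumed scale: 0.0 (none) to 1.0 (full)
    lineage: List[str] = field(default_factory=list)  # sources it was derived from


def is_trustworthy(record: MetadataRecord, threshold: float = 0.7) -> bool:
    """Apply the rule of thumb from the text: unattributed metadata is
    never trusted; attributed metadata is judged by its reliability
    rating, or by the presence of lineage when no rating exists."""
    if record.author is None:
        return False
    if record.reliability is not None:
        return record.reliability >= threshold
    return len(record.lineage) > 0
```

For example, `is_trustworthy(MetadataRecord("report.pdf", "topic=finance"))` returns `False` because nobody stands behind the claim, while the same record with an `author` and a `reliability` of 0.9 returns `True`. The fallback to lineage reflects the remark above that lineage can sometimes stand in for an explicit reliability attribute.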

Daconta's Counterpoint #1. Reliable data and metadata are properly attributed.

Next time we will examine Strawman #2...




