Greasemonkey and microformats (2)

Welcome back! Today I am going to rant (briefly, I promise) about microformats and try to cobble together a simple one to denote measurement units. So there!

As I said in the previous post, microformats are small pieces of self-describing HTML that may be used to embed machine-parseable information in a web page. They are just POSH (Plain Old Semantic HTML), and not in the fake Port Out, Starboard Home way: any given microformat is a subset of HTML, and a semantically significant subset at that! Let’s try to illustrate that with the simplest of microformats, rel-tag.

This microformat is so tiny it might be called a nanoformat instead. It consists of just one HTML attribute, the little-known rel. Valid for links (<a> and <link> tags), rel describes the relationship of the current document to the anchor specified in the href attribute of the tag. There is a bunch of suggested values in Section 6.12 of the HTML 4.01 specification, and all rel-tag does, syntax-wise, is add one value to that list: tag.

The general idea behind rel-tag is to provide robots with an easy way to tag content, thereby adding some much-needed common sense to searches. The tag itself can come from a variety of tag spaces, one of the most obvious being Wikipedia. Here is a tag example (extracted from this very blog) using a self-defined namespace:

<a rel="tag" href="">javascript</a>
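Here is a rough sketch of what a robot might do with such a link. Since rel holds a space-separated list of link types, a parser should look for the tag token rather than compare the whole attribute, and rel-tag takes the tag from the last path segment of the href, not from the link text. The function names and the example.org URL are made up for illustration.

```javascript
// Hypothetical helpers: the names are illustrative, not part of any library.

// rel is a space-separated list of link types, compared case-insensitively.
function hasRelTag(rel) {
  return (rel || '').toLowerCase().split(/\s+/).indexOf('tag') !== -1;
}

// Return the last path segment of an href, or null if there is none.
function tagFromHref(href) {
  var path = (href || '')
    .split(/[?#]/)[0]                                   // drop query and fragment
    .replace(/^[a-z][a-z0-9+.-]*:\/\/[^\/]*/i, '');     // drop scheme and host
  var segments = path.split('/').filter(function (s) { return s !== ''; });
  if (segments.length === 0) { return null; }
  return decodeURIComponent(segments[segments.length - 1]);
}

// Example with a made-up URL:
var tag = tagFromHref('http://example.org/tags/javascript');
```

With an empty href, as in the snippet above, tagFromHref finds no path segment and returns null.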

In the process of incubating a microformat, the first thing to do is to resist any urge to design for design’s sake. These wise words notwithstanding, I’d like to tell you about this little idea of mine: what about a microformat for measurement units? I don’t want to mislead anybody by asserting it hasn’t been proposed before, because it has. Trouble is, discussion of it stopped eight to ten months ago without reaching any significant consensus. I believe there is a real-world problem to solve, and not much has been done in the form of in-the-wild implementations. If anything is to gain traction, it should be simple. Dead simple. What about this?

<abbr class="unit EUR" title="1320">1&thinsp;320&nbsp;&euro;</abbr>

It’s just a simple application of two recommended design patterns: class and abbr. The former seems pretty well established; the latter, however, is somewhat controversial. But I wouldn’t mind having a <span> element there, to be honest. An explanation could be of some help at this point.

The <abbr> element lets us provide, by means of its title attribute, a machine-parseable value for the eminently presentational string inside. For sighted humans, the string appears as “1 320 €”; the contents of title, being a plain string representation of the value, allow robots to skip the complexities of the human mind (and, perhaps, to read the number aloud correctly).
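Reading such an element back might look roughly like the sketch below. It works on the raw class and title attribute values, so it makes no assumption about which element carries them. The function name parseUnit is made up, and the rule that the unit name is the first class token other than unit is my own assumption, not part of any agreed format.

```javascript
// Hypothetical parser for the markup proposed above; nothing here is a standard.
// Takes the class and title attribute values of a single element.
function parseUnit(className, title) {
  var tokens = (className || '').split(/\s+/).filter(function (t) { return t !== ''; });
  if (tokens.indexOf('unit') === -1) { return null; }    // not marked as a unit

  // Assumption: the unit name is the first class token other than "unit".
  var names = tokens.filter(function (t) { return t !== 'unit'; });
  if (names.length === 0) { return null; }

  // The title must hold a plain number; an empty title is rejected explicitly
  // because Number('') would otherwise give 0.
  var text = title == null ? '' : String(title).trim();
  var value = Number(text);
  if (text === '' || isNaN(value)) { return null; }

  return { value: value, unit: names[0] };
}

var price = parseUnit('unit EUR', '1320');
// price.value is the number 1320 and price.unit is the string "EUR"
```

In a page script the two arguments would come from element.className and element.getAttribute('title').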

The class attribute provides discerning parsers with two fundamental pieces of semantic information: this is a unit, and the unit is a currency, namely euros. As a valid HTML class literal can be nearly anything (spaces and other small quirks excluded), I’d stick with unit names acceptable to one of the more popular unit name parsers on the Net: Google Calculator. The rationale for this proposal, and the Greasemonkey bit, will be explained in a later article. See you!

Greasemonkey and microformats

This is going to be the first of a tiny series of overly technical posts which may only matter to a number of people so small it may indeed turn out to be negative. Meaning this might (just might) not matter even to me. Anyway, it’s been fun in a twisted way. Enjoy.

Greasemonkey is one of those geeky metatools. Available for Firefox, it empowers users to play with pages by conditionally applying scripts to them. This enables all kinds of transformations, juggling and outright mayhem. It’s not my intention to delve into the depths of Greasemonkey, or even explain it in any meaningful depth: this is far better done in the wonderful book Dive Into Greasemonkey (and this companion Greasemonkey pitfalls article). I’ve just come up with a neat crossover with microformats which I want to share with you.

Huh? Microformats, you say? Indeed. Microformats, as per their main web site, are…

[…] a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging).

In other words, microformats are (one of) the wave(s) of the future! It’s not just Web 2.0, it’s Semantic Web! While it seems I’m babbling like a teenage marketing droid on steroids, there is a nugget of truth in there. Microformats enable us to annotate content so that both people and computers can read it. Someone might write a parser for microformat X able to extract RDF triples and, ultimately, knowledge for the network. Search engines can parse them to provide meaningful answers to queries like “what is an XML namespace” (as if it mattered! *laughs*) or “what is time” (who needs Stephen Hawking when you can have Google?)

However, all these niceties don’t add up to much by themselves. What about a real-world problem solved by microformats? Greasemonkey and a healthy dose of naïveté on my part may help to ease the trouble with automated unit conversions. All of them! Next stop, a measurement microformat.