How to blog in more than one language

The Spanish blogosphere looks weak. But language is accidental, automatic translation is ubiquitous, and the relative importance of a content creator's language is shrinking by the day. However, given translation issues like those illustrated in the last post, perhaps some human intervention is in order to bridge the gap between semantic worlds.

To assist in this task, I have prepared a Blogger widget that lets you write posts in more than one language, tells the versions apart, and gives the reader a clear view of her preferred content. To the user, the scheme somewhat resembles server-side content negotiation, but done in a DOM-manipulating, Web-two-dot-zeroish way.

Some usage notes:

  1. First of all, you have to add the widget to your blog. There is a button next to the language selection combo to do just that. Not on Blogger? No problemo. Just download the script langsupport.js and deploy it to your favorite location: it can be included in your template code, or anywhere in the HTML of your pages with a <script> tag. It is not too heavy, even less so if you run JS Minifier on it. To make it work outside Blogger, you should at least remove the statement marked with the comment 'Only needed when code is running inside a Blogger widget'.
  2. If you are not hosted on Blogger, you should also provide an anchoring point so that the JavaScript knows where to insert the language selector combo. It should look like this:
    <span id="__langSupport__"></span>
  3. Mark language-dependent HTML content with the lang attribute, using ISO 639-1 language codes (you are not obliged to use them, but the default language feature will malfunction if you don't). For instance:
    <p lang="en">This paragraph is in English.</p>
    <p lang="es">Este párrafo está en español.</p>
  4. Enjoy!
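
To give an idea of what happens to content marked this way, here is a minimal sketch of the show/hide step. It is not the code in langsupport.js: the function name showLanguage is hypothetical, and the approach (toggling style.display on every element that carries a lang attribute) is only an assumption about how such a widget could work.

```javascript
// Hypothetical helper, not taken from langsupport.js.
// Shows the elements marked with the chosen language and hides the rest.
// `elements` is any iterable of objects offering getAttribute() and a
// style object, such as a NodeList of DOM elements.
function showLanguage(elements, lang) {
  for (const el of elements) {
    const elLang = el.getAttribute('lang');
    if (!elLang) {
      continue; // unmarked elements are left untouched
    }
    // An empty string restores the element's normal display value.
    el.style.display = (elLang === lang) ? '' : 'none';
  }
}

// In a browser, one possible call (restricted to the body, so that a
// lang attribute on the <html> element itself is not affected):
//   showLanguage(document.body.querySelectorAll('[lang]'), 'es');
```

With the two paragraphs from step 3 on the page, calling it with 'es' would leave the Spanish paragraph visible and hide the English one.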

Things to keep in mind:

  • Unmarked HTML elements (those without lang attribute) will be left untouched.
  • A tag marked with a language that is not currently displayed will hide all its contents, even if some inner tag is marked with the currently displayed language. To avoid trouble, do not nest tags with different lang attributes.
  • If there are no lang attributes in the document, the combo selector will not appear.
  • The language selected first depends on the default system locale. If there is no content in that language on the page, the default language is selected (English, or ‘en’); if there is no content marked with the default language either, the first one available (in lexicographical order) is chosen.
  • Unfortunately, the browser’s preferred language configuration is not exposed to the JavaScript engine, AFAIK, so that setting cannot influence the default language here (it is only used for content negotiation with specifically configured servers); the default operating system locale is used instead.
  • This has been tested with Firefox 2. Whether IE also groks it is anyone's guess.
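
The fallback order described in the notes above can be sketched roughly as follows. Again, this is an illustration and not the widget's actual code: the names availableLanguages and pickInitialLanguage are hypothetical, the locale is passed in as a plain string because how the script obtains it is not covered here, and reducing a locale such as "es-ES" to its first part is an assumption of mine.

```javascript
// Hypothetical helpers, not taken from langsupport.js.

// Collects the distinct lang values found on the given elements,
// sorted in lexicographical order.
function availableLanguages(elements) {
  const seen = new Set();
  for (const el of elements) {
    const lang = el.getAttribute('lang');
    if (lang) {
      seen.add(lang);
    }
  }
  return Array.from(seen).sort();
}

// Picks the language to display first:
//   1. the system locale, if the page has content in it;
//   2. otherwise the default language ('en');
//   3. otherwise the first available language in lexicographical order.
// Returns null when nothing is marked, in which case no selector is needed.
function pickInitialLanguage(available, systemLocale, fallback = 'en') {
  if (available.length === 0) {
    return null;
  }
  // Assumption: "es-ES" or "es_ES" is reduced to "es" before matching.
  const primary = (systemLocale || '').toLowerCase().split(/[-_]/)[0];
  if (available.includes(primary)) {
    return primary;
  }
  if (available.includes(fallback)) {
    return fallback;
  }
  return available.slice().sort()[0];
}
```

For example, pickInitialLanguage(['en', 'es'], 'es-ES') gives 'es', while pickInitialLanguage(['de', 'fr'], 'es-ES') falls through both checks and gives 'de'.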