Pipes vs Spreadsheets: a showdown

It’s time to get techie again! As you may have noticed, I’ve recently added a new item to my sidebar content called Puede que esté en… (that’s Spanish for I might be at…, for my few Spanish-impaired readers). The location displayed below, a link to Google Maps, is an approximate rendering of my current whereabouts, and it’s aggregated from three different sources using Yahoo Pipes:

The three streams are merged and a result is selected according to a priority rating. Current version has a simple fixed implementation: timetable data has the lowest rank, while twits have the highest. It’s a rather complex use for Yahoo Pipes, featuring a master pipe with three subpipes with a pretty high module count. Have I started to reach the limits of the Pipes platform concerning pipe intelligibility?

Musing about combined uses for Pipes and Google Spreadsheets I came across an interesting idea: both platforms, or at least a subset of them, might be isomorphic. Two equivalent languages. Could I implement the same feed mixing goodness in a more compact and expressive way by using a single on-line spreadsheet? Pipes themselves are nothing more than connectors, and they could be replaced advantageously in a spreadsheet context with cell references. These can be absolute or relative, may be constructed on the fly and are amenable to indirection (that’s the point where things get complicated in spreadsheets, and one is usually better off seeking greener pastures in real programming languages). There were limitations by both sides, to be sure. At a glance:

  • Nearly every data mingling in Pipes is painful. And with painful I mean extremely painful. Dynamic data type conversion is handled by the platform and regex modules are of great help, but don’t attempt more than a handful substitutions or concatenations in the same pipe or things could go out of hands really soon. And let’s forget about numerical computations. Spreadsheets are a much more natural medium for all that jazz.
  • User input means are much more accessible on Pipes than on its hypothetical Google Spreadsheets counterpart. Input on Spreadsheets means resorting to a complex API built upon HTTP and Atom extensions: the simplest data entry requires a GET request to find a base URL, an authentication procedure and a PUT with dynamically generated XML content. You may end up altering the whole of a spreadsheet remotely in that fashion, but it’s simply not a practical goal for a mashup to attempt: I only have to picture in my mind the oodles of Javascript that ought to be written —and debugged— to roll my eyes in dismay.

The Spreadsheets feature that puts it in approximate equal footing with Pipes are its functions for external data: a catalogue of spreadsheet functions that allow to retrieve content from outside the sheet for processing. The equivalent in Pipes are Source Modules: less painful to use, but at times less capable —it’s hard to beat full XPath XML data extraction, but then such a feature is overkill in most cases. On the other hand, good luck trying to synthesize feed items with Google Spreadsheets. It just doesn’t work that way. But putting differences aside, Spreadsheets should let me aggregate feeds and process them all the better. Publishing options allow me to emit a feed as a result; thus the full Pipes workflow would be realised, in a more compact way. True?

False. There is a big gotcha lurking beneath Google Spreadsheets surface. Let me demonstrate it:

  1. Create an empty sheet.
  2. In cell A1, add the function =NOW(). That should nicely display the current date.
  3. Now, publish your sheet and look for the More publishing options link, way down on the dialog. Click on it, and generate an URL for a cell-based feed for cell A1.
  4. Open that link in any serious browser (one that supports RSS, I mean). What do you see?

The exact same date that appeared on the spreadsheet when you added the current date function. Now, wait an indeterminate amount of time, and retry. What? That’s not current any more! Plus, there is no caching involved. NOW() is not NOW() (I wonder if it should be called THEN()). Trouble is, spreadsheets are completely unable to update themselves unless an actual edit takes place. And this limitation affects external data functions too. The verdict: Google Spreadsheets are totally unusable for the purpose of aggregating, processing and emitting content. Humble Pipes win big this time with its, well, pipes paradigm.

However, Pipes and Spreadsheets can work together provided any dynamic component of the intended result is implemented on the Pipes side. Spreadsheets can just take on a supporting role, a kind of two-dimensional database. Which is a shame, if you ask me. Just think of the possibilities!