Apples and Oranges
There has been a lot of chatter in the blogosphere recently after Mark Pilgrim announced his switch from Mac to Ubuntu in his piece [“When the bough breaks”][1]. John Gruber [responded][4] and Mark posted a follow-up in [“Juggling Oranges”][2].
Tim Bray, who is on my blogroll, also [tuned in][3] with a few words in [“Time to Switch?”][3]. The debate on whether to switch or not, and the polemic around the various arguments, did not occupy me that much. But while reading these four pieces,[^1] I had some thoughts on the problem of long-term data preservation, and especially on the meta data that Mark might lose in switching from Mail.app, iPhoto, iTunes and iMovie to their Ubuntu counterparts.
I want to share these thoughts here: first, because I haven’t posted a longer piece for some time now; second, because I think they might add to the discussion; and finally, because I love to troubleshoot problems.[^2]
Mail.app legacy
The first thing that struck me was the problems [Mark will face][2] when moving his mail from Mail.app’s emlx format to something open and documented.
> Mail.app 2.0 helpfully auto-converted all my wonderful mbox files into Apple’s shitty undocumented format. I’m now in the process of undoing the damage. I tried an emlx-to-mbox converter program, but it has bugs that ruin certain mail messages and corrupt the resulting mbox file. (Specifically, mail messages that contain a line that starts with the word from.) Perhaps JWZ’s emlx.pl script will fare better. JWZ knows mail.
I definitely agree with Mark that it was a rather bad move by Apple to implement its own shiny new mail format when open alternatives already existed. Nevertheless, my agreeing with Mark doesn’t solve the problem, and a solution is what he ultimately needs.
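The bug Mark describes is easy to hit: mbox uses lines starting with “From ” as message separators, so a converter has to escape such lines inside message bodies. Here is a minimal sketch in Python of what a converter would need to do. It assumes that each emlx file starts with a line holding the message's byte count, followed by the raw message and a trailing property list. That layout is an assumption based on how the format is commonly described, since Apple does not document it. `read_emlx` and `emlx_to_mbox` are hypothetical helper names, not part of any existing tool.

```python
import mailbox
from pathlib import Path


def read_emlx(path):
    """Return the raw message bytes stored in one .emlx file.

    Assumed layout: first line = byte count, then the message itself,
    then an XML property list that is ignored here.
    """
    data = Path(path).read_bytes()
    header, _, rest = data.partition(b"\n")
    length = int(header.decode("ascii").strip())
    return rest[:length]


def emlx_to_mbox(emlx_paths, mbox_path):
    """Append all given .emlx messages to an mbox file; return the count."""
    box = mailbox.mbox(mbox_path)
    count = 0
    try:
        for path in emlx_paths:
            # mailbox.mbox rewrites body lines starting with "From "
            # to ">From ", so they cannot be mistaken for separators.
            box.add(read_emlx(path))
            count += 1
    finally:
        box.close()  # flushes pending writes to disk
    return count
```

The escaping is left to the standard library's `mailbox` module on purpose: getting the “From ” handling right by hand is exactly where the converter Mark tried seems to have failed.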
The solution: IMAP
I honestly know only a few people who regularly use IMAP to manage their mail accounts, mainly because it’s slower than POP3 and, as I’m told, more error prone. But I have used it for some years now and never had any serious issues with it. I even use it to reliably access my corporate mailbox, which is served by an Exchange server.
I think that IMAP might be a long-term solution for Mark. More precisely, I think an IMAP mail server might be the solution here:
* IMAP lets you manage and keep a hierarchy of folders on your mail server
* An IMAP-enabled mail server is, by definition, a central storage for everything related to these folders and their content
* You can easily find open IMAP mail server implementations that run on a variety of platforms, including Mac OS X and Linux
* An (open) mail server will most probably store your mail in mbox or Maildir format, so you always have the data available in a known and well-documented format
Combining these leads to the proposed solution here:
* Install a local IMAP server on your computer[^3]
* Configure your mail client for this IMAP server
* Move all your mail to that IMAP server
* Do what you like with the raw data stored by the server and don’t bother about your mail client’s local storage anymore
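The “move all your mail” step does not have to go through the mail client. As a rough sketch, assuming the mail is already available as an mbox file and the IMAP server accepts a plain login, Python's `imaplib` can append the messages directly. `upload_mbox` and the folder name `Archive` are hypothetical names chosen for illustration.

```python
import imaplib
import mailbox
import time


def upload_mbox(conn, mbox_path, folder="Archive"):
    """Append every message of an mbox file to `folder` on an IMAP server.

    `conn` is an already logged-in imaplib.IMAP4 (or IMAP4_SSL) object.
    Returns the number of messages uploaded.
    """
    # If the folder already exists the server answers NO, which is fine.
    conn.create(folder)
    box = mailbox.mbox(mbox_path)
    count = 0
    try:
        for message in box:
            # Simplification: the upload time is used as the internal date,
            # not the date the message was originally received.
            stamp = imaplib.Time2Internaldate(time.time())
            status, _ = conn.append(folder, "", stamp, message.as_bytes())
            if status != "OK":
                raise RuntimeError("server refused message %d" % count)
            count += 1
    finally:
        box.close()
    return count
```

With a local server, usage would look roughly like `conn = imaplib.IMAP4("localhost")`, then `conn.login(user, password)`, then `upload_mbox(conn, "old-mail.mbox")`; the host, credentials and file name are placeholders.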
This is actually something I might do myself, even though I don’t plan to switch away. In my opinion, it would add much flexibility at the moderate price of doubled space requirements and maybe some performance issues on slow connections.
iPhoto & iTunes
Before I shed some light on the approach I would take here, I’d like to add one more piece to the discussion: As Norman Walsh noted today in [“XCF: bah, humbug!”][7], one of the flagship open source projects, [The GIMP][8], actually uses a file format that is undocumented except for its implementation in GIMP itself.
I agree that being able to read the source code to work with the data stored by GIMP makes this beast a little less dangerous than, let’s say, Mail.app with emlx. Nevertheless, it shows that even presumably open software can carry the same risk of lock-in to a particular application or data format. That XCF isn’t documented implies that there is no powerful, “open” picture format that handles layers and the like consistently: as you can read in the Wikipedia entry for [XCF][9], other programs can hardly, if at all, read and write XCF. The obvious alternative, Adobe’s PSD format, is proprietary as well, so you’re left with two proprietary formats for storing complex pictures.
The problem with the meta data
The main problem with the iApps (iPhoto and iTunes) for Mark is not that they use proprietary file formats; the underlying file formats are actually very open ones. The problem is the meta data: all the tags, keywords and ratings added to these files are stored in proprietary libraries and are lost if you ever decide to move the files out of the application.
The solution for the meta data
The solution here is more complex and by no means complete, let alone already implemented. But I think it should be doable.
If you have a look at the different applications or plugins for iPhoto and iTunes, you quickly see that you can extract the meta data from both applications by one or more means.
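One such means, at least for iTunes, is the XML copy of the library that the application keeps next to its binary database. The following is a minimal sketch in Python, assuming that this file is an XML property list with a top-level `Tracks` dictionary whose entries carry keys such as `Name`, `Artist`, `Album`, `Rating` and `Location`. That layout is an assumption about iTunes, not something guaranteed here, and `read_itunes_metadata` is a hypothetical helper name.

```python
import plistlib


def read_itunes_metadata(xml_path):
    """Return one dict per track from an iTunes library XML file.

    Assumes a property list with a top-level "Tracks" dictionary;
    keys missing for a given track come back as None.
    """
    with open(xml_path, "rb") as fh:
        library = plistlib.load(fh)
    tracks = []
    for track in library.get("Tracks", {}).values():
        tracks.append({
            "name": track.get("Name"),
            "artist": track.get("Artist"),
            "album": track.get("Album"),
            "rating": track.get("Rating"),
            "location": track.get("Location"),
        })
    return tracks
```

The point of going through a plain list of dictionaries is that the ratings and names are then out of the proprietary library and can be written wherever you like.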
And if you have a look at the common file formats used for images and tunes, you can also see that they provide different means of embedding meta data into these files:
* EXIF, IPTC or XMP meta data for image files
* Various ID3 tag versions for music, be it MP3, AAC or AIFF, if needed
This basically leads to the possibility of writing some application or plugin that constantly updates the source files with the meta data you assign through the UI.
For example, [Norman Walsh][10] has this nifty tool that will write RDF data into your JPEG files, and with XMP, Adobe created an extensible and open framework that lets you attach whatever meta data you like to your images.
And it should also be possible to assign your music’s meta data to the files, all that in some open and well documented form.
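To give an idea of how little is needed for the music side, here is a minimal sketch in Python that writes the oldest and simplest tag variant, ID3v1: a fixed 128-byte block at the end of the file. It is only meant as an illustration. ID3v1 has no field for ratings, so stuffing a rating into the comment field (as in the usage note below) is a made-up convention, and `build_id3v1` and `write_id3v1` are hypothetical helper names.

```python
def build_id3v1(title, artist, album, year="", comment="", genre=255):
    """Build a 128-byte ID3v1 tag; genre 255 means "not set"."""
    def field(text, size):
        # Fixed-width field, truncated and padded with NUL bytes.
        return text.encode("latin-1", "replace")[:size].ljust(size, b"\x00")

    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([genre]))


def write_id3v1(path, **fields):
    """Append an ID3v1 tag to a file, or overwrite an existing one."""
    tag = build_id3v1(**fields)
    with open(path, "r+b") as fh:
        fh.seek(0, 2)              # jump to the end of the file
        size = fh.tell()
        if size >= 128:
            fh.seek(-128, 2)
            if fh.read(3) == b"TAG":
                fh.seek(-128, 2)   # existing tag found: overwrite it
            else:
                fh.seek(0, 2)      # no tag yet: append at the end
        fh.write(tag)
```

A call such as `write_id3v1("song.mp3", title="Song", artist="Someone", album="Album", comment="rating=4/5")` would then keep the rating with the file itself. A real tool would more likely target ID3v2, which is far more flexible but also more work to write correctly.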
The downside
The downside is obviously that no easy way to implement these solutions exists yet. You’d have to write your own plugins or hope that someone else will implement them. And finally, except maybe for XMP, none of the suggested ways to save meta data in your files is really a “standard” as of today.
Conclusion
I think that the suggested solutions might help to reduce the problems Mark had or will have during his switch. However, I must admit that the latter one especially is very vague and definitely not something you would implement in half a day or less.
With my Exchange2iCal project having stalled for a while, I might attack these two just to have some fun and, along the way, produce a piece of software that adds real benefit for users of iTunes or iPhoto.
[1]: http://diveintomark.org/archives/2006/06/02/when-the-bough-breaks "When the bough breaks"
[2]: http://diveintomark.org/archives/2006/06/16/juggling-oranges "Juggling Oranges"
[3]: http://www.tbray.org/ongoing/When/200x/2006/06/15/Switch-From-Mac "Time to Switch?"
[4]: http://daringfireball.net/2006/06/and_oranges "And Oranges"
[5]: http://ithink.ch/blog/ "Oelbaum’s delirium"
[6]: http://fink.sf.net "The Fink Project"
[7]: http://norman.walsh.name/2006/06/18/xcfBahHumbug "XCF: bah, humbug!"
[8]: http://gimp.org "The GIMP"
[9]: http://en.wikipedia.org/wiki/XCF "XCF, Wikipedia"
[10]: http://norman.walsh.name/ "Norman Walsh"
[^1]: And not one more …
[^2]: I know that I tend to present “solutions” here that are thoroughly technical and do not take into account whether they could be implemented with ease for everyday use. I also almost never discuss the “soft factors”, mainly because I think these should be discussed individually and personally. And finally, this is my blog, so I decide how and why I write. :)
[^3]: Optionally, you can use your ISP’s mail server, especially if you have broadband and shell access. This provides the same benefits as a local IMAP server, with the added benefit of having the data available everywhere, and maybe the problem that you don’t want all that mail to be accessible.