What Does Data Portability Mean To You?

There has been a lot of news over the past few days about data portability. I posted a summary and analysis of some of that news a few days ago. Today, the Data Portability Working Group announced that LinkedIn, Flickr, Six Apart (TypePad and Movable Type) and Twitter have joined as well.

So, what does all of this mean to you? Right now, it means nothing. What will it mean? Well, it is not entirely obvious from the main page of the data portability website. Here is their mission:

To put all existing technologies and initiatives in context to create a reference design for end-to-end Data Portability. To promote that design to the developer, vendor and end-user community.

This breaks the rule that you cannot define a term by using the term itself. The tagline “Sharing is Caring” does not mean a lot in this context either. The main image on the site has a “formula” for what it means:

Existing Technologies + Turnkey Reference Design + Simple User Story

Yep, I still don’t get it. Well, there is a list of technologies like OpenID, RDF, RSS and a few others. So, we are looking at some existing open technologies for common identification and data transfer. There are two specifications listed for the reference design, the Technical Reference and the Policy Reference. The policy reference is described as “boilerplate terms and conditions”, but has not been detailed yet.

Disappointingly, the technical reference is still just in outline form, so we need to make a few assumptions. First, the “Identity” section outlines the use of OpenID (and similar technologies) for login, discovery, retrieval and update/sync. For modeling the data on the web, they describe a file system model called the Web Relational File System (WRFS) that is functionally layered like the OSI network model. The basic idea is that all URIs are addressable resources (called web inodes) that are available over HTTP. This should be OK for techies so far. There are also permissions, “opt-in” policies, pointers to other inodes, etc. Anyone who uses the web already knows these as permissions, registration and links. There is nothing earth-shattering here either. It also looks like there will be data query and retrieval specifications. There are also a few links to more detail on Identity and WRFS.
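To make the inode idea a little more concrete, here is a minimal Python sketch of how I read it. Since the technical reference is only an outline, everything here is my own assumption: the names `WebInode`, `InodeStore`, `readers` and `links`, the example URIs, and the in-memory dictionary standing in for resources that would really be served over HTTP. It is not the WRFS design, just an illustration of URIs as addressable resources with opt-in permissions and pointers to other resources.

```python
from dataclasses import dataclass, field


@dataclass
class WebInode:
    """Hypothetical stand-in for a 'web inode': a resource addressed by a URI."""
    uri: str
    content: str = ""
    # Identities (e.g. OpenID URLs) the owner has opted in to sharing with.
    readers: set = field(default_factory=set)
    # URIs of other inodes this one points to (ordinary web links).
    links: list = field(default_factory=list)


class InodeStore:
    """In-memory store keyed by URI (assumption: a real system would use HTTP)."""

    def __init__(self):
        self._inodes = {}

    def put(self, inode):
        self._inodes[inode.uri] = inode

    def read(self, uri, identity):
        """Return the inode at uri, or raise if identity was not granted access."""
        inode = self._inodes[uri]
        if identity not in inode.readers:
            raise PermissionError(f"{identity} may not read {uri}")
        return inode

    def follow_links(self, uri, identity):
        """Return the linked inodes that identity is allowed to read."""
        result = []
        for target in self.read(uri, identity).links:
            try:
                result.append(self.read(target, identity))
            except (KeyError, PermissionError):
                continue  # skip missing or unshared resources
        return result


store = InodeStore()
alice = "https://alice.example.org/"  # made-up identity URL
store.put(WebInode("https://example.com/alice/photo1", "photo", readers={alice}))
store.put(WebInode("https://example.com/alice/private", "secret"))
store.put(WebInode(
    "https://example.com/alice/profile",
    "profile",
    readers={alice},
    links=["https://example.com/alice/photo1", "https://example.com/alice/private"],
))

visible = [inode.uri for inode in
           store.follow_links("https://example.com/alice/profile", alice)]
print(visible)
```

Following the profile's links only returns the photo, because nobody was opted in to the second resource.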

Based on this information, it seems as if the data portability specifications are still in their infancy. They have the basics that would be needed to start writing specifications and APIs for identification and querying. However, that part of the specification is not that difficult. The important part will be standard data formats and communication formats. Given that OpenID, OPML, RSS, RDF and several other specifications are being included, there may be hope of consistent data reading and transfer. However, that much was already defined by OpenID and RSS. What is missing? There is no information on who owns the data, or whether the user can export and import data from a particular service. THAT IS PORTABILITY! Most internet users do not care about OpenID and RSS. Users want to make sure that when the time comes, they can take their data from Facebook and import it into Plaxo without getting banned from one of the services.
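For what it is worth, here is a rough Python sketch of what I mean by export and import. It is purely illustrative: the `portable-profile` format name, the `export_profile` and `import_profile` functions and the sample data are all made up, and nothing here reflects an actual Facebook, Plaxo or Data Portability API. The point is only that one service writes the user's data in a neutral format and another service can read it back without loss.

```python
import json


def export_profile(profile):
    """Serialize a user's data into a service-neutral JSON document (hypothetical format)."""
    envelope = {"format": "portable-profile", "version": 1, "data": profile}
    return json.dumps(envelope, sort_keys=True)


def import_profile(document):
    """Read an exported document back in, rejecting formats we do not recognize."""
    payload = json.loads(document)
    if payload.get("format") != "portable-profile":
        raise ValueError("unrecognized export format")
    return payload["data"]


# Made-up user data leaving "service A" and arriving at "service B".
original = {"name": "Alice", "contacts": ["bob@example.com", "carol@example.com"]}
exported = export_profile(original)
restored = import_profile(exported)
print(restored == original)
```

The code is the easy part. The hard part is the policy question above: whether a service will let the user do this at all.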