
I feel that right now Web 2.0 service providers are operating like those car manufacturers before the shift to car safety.
Ever since the infant days of the internet, people have been putting more and more data online (emails, newsgroup posts and IRC conversations) without giving it any thought. But in the Web 2.0 age we are leaving behind a trail of data much more personal than ever before. We tell people about our lives, our thoughts, and where we've been with our photos (some even tagged with geo-data).
How do we know our data is safe, not just from prying eyes but also from accidents and disasters? Can we simply trust the brands behind these applications (e.g. Blogger, Flickr, Vox) to have a well-planned data retention plan and to know how to execute it? Just take a look at the recent BlackBerry outage. Even RIM, one of the most trusted business service providers in most people's eyes -- at least until last week -- suffered a service outage, seemingly caused by an innocent upgrade to its caching mechanism. No data was lost, but this type of outage can happen to anyone.
How would you feel if tomorrow your blog host lost your data permanently and two years of your online life were gone? How would you quantify this kind of loss in monetary terms, when free service or a refund is all your blog host can offer as compensation?
Data loss on that scale, I believe, is a time bomb waiting to go off. It is not a matter of if but when. In the meantime, most service providers do not give users a straightforward way to archive their own data. Most blog systems have a facility to export all blog posts -- at least in some form -- but some, such as Vox, currently provide no data backup function at all.
Some service providers are filling this gap, not by design but more by chance. Being able to import blog posts and other content via RSS feeds is a recent feature that is gaining momentum with some services. Tumblr and Jaiku, for example, allow users to import data from other sources such as blogs and photo sites. This is designed to let users integrate their data much more effectively without a lot of jumping back and forth.
But this concept can be leveraged to help safeguard our online data. An application or online service could potentially pull all of your online data into a single repository, a la Yahoo Pipes, and store it in one of the RSS formats. This repository could be used online for content aggregation and offline for safekeeping. If and when disaster strikes, users could easily 'import' their data back into all the different services without breaking a sweat. No more concerns about data retention, or the lack thereof.
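To make the idea a bit more concrete, here is a rough sketch of what such a repository could look like. It is only an illustration under some assumptions of mine: the feeds are plain RSS 2.0, they have already been downloaded as text, and the function name `merge_feeds` and the sample feeds are made up for this example. It does not reflect how Yahoo Pipes, Tumblr or Jaiku actually work.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feeds standing in for a blog and a photo site.
BLOG_FEED = """<rss version="2.0"><channel><title>My blog</title>
<item><title>First post</title><guid>blog-1</guid></item>
<item><title>Second post</title><guid>blog-2</guid></item>
</channel></rss>"""

PHOTO_FEED = """<rss version="2.0"><channel><title>My photos</title>
<item><title>Beach</title><link>http://photos.example/1</link></item>
</channel></rss>"""


def merge_feeds(feed_documents, archive_title="Personal archive"):
    """Merge the items of several RSS 2.0 documents into one archive feed.

    Items are de-duplicated by <guid>, falling back to <link>, and finally
    to the serialized item itself when neither is present.
    """
    archive = ET.Element("rss", version="2.0")
    channel = ET.SubElement(archive, "channel")
    ET.SubElement(channel, "title").text = archive_title

    seen = set()
    for xml_text in feed_documents:
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            key = (
                item.findtext("guid")
                or item.findtext("link")
                or ET.tostring(item, encoding="unicode")
            )
            if key in seen:
                continue  # already archived from an earlier feed or run
            seen.add(key)
            channel.append(item)

    # Return the archive as text so it can be written to disk for safekeeping.
    return ET.tostring(archive, encoding="unicode")


if __name__ == "__main__":
    # Passing the blog feed twice shows that repeated items are skipped.
    archive_xml = merge_feeds([BLOG_FEED, PHOTO_FEED, BLOG_FEED])
    print(archive_xml)
```

Because the result is itself an RSS document, any service that can import from a feed could in principle read it back in, which is the whole point of the scheme. Keep in mind that many feeds only carry the most recent entries, so an archive like this would have to be updated regularly to stay complete.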
Another way to seamlessly provide data backup would be to offer CD/DVD archives as an option, such as creating a disc from your online photo collection. This could even be tied to the annual renewal, so that every year when the service is renewed, an archive disc for the past year is sent to the user as part of the deal or as an incentive.
Why is no one providing these types of services right now? If they are out there, why are they not getting the attention they deserve? Is our online data worth so little to us that we are not willing to pay a little extra for physical backups and a little peace of mind?













