Friday, November 03, 2006

Storing Your Data On Somebody Else's Machine

This posting is more of a question than a comment. Over the last several years, there has been a surge in web-based services that encourage you to store your data and documents on machines that you don't control. The winner in this category is, of course, Google, which provides the world with Gmail, Google Calendar, Google Documents & Spreadsheets, Google Base, Google Notebook, Picasa and most recently JotSpot. Other companies provide similar services, including Yahoo, Hotmail and Flickr. These services are very attractive because of factors such as:
  • large (or even unlimited) disk storage
  • universal accessibility (i.e., you can get to your data from any machine you would like)
  • reliable data backup and other maintenance
  • highly functional and constantly improving software
All of these services seem to make their money by showing you advertising along with your data. That's OK, I guess. They all seem to provide decent password-based security so that only you and those you specify can access your data and documents. The main drawback to such services is simply that you are depending on somebody else to take good care of your data even though they have no direct financial or legal incentive to do so. Also, in nearly every case, there is no easy way to get your data out of these systems. You are pretty much locked in unless you do some clever hacking.

This last point would seem to be an especially serious obstacle, only it's not, at least for me and millions of others. I've thought about it many times, but I keep choosing to store my email, my files, my data, etc. on these services because of the factors listed above.

What do you think about this situation? What do you do?

2 comments:

Anonymous said...

As you said, the benefits are hard to resist, and I mostly don't resist them.

Many of the drawbacks could in theory be answered by something like a network-distributed RAID system. Your data would be spread across many sites on the Internet and stored redundantly. Therefore:

a) no one site controls your data
b) if one (or more?) sites go down, you still have full access to your data.

I've seen schemes like this proposed, but to my knowledge, the full-blown concept has never been implemented. It would essentially require many individual users to voluntarily donate disk space (and bandwidth), I guess.
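
As a rough illustration of the redundancy idea in this comment, here is a minimal Python sketch that splits a document into fragments for several sites and adds one XOR parity fragment, in the spirit of RAID 5. It is only a toy under stated assumptions: the function names (split_with_parity, recover, xor_blocks) are hypothetical, the "sites" are just list entries rather than real machines, and the scheme survives the loss of only one site. It does not encrypt anything.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def split_with_parity(data, n_sites):
    """Return n_sites fragments: n_sites - 1 data chunks plus one parity chunk."""
    if n_sites < 2:
        raise ValueError("need at least one data site and one parity site")
    k = n_sites - 1
    chunk_len = -(-len(data) // k)  # ceiling division
    # Pad with zero bytes so every chunk has the same length.
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    return chunks + [xor_blocks(chunks)]


def recover(fragments, original_len):
    """Rebuild the data when at most one fragment is missing (None)."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("XOR parity tolerates only one lost site")
    fragments = list(fragments)
    if missing:
        # The XOR of all surviving fragments equals the missing one.
        survivors = [frag for frag in fragments if frag is not None]
        fragments[missing[0]] = xor_blocks(survivors)
    # Drop the parity fragment, then strip the padding.
    return b"".join(fragments[:-1])[:original_len]


if __name__ == "__main__":
    document = b"my email, my files, my data"
    sites = split_with_parity(document, 4)
    sites[1] = None  # pretend one site went down
    print(recover(sites, len(document)) == document)
```

Tolerating more than one failed site would need a stronger erasure code than simple parity, which this sketch does not attempt.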

Wile E Quixote said...

It's like a lot of good ideas (such as natural-gas-powered vehicles): the main obstacle is the infrastructure required to get started. Admittedly, there are the SETI@home project and various peer-to-peer networks to use as examples. Also, the system would have to include secure encryption so that the various distributed fragments would be useless to anyone but their owner.
--Bill