Dr. Dobb's Bloggers

Drowning in Data

October 22, 2008

My partner Nancy had been having chronic, hard-to-diagnose problems with her computer and I was getting tired of looking over her shoulder or bumping her out of her chair to troubleshoot her machine.


At the same time, we were about a month away from buying me a new portable, at which time Nancy would inherit my desktop computer and her computer would move across the driveway to the house for me to use when I needed to get away from the distractions of the office and get some work done. So I thought, why not accelerate the process and switch computers now? Then I could troubleshoot Nancy's computer's problems on my desk and she could have a new(er) computer right away.


I created a user account for me on Nancy's machine and for her on my machine, and schlepped files over to these new accounts via our file server. That way if I missed moving something over, Nancy could always bump me off her old machine and log in as herself and still have her familiar setup and all her files. And eventually I'd get rid of the redundant files and, finally, the old user accounts.

None of that was particularly smart, and it got worse when the portable arrived and I set it up.

I'm pretty fanatical about filing. I name files to indicate the project and date they are part of, I create folders/directories for all projects and parts thereof, and keep all my research materials there. But I was doing this computer swap in free moments while trying to meet deadlines and deal with the usual daily emergencies, so the upshot is that I now have nice clearly labeled folders of nice clearly labeled files on my portable, in my account on either of two desktop machines, and on the file server. Many are duplicates, and all I need to do is reconcile these. One of these days.


It's a mess.
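
For a mess like this one, a first pass at reconciliation could be to group files by content rather than by name. The following is only a minimal sketch, not anything used in the original cleanup: the function names `file_digest` and `find_duplicates` are made up for illustration, and it assumes every copy (portable, desktop accounts, file server) is reachable as an ordinary directory path.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Map each content digest to the paths sharing it (groups of two or more)."""
    by_digest = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    by_digest[file_digest(path)].append(path)
                except OSError:
                    # Unreadable file (permissions, vanished mid-scan): skip it.
                    continue
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_duplicates` with a list of folder paths returns groups of byte-identical files. Deciding which copy to keep is still a manual job.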


So I guess the moral of this story is, never hire me to manage your files.

But another moral is, even when you're just managing megabytes of data, things can get out of control quickly. 


For dealing with petabytes, you need an entirely different approach.

Who's dealing with that much data? Soon, maybe all of us. When I checked the results of a recent Slashdot poll at 55,832 votes, the most popular answer to the question, "How much storage will you be using ten years from now?" was 100 TB - 1 PB, closely followed by "Depends... how much you got?"


At some point the file systems we've used for so long won't measure up to the demands of all this data. Apple is building read and write support for Sun's 128-bit ZFS file system into its Snow Leopard release of OS X Server. ZFS once stood for Zettabyte File System, but now it's an orphan acronym, like IBM. IDG (which I think is still a regular acronym) predicts that digital data will be generated at a rate of one zettabyte (1,000 exabytes, or 1,000,000 petabytes) per year by 2010.
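
The unit arithmetic in that prediction is easy to check. Here is a small sketch, assuming decimal (SI) units where each step up is a factor of 1000; the `UNITS` table and `convert` helper are illustrative names, not from any library.

```python
# Decimal (SI) storage units, expressed as powers of 1000 bytes.
UNITS = {"KB": 1, "MB": 2, "GB": 3, "TB": 4, "PB": 5, "EB": 6, "ZB": 7}


def convert(value, from_unit, to_unit):
    """Convert a quantity between decimal storage units."""
    return value * 1000 ** (UNITS[from_unit] - UNITS[to_unit])


print(convert(1, "ZB", "EB"))  # 1000
print(convert(1, "ZB", "PB"))  # 1000000
```

Binary units (powers of 1024) would give slightly different figures; the decimal convention is used here because it matches the numbers quoted above.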


In the shorter term, there are some people confronting the challenges of massive storage right now. There will be a competition at the SC08 supercomputing conference (November 15-21 in Austin, TX; www.sc08.com) showcasing approaches to making the best use of storage in high performance computing.


But in the meantime, I think I just have to roll up my sleeves and clean up my filing mess.
