
Walter Bright

Dr. Dobb's Bloggers

Pack Rat is Liberated

April 27, 2008

I admit it. I'm a pack rat. I can't throw anything away, even if I'm certain I'll never use something again. I envy people who are able to clear out their garages and get rid of stuff.

But I have made some progress. My crates of 5.25" floppies are now all copied to CD (good thing I did that a few years ago; a high percentage of them are unreadable now). I've got (almost) all of my music on a hard disk now, where it takes up very little space and is instantly available, including my disco record collection (do I really have one of those?). I spent many hours scanning in the photographs, so I have them on disk, too.

But one thing eluded me: all those heaps of papers. I have 30 years' worth of papers. Documents, magazine clippings, research papers, manuals for obsolete software, conference proceedings, specifications, standards documents, ... Piles of them, in files, boxes, cabinets, and tubs. Some of it is wrapped in plastic because it got damp once and mildewed. I've gone through them many times, trying unsuccessfully to bring myself to toss them out. I thought if this stuff was on the internet, I wouldn't feel compelled to keep it. But it's mostly pre-internet stuff, and Google cannot find it.

I tried scanning in some of the papers, but it was an agonizing process. It takes about a minute a page to scan, and you have to babysit the machine, loading in each paper page by page and hoping the PDF converter won't crash this time and force a do-over on a long document. I gave up.

The piles grew higher.

Last week, being bored and uninspired, I tooled around on the internet looking at scanners. I found one that has, by gawd, a hopper on it. It said it would scan 50 pages at a time, both sides at once, a few seconds a page. This was too good to be true. I read every review on it, and nearly all were very positive. The price was good, too, about $830. The only downside was it didn't come with OCR software (to turn images into text). Arggh. But it sounded worth a try, and I could buy some OCR software for $200.

I waited a week for the "3 day" shipping to deliver it. It was smaller than I expected; it didn't look like it could possibly work. I struggled for an hour installing all the software that came with it (it turns out that most of it was just crapware that clotted up the arteries on my computer and did nothing useful). One of the items was OCR software that unfortunately said it would run for just 30 conversions and then ding me for $400 for the "unlocked" version. No thanks.

The reviews, product poop sheet, and documentation all said there was no OCR software included (except for the trial version). But I poked around on the menus, and buried in one was "create editable document." I selected that, scanned a sheet, and sure enough, it OCR'd it just dandy (it even has an option to recognize the special German characters). Go figure: the people who stuck this together didn't even know what their own software does.

The gizmo works! I load a wad of papers in the hopper, and zip zap zup it reads each one in about 3 seconds for both sides. (The OCR pass takes a little longer.) So far, I've stuffed about 6 linear feet of papers into its maw. They're all going into the recycling bin! Yay! All that crapola fits onto one CD-ROM. Of course, I haven't made much of a dent in the boxes of papers, and I'll never get them all scanned in. But it's going to be a big help. Best of all, with them on a hard disk they are searchable, and always instantly available. Now, if I ever need to look up the LIM Expanded Memory Specification, I know just where to find it.

It's too bad the scanner won't work for books unless the backs are sliced off. But at least the books look nice on the shelf, unlike those nasty boxes of papers.

Now, if I can only figure out what to do with those piles of CD-ROMs...

