Drive Salvage

Triple-boot systems and memory-stick errors are the least of Ed's problems.


June 01, 2006
URL:http://www.drdobbs.com/embedded-systems/drive-salvage/188700845

Ed is an EE, a PE, and author in Poughkeepssie, NY. Contact him at [email protected] with "Dr Dobb's" in the subject to avoid spam filters.


One of my professors observed that EEs, regardless of whether they specialize in microcircuits or power generation, inevitably have relatives who ask them to fix televisions, so we might as well get some practical knowledge to show we hadn't wasted the last four years. He then devoted a spring-semester week to explaining how to diagnose and fix typical TV problems, a pleasant change from the usual theoretical rigor.

With that lesson in mind, here are some handy hard-drive rescue techniques. I'll start with a motivational tale, show how to impress your friends, then demonstrate why this is more relevant to embedded systems work than you might think.

Simple Recovery

Although I run SuSE Linux on my desktop systems, I still maintain two Windows boxes. One, a triple-boot desktop, normally runs a Debian-based distro with the RTAI real-time kernel for my CNC milling machine. During tax season, it boots Windows XP for TurboTax and streams Internet radio into my basement shop with SuSE Linux.

One day, the CNC controller didn't start up correctly. Linux records nearly every event somewhere and it seemed several sectors had gone bad on that ancient 40-GB drive. Yes, three operating systems can fit in 40 GB.

A separate file server provides NFS and SAMBA shares for all our data files, so the desktop boxes don't store any vital data and all systems have access to the same files. Backup is fairly simple, with not much to save beyond /etc and /home. Reinstalling Windows and its programs, though, remains a chore.

Rather than fiddle with the details, I simply store image copies of the disk partitions on the file server and update them every now and then. Restoring a drive requires nothing more than recreating the partitions and copying the images. Although there are many Windows hard-drive utilities available, the Linux alternatives provide fine-grained control, plus they're both free as in beer and free as in speech.

Many of the instructions for full-drive restoration assume you'll restore the image to an identical drive, but that's applicable only when you buy drives in bulk. The last time I checked, no two drives in my heap were identical. I swapped in a slightly newer 80-GB drive and started the restoration process by booting SystemRescueCd.

SystemRescueCd

SystemRescueCd (SRCd), available at www.sysresccd.org, is a bootable CD with the tools required to diagnose, back up, and restore a system. I've also put SRCd on a 256-MB bootable USB stick.

You can use the fb1280 kernel on any system with a decent display and, because most PCs have enough memory for the entire 100-MB CD image, you can also specify cdcache to speed things up. SRCd starts off with a garish character-mode display, but on the other hand, it's not particularly fussy about the underlying hardware, which is precisely what you want in a rescue program. With SRCd booted, net-setup eth0 configures nearly any NIC for either DHCP or a specific IP address. Run ifconfig to verify that it's set up the way you want, ping another box to be sure you have the network cable plugged in, then run /etc/init.d/nfs start to enable remote mounting.

You probably already have a PC set up with an NFS- or SMB-shared hard drive, and if not, you should. Listing One shows what you need to mount an NFS drive at /mnt/temp1 in the SRCd filesystem. Actually setting up the server is straightforward and well documented, at least after you understand UNIX-style file permissions.

run_qtparted                # create the partitions on the drive
net-setup eth0               # define IP address and suchlike
/etc/init.d/nfs start         # also fires up portmapper, lockd, etc
mount 192.168.1.2:/bulkdata /mnt/temp1        # mount the NFS share from the server
cd /mnt/bulkdata/Backups/Dim4550              # directory for these partition images
partimage restore /dev/hda1 WinXP-hda1.000 # restore the first partition
Listing One

Creating partitions on a drive is straightforward: run_qtparted starts a graphical partitioning program. I prepared an NTFS partition for Windows XP, a swap partition, an ext2 partition for Debian, and an ext3 partition for SuSE that filled the rest of the drive because I planned to have our daughter reinstall SuSE from scratch.

The partimage program saves and restores partition image files with optional compression to reduce the file size. It can use files on a different hard drive or partition, run in client-server mode with another partimage on the machine with the image files, or restore directly from an image file on CD or DVD. Using an NFS share works nicely with my setup.

If you want to restore a partition's original contents into a larger partition, create the new partition at its original size, restore the image file, then resize the partition with QTPartEd. Yes, it works with NTFS.

Filling a partition's unused space with binary zeros dramatically reduces the space required for a compressed image file. Mount the partition (mount /dev/hda1 /mnt/part), create the file (cp /dev/zero /mnt/part/dummy), delete it (rm /mnt/part/dummy), then proceed with the partimage backup.

Obviously, this level of detailed control doesn't scale well to large installations, but overall it's no more effort for me than commercial software. The benefit comes from knowing how things work, which pays off when the support-line phone rings.

Filling Disk Divots

A friend called in despair when her old Sony Vaio laptop wouldn't boot Windows ME. A neighbor had diagnosed the problem as a dead hard drive, but she was looking for a second opinion. She's retired and uses the thing just for e-mail, so dropping a few hundred bucks on a repair—or worse, a new system—wasn't in the cards. Reinstalling Windows and all the programs wasn't an option, either, for all the usual reasons.

I found that the laptop started to boot, spat out a few useless messages, then jammed solid. The erratically varying symptoms seemed to show that Norton Internet Security was badly damaged. At least the drive wasn't quite as dead as the neighbor had led us to believe.

I treat alien Windows boxes with the same cheerful nonchalance as I might handle random lumps of high-level radioactive waste and, in particular, I don't just plug them into my LAN to see what happens. As it turns out, Linux provides the best way to safely and securely repair Windows problems.

The object of the game was to save the contents of the drive so I could perform repairs without making things worse. Most disabled PCs have older, relatively small, drives and this laptop was no exception: It had a 12-GB drive sliced into four (!) separate partitions.

I backed up the partitions with partimage to ensure that, if all else failed, I could restore the drive to its original configuration. However, partimage stores image files in a format that's suited for backups, not repair work, so I needed another trick.

The dd utility (an abbreviation for either "data dump" or "disk destroyer," depending on whether you've ever swapped its if= and of= operands) makes an exact byte-for-byte copy of its input source at its output. Listing Two(a) shows how I backed up the master and partition boot records and first partition of the Vaio's drive.

  
(a)
net-setup eth0             # define IP address and suchlike
/etc/init.d/nfs start        # nfs starts up portmapper, lockd, etc
mount 192.168.1.2:/bulkdata /mnt/temp1         # mount the NFS share
mkdir "/mnt/temp1/Backups/Sony Vaio"          # set up a directory to hold images
cd "/mnt/temp1/Backups/Sony Vaio"             # save some typing
dd bs=512 count=1 if=/dev/hda of=drive.mbr # save drive's master boot record
dd bs=512 count=1 if=/dev/hda1 of=hda1.mbr # save partition 1 boot record
dd if=/dev/hda1 of=hda1.bin               # save wrecked Windows partition

(b)
mount -o loop "/mnt/bulkdata/Backups/Sony Vaio/hda1.bin" /mnt/loop
fsck -vV /dev/loop0          # fsck uses underlying loopback pseudo-device
ll /mnt/loop                 # the mounted filesystem looks like any other
Listing Two

A failing drive will return errors at bad sectors and fill those parts of the image file with invalid data. Restoring a damaged image to a new drive won't magically recreate the original bytes, so the best I could do was remove the damaged files.

After creating the backup images, I returned to my desktop system and mounted them using the Linux loopback pseudodevice to see what was broken. Listing Two(b) shows how that works for the first partition.

Loopback lets you manipulate the partition image with the same tools that work on an actual disk partition. Filesystem utilities such as fsck use the underlying pseudodevice, typically /dev/loop0 for the first file you mount this way. File-handling utilities, browsers, and suchlike refer to the mount point, which is /mnt/loop in Listing Two(b).

I managed to repair the filesystem errors by deleting the affected files, emptying the various directories that contain autostarting programs, and generally trimming the underbrush. The ClamAV virus scanner (www .clamav.net/) reported no hits, a pleasant surprise on a Windows box. All this fiddling ran much faster on the image than directly on the laptop, too.

Booting SRCd on the laptop again, I ran fsck on the drive to mark the bad sectors, deleted everything except the boot files required to start WinME, then used cp -au to copy everything from the repaired partition image back to the drive. You may not be able to use dd to restore an image file to a repaired partition, because it must maintain sector-by-sector alignment during the copy to keep the filesystem structures intact. Modern desktop drives can transparently remap failed sectors below the level of the filesystem, but that trick was beyond this drive's capabilities.

There's no good way to edit the Windows Registry using Linux, so I eventually had to boot WinME in Safe Mode on the laptop, uninstall several programs, then blowtorch a few recalcitrant programs with regedit. After several iterations of this, the laptop finally booted cleanly and ran well. My friend said that she didn't use the missing programs.

I transferred the compressed partition images to DVDs so I can restore the system with somewhat less effort when that drive finally fails completely. Regular backups would be nice, but it's out of my hands.

Speaking of images, the command dd if=/dev/cdrom of=image.iso makes an image backup of a CD. You can then burn the image to a CD-R to make a duplicate or mount it using loopback to read its contents, subject to the usual copyright and EULA issues.

After you're up to speed with PC restoration, try your hand at some embedded systems work.

Camera Crash

My digital camera crashed while taking pictures at the annual Family Christmas Gathering. Unlike the episode I described in the December 2004 column, the camera popped up a MEMORY STICK ERROR after snapping the 44th picture of the day. I inserted another Memory Stick and continued the mission.

Back home, I inserted the crashed Memory Stick into a USB card reader and found that, while it was recognized as a USB Mass Storage device, it had no partition table. I used dd to copy the entire 128-MB Memory Stick to an image file.

Browsing through that file with khexedit showed that the camera had zeroed much of the partition table and trashed some of the FAT area, but left the JPG files intact. The block of binary zeros at the start of the file evidently confused the grep text-searching utility into returning offsets that didn't match up at all with the values shown by khexedit.

Running the file through strings -t x MemStick.bin | grep Exif returned the correct offsets. As before, I imported those lines into a spreadsheet, rounded the values down to multiples of the 512-byte sector size, produced dd commands that extracted 3-MB hunks of Memstick.bin into separate JPG files, then fed them to ImageMagick's convert utility to extract the JPG images.

The first 43 images were perfect, the bottom of the 44th image was missing, and 8 deleted images from a previous session also appeared. The fact that deleting a file doesn't actually remove its data sometimes comes back to haunt folks with something to hide, but in this case it was okay.

Those restored pictures went out on a CD of the Gathering and all was well.

Continuing Education

Any embedded system large enough to require an operating system also uses a filesystem to organize external storage. Smaller, data-intensive systems without a formal OS, such as a single-chip MP3 player with a voice recorder, may write sound files into the same large Flash memory used for read-only music. Even deeply embedded systems without removable storage generally have a filesystem of some sort on their Flash program memory.

Below the level of the filesystem, partitions divide storage into chunks that resemble disk drives. For example, the cfdisk utility lists nearly 100 different partition types from Amoeba through XENIX, some of which you'll actually find in contemporary systems.

Understanding how all this works may not be on your to-do list, but knowing how to rebuild a filesystem or recover critical data from a crashed Flash device may well come in very, very handy. With the Linux kernel worming its way into embedded systems, knowing how to drive UNIX-oid tools and work with Windows filesystems can also be valuable. Just start with SRCd and work your way up.

Last Tab

I have outlived four different commercial backup products that trapped my data in obsolete, unreadable files on media that are no longer supported. Perhaps I'm a slow learner, but it seems that storing backups in a known file format on standard media has a lot to recommend it.

If you plan to (or must) do this stuff more than occasionally, get a two-port KVM switch. Plug your favorite keyboard, video monitor, and mouse into the switch, then plug your usual desktop PC into one port. When it's time to rescue a PC, plug it into the other port and switch back and forth between PCs without cluttering your (physical) desktop. Search for F1DK102P at www.belkin.com for one example of many.

When hard drives cost $0.30 per gigabyte it's hard to justify not having enough storage. Pick up a huge hard drive for the price of two tanks of gasoline, install it in your desktop box, set up a file share, and be prepared.

Yes, I've tried Wine and Win4Lin and, no, neither really gets the job done, so I just boot Windows on a separate box, with TightVNC (www.tightvnc.com) relaying the session directly to my comfy chair. It's not clear that the Windows EULA actually permits this, though, when you read the fine print.

Speaking of EULAs, it's also not clear that you may make more than one backup copy of a Windows partition or create multiple application backups. Restoring the images to a different drive or another partition will almost certainly clobber any DRM-encrypted music you've been foolish enough to purchase.

More info on the EMC project that drives my milling machine is at www.linuxcnc.org. Novell's SuSE is at www.novell.com/linux. Knoppix provides the same rescue features as SRCd, along with everything else in a full Linux distro, at www.knoppix.net. Get more partimage info at www .partimage.org.

Speaking of Windows ME, alert reader Trevor Harmon reminds me that it still allowed direct, user-mode I/O port control. Read more on FAT and the Microsoft patents at en.wikipedia.org/wiki/File_Allocation_Table.

When other drive recovery methods fail, you can find inspiration at www.offliners.com/movshdwreck.mov.

Terms of Service | Privacy Statement | Copyright © 2024 UBM Tech, All rights reserved.