As a sysadmin, I wear many hats. Some days I'm the janitor--I clean up discarded files on the file server and clear spam from the mail server. Other days I'm the maintenance man--I make sure all the servers are running smoothly and that any holes have been patched. Some days I'm the architect--I plan, organize, and design systems to suit our needs. Some of my favorite days, however, are the days I put on my rescue hat. When a machine is in trouble, the whistle sounds, I grab my rescue gear, and I run down the beach with my life preserver. OK, well, I made that last part up; I'm not David Hasselhoff and this isn't Baywatch, but when it comes to system recovery, I choose the other thing Germans love--Knoppix.
As a battle-hardened sysadmin, I've seen a lot of broken systems (some I broke, and some were broken for me). I've carried a number of rescue disks, including tomsrtbt and the LinuxCare Bootable Business Card, but over the past year or two, I've started to rely completely on Knoppix as an all-in-one rescue disk. Below are some real-life accounts of how I've saved some broken systems with just my Knoppix CD.
The first and only time I experimented with out-of-spec IDE cables was on my main workstation. The system was housed in a huge full-tower case with a motherboard that strategically placed the IDE connectors at the very bottom. The system had a lot of drives in it, and the only way I was able to connect the drives at the top of the case was to use IDE cables that were a few inches out of spec.
At first, everything seemed to work well; however, after some time and some heavy hard disk load, I noticed a few filesystem errors that culminated in not being able to mount the XFS root filesystem. Knoppix has had XFS support, including the complete set of XFS tools, for a long time, so I booted my Knoppix CD and was able to use its copy of xfs_repair to repair the damage with minimal data loss on the drive. I could then boot the system without having to reinstall (and I subsequently replaced those IDE cables).
One of my favorite stories of Knoppix recovery started when I was trying to reinstall grub on my laptop after moving around and resizing some partitions. The grub-install script didn't seem to work, so I went through the documentation to install grub to the MBR (Master Boot Record) using dd. What seemed like a good idea at the time was to follow grub's instructions for creating a boot floppy, only applying it to my hard drive with the command dd if=/usr/lib/grub/i386-pc/stage1 of=/dev/hda bs=512 count=1. In a way the command worked, in that it did copy grub to the boot code in my MBR (the first 446 bytes of the first sector of /dev/hda), but it also overwrote my partition table and boot signature (the remaining 66 bytes of that 512-byte sector). The result was that grub started to load but was confused because it could find no partitions on the drive.
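To make the byte layout concrete, here is a minimal sketch that replays the copy on scratch files rather than a real disk. The file names and the marker letters are made up for illustration, and the 512-byte stand-in for stage1 is just filler. It shows only that a copy limited to 446 bytes leaves the last 66 bytes of the sector alone; it is not a working way to install grub.

```shell
#!/bin/sh
# Demonstration on scratch files only; no real disk is touched.
set -eu

work=$(mktemp -d)
sector="$work/sector0.img"   # hypothetical stand-in for the first sector of /dev/hda
stage1="$work/stage1.img"    # hypothetical stand-in for grub's 512-byte stage1

# Fake sector: 446 bytes of old boot code ('B'), then 66 bytes ('P')
# standing for the 64-byte partition table plus the 2-byte boot signature.
{
    head -c 446 /dev/zero | tr '\0' 'B'
    head -c 66 /dev/zero | tr '\0' 'P'
} > "$sector"

# Fake stage1: 512 bytes of 'G'.
head -c 512 /dev/zero | tr '\0' 'G' > "$stage1"

# Write only the boot-code area. conv=notrunc stops dd from truncating
# the scratch file after the bytes it writes.
dd if="$stage1" of="$sector" bs=446 count=1 conv=notrunc 2>/dev/null

# Count the bytes in each region that hold the expected marker.
boot_bytes=$(head -c 446 "$sector" | tr -cd 'G' | wc -c | tr -d ' ')
table_bytes=$(tail -c 66 "$sector" | tr -cd 'P' | wc -c | tr -d ' ')

echo "boot code bytes replaced: $boot_bytes"
echo "partition-area bytes preserved: $table_bytes"

rm -rf "$work"
```

With bs=446 the script reports 446 replaced bytes and 66 preserved ones. Changing the block size to 512 in the scratch copy drops the preserved count to zero, which is what happened to my laptop.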
Before I gave up and reinstalled Linux over the top of my customized and fully configured install, I decided to research tools that might be able to find my partition. What I found was a great tool called gpart (short for guess partition). Gpart scans through a drive looking for partition signatures and pieces together a partition table based on what it finds. It just so happened that my Knoppix CD already had gpart installed, so I was ready to get to work.
Gpart works in a pretty straightforward way: you run gpart with a drive as an argument, and it will scan the drive and show you the partitions it finds. When you are ready to write the changes, you run gpart with the -W option, specifying which drive to write the partition table to, followed by the drive to scan. For my system, both were the same drive, so I ran gpart -W /dev/hda /dev/hda. From the output, it appeared that gpart was able to piece together my partition table for me. Once I wrote the changes, I rebooted back into the Knoppix CD, mounted my Linux filesystem, and saw that all of my files were still there!
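The gpart steps above can be sketched as a small script. The backup of the first sector is an extra precaution that was not part of my original session, and the backup path, the DRY_RUN switch, and the run helper are all hypothetical names introduced here. By default the script only prints the commands, since the last one rewrites the partition table.

```shell
#!/bin/sh
DISK=/dev/hda                  # drive from the article; adjust for your machine
BACKUP=/tmp/hda-sector0.bak    # assumed path; on a live CD, /tmp does not survive a reboot
DRY_RUN=1                      # 1 = only print the commands, 0 = really run them

# Print each command, and execute it only when DRY_RUN is 0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

recover_table() {
    # Save the current first sector so the write can be undone.
    run dd if="$DISK" of="$BACKUP" bs=512 count=1
    # Scan only: shows the partitions gpart guesses without changing anything.
    run gpart "$DISK"
    # Scan again and write the guessed table back to the same drive.
    run gpart -W "$DISK" "$DISK"
}

recover_table
```

Reading the scan output carefully before setting DRY_RUN to 0 is the whole point of the two-step approach: gpart is guessing, and a wrong guess written to disk is one more thing to repair.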
All that was left was to do what I set out to do at the beginning--install grub. Since Knoppix included grub-install, I was able to install grub from within Knoppix by mounting my root partition (/dev/hda3) read/write and running grub-install --root-directory=/mnt/hda3 /dev/hda, where /mnt/hda3 is the directory the partition was mounted on. After another reboot, I was reunited with my grub boot menu and was able to boot back into my Linux system.
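As a rough sketch, the reinstall boils down to a mount followed by grub-install. The --root-directory option of GRUB Legacy's grub-install expects a directory rather than a device, so the sketch assumes the root partition /dev/hda3 is mounted at /mnt/hda3. That mount point, the DRY_RUN switch, and the run helper are assumptions for illustration, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
#!/bin/sh
DISK=/dev/hda            # drive whose MBR receives grub (from the article)
ROOT_PART=/dev/hda3      # root partition (from the article)
MOUNT_POINT=/mnt/hda3    # assumed mount point for the root partition
DRY_RUN=1                # 1 = only print the commands, 0 = really run them

# Print each command, and execute it only when DRY_RUN is 0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

reinstall_grub() {
    run mkdir -p "$MOUNT_POINT"
    # grub-install writes files under boot/grub, so the mount must be read/write.
    run mount -o rw "$ROOT_PART" "$MOUNT_POINT"
    # Put grub's files under the mounted root and its boot code in the MBR.
    run grub-install --root-directory="$MOUNT_POINT" "$DISK"
    run umount "$MOUNT_POINT"
}

reinstall_grub
```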
I suppose the moral of this story is to be careful when you play around with the dd command and your MBR, but the secondary moral is that if you aren't careful when you play around with the dd command and your MBR, Knoppix has the tools to help repair your damage. It has been a lifesaver for many of my Linux and Windows systems alike. The next time you have to put on your rescue hat, I recommend giving Knoppix a try.
Kyle Rankin is a system administrator who enjoys troubleshooting, problem solving, and system recovery. He is also the author of Knoppix Hacks, Knoppix Pocket Reference, Linux Multimedia Hacks, and Ubuntu Hacks for O'Reilly Media.
Copyright © 2009 O'Reilly Media, Inc.