Copyright © 2007 released under GPL2. Will change to GPL3 when GPL3 is released. Joseph Mack.
How I recovered a w2k disk with an NTFS filesystem, after Partition Magic (R) froze during a resize operation. Fun with Partition Magic v7 and v8.
In Unix, you can make a bootable copy of your disk by copying (or updating) all your files to a target disk. I do this once a week. I also back up all new files, at the end of a work session, to another disk; e.g. all successful compiles result in a copy of the files going to a backup disk. For Windows, a cron job backs up the text files in \Documents and Settings every hour to the Samba server. Since disks are unreliable, you need a way of continuing work with minimal interruption, when the guaranteed disk crash arrives. My way is to pick the backup disk off the shelf, boot and copy over the backups since the last disk copy.
Microsoft doesn't have any way to do this; copy and xcopy don't copy all the files because of 8.3 filename incompatibilities and certainly don't produce a bootable disk.
To produce a backup bootable Windows disk, the best solution I've found starts with repeated alternating rounds of defrag'ing the disk (in safe mode, which I believe works better because fewer driver files are open, allowing them to be moved), and filling the disk with nulls using nullfile from Matthias Jordan. Running nullfile has two effects
On convergence (i.e. when the defragger has no further effect on the filesystem, about 4 iterations gives you all you're going to get) I then dd copy the windows disk to a file on my backup disk with a command like
dd if=/dev/sda of=/backup_dir/machine_name.os_name.original_disk_name.date
(the output filename might be lumpy.w2k.hda.200706)
Since dd is a Unix utility you need the Windows disk (unmounted) in a Unix machine. There's a couple of ways of doing this.
On a dual boot machine, I boot to Linux and dd the windows partition. If you have a dual boot machine, you also need to copy the partition table with
dd if=/dev/sda of=/backup_directory/mbr.machine_name.os_name.date bs=512 count=1
|You do have a backup of the mbr's for all your machines don't you?|
If your disk is Windows only and you copy the whole disk, then you already have the partition table info.
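A saved mbr is easy to sanity-check before the day you need it: it should be exactly 512 bytes long and end with the boot signature bytes 55 aa. The following is a minimal sketch, run against a scratch file rather than a real disk. The function name check_mbr and the scratch data are made up for illustration.

```shell
#!/bin/sh
# check_mbr FILE: succeed if FILE looks like a saved MBR sector,
# i.e. it is exactly 512 bytes long and ends in the 0x55 0xAA signature.
check_mbr() {
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -eq 512 ] || return 1
    # od prints bytes 510 and 511 as two hex values
    sig=$(od -A n -t x1 -j 510 -N 2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Scratch file standing in for /dev/sda (hypothetical data, not a real disk):
# four zeroed sectors with the boot signature written at bytes 510-511.
scratch="$tmpdir/fake_disk"
dd if=/dev/zero of="$scratch" bs=512 count=4 2>/dev/null
printf '\125\252' | dd of="$scratch" bs=1 seek=510 conv=notrunc 2>/dev/null

# Same command as in the text, with the scratch file as input.
mbr_copy="$tmpdir/mbr.machine_name.os_name.date"
dd if="$scratch" of="$mbr_copy" bs=512 count=1 2>/dev/null

if check_mbr "$mbr_copy"; then
    echo "mbr backup looks sane"
else
    echo "mbr backup is suspect"
fi
```

The check says nothing about whether the partition entries are right, only that the file is the right shape to be written back with dd.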
I then compress the Windows drive, making an md5sum as well (don't know that I need this, but what the hell).
beaver:/backup_dir# md5sum lumpy.w2k.hda.200706 > lumpy.w2k.hda.200706.md5sum; touch -r lumpy.w2k.hda.200706 lumpy.w2k.hda.200706.md5sum
beaver:/backup_dir# gzip lumpy.w2k.hda.200706
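The md5sum earns its keep when the image is decompressed again: the checksum of the decompressed stream can be compared with the recorded one before anything is written to a target disk. Here is a minimal sketch of that round trip, using a small file of zeros as a stand-in for the disk image. The function name verify_image is made up for illustration.

```shell
#!/bin/sh
# verify_image IMAGE.gz SUMFILE: succeed if the decompressed image matches
# the md5 recorded (in md5sum's "hash  filename" format) before compression.
verify_image() {
    recorded=$(cut -d ' ' -f 1 "$2")
    actual=$(gzip -dc "$1" | md5sum | cut -d ' ' -f 1)
    [ "$recorded" = "$actual" ]
}

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1

# Stand-in for the dd image (hypothetical 1 MB of zeros, not a real disk).
img=lumpy.w2k.hda.200706
dd if=/dev/zero of="$img" bs=1024 count=1024 2>/dev/null

# Record the checksum, then compress; gzip replaces $img with $img.gz.
md5sum "$img" > "$img.md5sum"
gzip "$img"

if verify_image "$img.gz" "$img.md5sum"; then
    echo "image verified"
else
    echo "image does NOT match recorded checksum"
fi
```

Because the check reads the gzip stream directly, there is no need to find room for an uncompressed copy of the image just to verify it.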
None of this counts until you've reversed the process onto another disk and shown that it behaves like your original disk. Since you're doing a dd onto the target disk, the target disk must have no badblocks - run badblocks on it to be sure. You can check that the backup disk is sensible by mounting it using the offset option of mount (see Mounting disks with Linux's loopback device for details; also google for "Disk images under Linux"). Mount deals with partitions, not files or whole disks. For a standard drive of 63 sectors per track and 512 bytes per sector, the mount command is
beaver:/backup_dir/# mount -o loop,offset=32256 -t ntfs lumpy.w2k.hda.200706 /mnt
or
beaver:/backup_dir/# mount -o loop,offset=32256 -t ntfs /dev/sda /mnt
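The offset of 32256 is just the partition's starting sector multiplied by the sector size: the first partition starts after the first track, at sector 63, and 63 * 512 = 32256. A minimal sketch of the arithmetic follows; the function name partition_offset is made up for illustration. For partitions other than the first, the starting sector would have to be read from the partition table, for example from the sector listing of fdisk -lu.

```shell
#!/bin/sh
# partition_offset START_SECTOR [BYTES_PER_SECTOR]
# Print the byte offset to hand to "mount -o loop,offset=...".
# The sector size defaults to 512 bytes if not given.
partition_offset() {
    echo $(( $1 * ${2:-512} ))
}

# First partition on a classic 63 sectors/track layout starts at sector 63.
partition_offset 63 512    # prints 32256
```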
You can then look for files and directories. Surprisingly to me, you can see all the directories within a few seconds of starting the dd copy. All the directory information must be at the start of the disk.
One of the problems with nullfile is that it only generates 4G files (this may be a 32-bit Windows problem). If you have 20G of unused space on the disk, you have to start nullfile 5 times. With 4 rounds of defragging, the nullfile runs take a while and are a pain to start off manually (I guess I could run a .bat file, but I hadn't thought of that - I try to avoid Windows as much as possible). As well, 20G of null space on a disk takes time to copy and compress and I'd like to reduce the size of the unused part of the disk. What I've been doing is to resize the partition with Partition Magic (and then defrag/nullfile). This has been working nicely for several years.
I resized the disk so that the unused space would be less than 4G. The resize froze in the middle.
That was it.
After checking that the numbers on the screen weren't changing for 10mins and the mouse was no longer responding, I switched off the machine. On rebooting I got the "INACCESSIBLE_BOOT_DEVICE" error. I've got this error on moving a working disk from one machine to another; the disk boots fine on returning it to the original machine. Microsoft's web page on this error seems designed to make sure no-one ever recovers from it.
Miles Comer's NTLDR is missing didn't help either.
On looking with Partition Magic, I saw strange numbers and descriptors for the disk. PM's partition info said
"Critical 19 At end of something"
Useful diagnostic huh?! Anyone working for me would be out the door in record time for that.
I was running Partition Magic v7. There was no documentation for Critical Error 19. In fact there was no Critical Error 19 at all - it didn't exist, (at least in the minds of the Partition Magic people). There was nothing on Critical Error 19 with Google; amazingly I was the first and only one to see it.
Apparently PM rearranges the disk without any journalling, so that a crash leaves the disk in a deranged state. This would be OK for a beta version sent to the QA guys, before market testing, but not for a released version and certainly not for v8. You should be able to pull the power cord on a resize and have the disk in a state where the resize can be recovered. Let's hope these people aren't writing software for medical devices, air bags or the Space Shuttle.
What about my backups, you say? A reasonable question. This machine was rarely used and the user said that there was nothing new installed since the last backup. After the crash, he remembered that there was a whole lot of important new files on the machine. From here on in I'll backup his whole \Documents and Settings tree with my cron job.
Here I made my first mistake. Thinking that the lack of documentation on Critical Error 19 was an inadvertent oversight in an otherwise competent team of programmers, which would be remedied in the next version of PM, I shelled out for PM v8.
Waiting for PM v8 gave me an opportunity to do all the things you're supposed to do when disaster strikes and when almost any move will make things worse.
The big change from v7 to v8 is (drum roll)
Yes folks, a decade of hard work went into that upgrade and is fully deserving of a change in major number. There is no change in functionality and no documentation on Critical Error 19. With proprietary software, you really do get what you pay for.
PM v8 is installable on the hard disk. There PM uses the OS's capabilities to make a nice screen and from there PM can run the OS's chkdsk and defrag (rather than its own). You can also inspect a disk in a usb enclosure, something you can't do with the DOS version of PM. However you can't do much with a disk with open files, so the only use for PM installed on the harddrive is to look at a D: disk. If you want to do anything useful with PM on C: you have to make a set of DOS rescue disks. There is no bootable PM CD although instructions to make one are available on the internet. Whether you start with a bootable CD or DOS rescue disks, you're back where you were with PM v7 and you're running under DOS (I guess PM didn't want to pay for a license to boot up under Windows).
Upgrading to v8 had got me nowhere and only enriched the PM people.
Ha, ha, ha. Just kidding. (Actually I did get a useful piece of information, but you have to be persistent.)
Symantec is the distributor for PM. Symantec has a webpage with FAQs etc (none of which address likely problems) and after navigating a few screens, you get the choice of help by phone (costs money), e-mail (free, unlikely to be replied to and unlikely to address the problem) or a chat line (free). I took the chat line. The page asks you the country you're from, presumably so they know whether to talk to you about baseball, cricket or soccer and you're asked your problem. I stated that my resize had frozen, on reboot I got the "INACCESSIBLE_BOOT_DEVICE" error, from PM I'd got "Critical Error 19 At end of something" - how do I recover? Then you're put through to a person with a female Indian name who seems to be following a script ("Good afternoon Mr Mack, how may I help you today?") even though you've just detailed your problem. I mouse swiped the problem description.
On finding that my information came from the DOS rescue disk version of PM, I was told to boot from the cdrom ISO and come back again when I'd done that. I could not get the person to address the error 19 problem and my attempts to do so met with "It's been a pleasure working with you today" and disconnection. The interaction reminded me of ELIZA talking to PARRY (for a demo of ELIZA try ELIZA - a friend you could never have before).
I hadn't found an ISO on the cdrom, so I read the quick start and the full documentation from end to end, to see if I'd missed anything on the install and my problem. There is no ISO on the cdrom, but there are pages on the internet to tell you how to make one from the DOS rescue files. You wind up with the same setup as booting from the DOS floppies. I'd been sent on a fool's errand, by someone who had no intention of helping.
After achieving peace of mind and reading all the documentation, I tried again.
This time I got someone who asked which disk I resized. C: of course.
How many people have multiple disks? I tried using D: like /usr more than a decade ago with WinNT, only to find that any programs I put on D: the OS didn't know about. Windows is a one disk OS.
I was told that of course I got the Critical Error 19. You aren't supposed to resize C: and if you do, you'll get error 19. I pointed out that this was not in the documentation, that the documentation had no restrictions on which disk you can resize and there was no documentation on error 19. I was told to send my problem and the output of partinfo to firstname.lastname@example.org. I heard nothing back in what is now several weeks.
It took a few days to achieve peace of mind and I tried a 4th time. This time, seeing a pattern (they aren't going to address the problem no matter what), I decided to stick to the point of my posting, not get distracted, and record the conversation.
Shailendra(Thu May 31 2007 13:45:09 GMT+0000 (GMT))> Hello Mr._Joseph_Mack. My name is Shailendra.
Shailendra(Thu May 31 2007 13:45:18 GMT+0000 (GMT))> Thank you for contacting Symantec Live Technical Support. Please make a note of the Chat Request Id  for this interaction.
Shailendra(Thu May 31 2007 13:46:03 GMT+0000 (GMT))> I understand from your message that you are having issue with resizing froze and you are getting the critical error 19. Am I right?
Mr._Joseph_Mack(Thu May 31 2007 20:47:55 GMT+0000 (GMT))> yes
Shailendra(Thu May 31 2007 13:48:07 GMT+0000 (GMT))> What is the version of Partition Magic?
Mr._Joseph_Mack(Thu May 31 2007 20:48:20 GMT+0000 (GMT))> 8
Shailendra(Thu May 31 2007 13:48:54 GMT+0000 (GMT))> Please let me know what exactly happened?
Mr._Joseph_Mack(Thu May 31 2007 20:49:48 GMT+0000 (GMT))> I think my summary covered what happened
Shailendra(Thu May 31 2007 13:50:42 GMT+0000 (GMT))> Please let me know have you resize the C and D drive?
Mr._Joseph_Mack(Thu May 31 2007 20:52:15 GMT+0000 (GMT))> I resized C, I followed the instructions for resizing C in the documenation (make sure don't move out of the bootable area). It seems from the documentation that resizing C is a normal thing to do.
analyst Shailendra has been temporarily disconnected. Please wait while the analyst attempts to reconnect.
We are experiencing higher than usual service times. Please wait and an analyst will be with you shortly.
analyst Shailendra has entered room.
Shailendra(Thu May 31 2007 13:53:41 GMT+0000 (GMT))> Have you backed up the drive C?
Mr._Joseph_Mack(Thu May 31 2007 20:55:28 GMT+0000 (GMT))> I have a drive with critical error 19 that won't boot and for which I have no documentation. What do I do to recover the drive?
Shailendra(Thu May 31 2007 13:57:55 GMT+0000 (GMT))> Have you taken the back up of drive C?
Mr._Joseph_Mack(Thu May 31 2007 20:59:17 GMT+0000 (GMT))> If I answer that will you answer the questions I've asked?
Shailendra(Thu May 31 2007 13:59:50 GMT+0000 (GMT))> Yes, please let me know.
Mr._Joseph_Mack(Thu May 31 2007 21:01:00 GMT+0000 (GMT))> OK I started asking questions first, so how about you answer my questions and then I'll answer yours.
Shailendra(Thu May 31 2007 14:03:13 GMT+0000 (GMT))> For recovering the computer you need to have the back of drive C, so that we can recover the computer
Shailendra(Thu May 31 2007 14:03:22 GMT+0000 (GMT))> We need to have the back up of drive C
Mr._Joseph_Mack(Thu May 31 2007 21:04:12 GMT+0000 (GMT))> You're avoiding answering my questions.
Shailendra(Thu May 31 2007 14:05:27 GMT+0000 (GMT))> I am answering your question how to recover the computer.
Shailendra(Thu May 31 2007 14:05:42 GMT+0000 (GMT))> That is what you asked me, how to recover the computer?
Mr._Joseph_Mack(Thu May 31 2007 21:05:53 GMT+0000 (GMT))> My question is how to recover the drive.
Shailendra(Thu May 31 2007 14:06:50 GMT+0000 (GMT))> To recovering the drive you need to have the back up of drive C.
Mr._Joseph_Mack(Thu May 31 2007 21:07:27 GMT+0000 (GMT))> That's how to recover the computer, not how to recover the drive.
Mr._Joseph_Mack(Thu May 31 2007 21:07:55 GMT+0000 (GMT))> Is the drive recoverable, or has PM crashed it irrecoverably and you won't tell me.
Shailendra(Thu May 31 2007 14:09:33 GMT+0000 (GMT))> All right. Please stay online while I move your session to my supervisor who will assist you further. This might take me 40-45 seconds. Is that okay with you?
Mr._Joseph_Mack(Thu May 31 2007 21:11:57 GMT+0000 (GMT))> that would be great thanks
Shailendra(Thu May 31 2007 14:10:08 GMT+0000 (GMT))> Please wait, while the issue is escalated to another analyst.
analyst Shailendra has left room.
analyst Vinod has entered room.
Vinod(Thu May 31 2007 14:11:46 GMT+0000 (GMT))> Welcome Joseph.
Vinod(Thu May 31 2007 14:12:47 GMT+0000 (GMT))> I understand from your previous interaction that you are unable to boot the computer after resizing the C: partition. Right?
Mr._Joseph_Mack(Thu May 31 2007 21:13:08 GMT+0000 (GMT))> correct
Vinod(Thu May 31 2007 14:14:41 GMT+0000 (GMT))> Do you have Partition Magic CD?
Mr._Joseph_Mack(Thu May 31 2007 21:14:57 GMT+0000 (GMT))> yes.
Vinod(Thu May 31 2007 14:16:27 GMT+0000 (GMT))> Please boot the computer from Partition Magic CD.
Vinod(Thu May 31 2007 14:16:54 GMT+0000 (GMT))> Then use the PTedit utility to edit partition table.
Vinod(Thu May 31 2007 14:17:08 GMT+0000 (GMT))> Please check if the C: partition type is shown as PQRP.
Vinod(Thu May 31 2007 14:18:08 GMT+0000 (GMT))> If it is PQRp, please change it to NTFS or FAT32(Select the correct one.)
Mr._Joseph_Mack(Thu May 31 2007 21:20:22 GMT+0000 (GMT))> I can boot from DOS disks. I don't have the machine up right now. I didn't see PQRp (I have seen this before and fixed it using postings I've seen on the internet). If it isn't PQRp, what should I do then?
Vinod(Thu May 31 2007 14:23:17 GMT+0000 (GMT))> If it is not shown as PQRP, then nothing can be done.
Mr._Joseph_Mack(Thu May 31 2007 21:23:51 GMT+0000 (GMT))> so the disk is irrecoverable? Is there any documenation on error 19?
Vinod(Thu May 31 2007 14:25:47 GMT+0000 (GMT))> THere is no documentation is available regarding error 19.
Mr._Joseph_Mack(Thu May 31 2007 21:26:19 GMT+0000 (GMT))> OK so the disk is irrecoverable and there's no documentation?
Vinod(Thu May 31 2007 14:28:15 GMT+0000 (GMT))> I regret but no documentation is available. These errors are caused by corrupt Partition Table.
Mr._Joseph_Mack(Thu May 31 2007 21:29:32 GMT+0000 (GMT))> OK. But I can see the partition with fdisk, so the start, end and size are OK. If not I can edit them by hand.
Vinod(Thu May 31 2007 14:31:37 GMT+0000 (GMT))> I suggest you to use any Data Recovery software to recover your data.
Mr._Joseph_Mack(Thu May 31 2007 21:32:26 GMT+0000 (GMT))> OK thank you
Vinod(Thu May 31 2007 14:34:26 GMT+0000 (GMT))> Is there anything else I can help you with?
Mr._Joseph_Mack(Thu May 31 2007 21:34:39 GMT+0000 (GMT))> no that's it. THanks
PM thinks I don't know to use my backups.
Vinod seemed to know what he was talking about and the problem is likely to be the partition table. If the files are recoverable with data recovery software, they're still there.
I don't know why I didn't think of this before, but I decided to look at the disk in an external usb enclosure, under Linux. This ntfs work disk mounted perfectly, with the original partition size and all files visible. The 15G disk appeared to be full, although I knew that there were only 8G of files on the disk, so something was amiss at least to Linux. However I was able to pull off all the user's files. If I'd tried this first, I wouldn't have had to deal with Symantec or buy the updated version of PM.
At this stage I was home as far as the user was concerned. I wondered if I could recover the disk any further. Assuming there was something wrong with the partition table, I noted the cylinder values and copied over a w2k mbr (512byte).
dd if=/backup_dir/valid_w2k_mbr of=/dev/sda
I then ran fdisk on the device, returning the cylinders and fs type to the correct values. This didn't make any difference to Linux: the disk still appeared to be full. To Windows/PM the disk was just as hosed as before. The partition table wasn't the problem.
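The partition type that fdisk shows can also be read straight out of a saved 512-byte mbr file, which is handy for comparing a good copy with a suspect one. The four 16-byte primary partition entries start at byte 446 and the type is the fifth byte of each entry. This is a minimal sketch on a synthetic sector, not the author's disk. The function name part_type is made up for illustration. 07 is the id fdisk lists for HPFS/NTFS, and fdisk's type list shows 3c as PartitionMagic, which is presumably what PTedit labels PQRP.

```shell
#!/bin/sh
# part_type MBRFILE N: print the type byte (two hex digits) of primary
# partition N (1-4). Entry N starts at byte 446 + 16*(N-1); the type
# byte is at offset 4 within the entry.
part_type() {
    off=$(( 446 + 16 * ($2 - 1) + 4 ))
    od -A n -t x1 -j "$off" -N 1 "$1" | tr -d ' \n'
    echo
}

tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

# Synthetic sector (made-up data): all zeros, with type 0x07 in entry 1.
mbr="$tmpdir/fake_mbr"
dd if=/dev/zero of="$mbr" bs=512 count=1 2>/dev/null
printf '\007' | dd of="$mbr" bs=1 seek=450 conv=notrunc 2>/dev/null

for n in 1 2 3 4; do
    echo "partition $n: type $(part_type "$mbr" "$n")"
done
```

On a real backup, the mbr variable would point at a file such as the one saved with dd bs=512 count=1 earlier.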
Somewhere in here I did something that made the disk look like it only had about 45MBytes of files rather than 8G. I can't remember what I did there.
DOS/Windows is quite touchy about the location of partition boundaries; I think they have to be on a cylinder. Linux seems to be able to put the boundary anywhere. When I run PM on a dual boot laptop, it crashes because it can't determine the drive letters. I assume that means it can't determine the locations of the ext3 partitions. PM instead should announce that it can't read primary partitions 2,3 and 4 and then let me work on the first partition, the C: drive. Symantec has its own version of fdisk called gdisk, part of the ghost package, that I haven't used either.
I decided to try the Windows fdisk. Turns out there isn't a Windows fdisk anymore. Microsoft doesn't want people out in user land doing such operations themselves - they have to come to Microsoft for that - don't you worry your pretty little head about partitioning. Windows will partition a disk from the startup cd, apparently the only time you need to partition a disk. I didn't realise at the time that the partitioning is followed immediately by a format. PM has a partition option (presumably it's running the Windows partitioning functionality), so I allowed it to partition my disk. I was only allowed to partition the whole disk. Partitions are measured in MBytes (and not integers like cylinders), so I noted the size in MBytes, to later return the partition to the size of the filesystem. Partitioning only writes 512bytes (one block) so it shouldn't take too long. I started the PM partitioning by hitting some button, and bars started moving. It showed progress for the partition operation and then to my surprise, since I hadn't asked it to format as well, progress for formatting. I wasn't ready for this, but hit some stop button, which put PM into some delirium state and I couldn't tell what it was doing anymore. I tried to shut down, but the shutdown kept telling me that there was a program not responding, which I eventually killed (unix doesn't have any problem shutting down when a program is not responding, well unless it's trying to unmount nfs disks exported from a machine that is shut down). If I'd known what was going to happen, I would have pulled the usb cable. I rebooted, the disk was still visible, so apparently I hadn't hosed anything.
I reset the partition size using MBytes (I think with PM, but I don't know I did that, I didn't resize). Then I found I could mount the disk under Windows, I could run chkdsk on it and amazingly (considering that a few blocks must have been formatted), I found I could boot from it and it looked normal on cursory inspection. I'd (ta dah) recovered the disk.
On defragging, the disk was about 3/4 green (system) files. I assume the permission information is stored along with the directory information at the start of the disk. I wasn't going to go back and fix that.
I tried setting up a dos partition on a new copy of the work disk, using the (o option) to fdisk followed by copying back the mbr with
dd if=/backup_dir/my_w2k_mbr of=/dev/sda bs=446 count=1
|bs=446 count=1 copies only the boot code at the start of the mbr. The partition table occupies bytes 446 to 509 and is followed by the two byte 55 aa boot signature.|
This gave the "INACCESSIBLE_BOOT_DEVICE" error too. So the problem isn't in the mbr/partition table in the first 512 bytes of the disk. It seems there is something about Windows partitioning that fixes this problem.
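Writing back only the boot code, while leaving the partition table on the target alone, can be rehearsed on ordinary files before trying it on a disk. The sketch below assumes the standard mbr layout (boot code in bytes 0 to 445, partition table from byte 446) and uses made-up file contents so the boundary is easy to see.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

saved="$tmpdir/saved_mbr"     # stands in for /backup_dir/my_w2k_mbr
disk="$tmpdir/disk.img"       # stands in for /dev/sda

# Made-up contents: the saved sector is all 'A', the target disk is all 'B'.
dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\000' 'A' > "$saved"
dd if=/dev/zero bs=512 count=4 2>/dev/null | tr '\000' 'B' > "$disk"

# Copy only the first 446 bytes. conv=notrunc matters for a regular file
# (without it dd truncates the image); it is harmless on a block device.
dd if="$saved" of="$disk" bs=446 count=1 conv=notrunc 2>/dev/null

# Dump bytes 444-447: the last two boot code bytes should show as A and
# the first two partition table bytes should still be B.
od -A d -c -j 444 -N 4 "$disk"
```

If the dump shows the boundary in the wrong place, the block size is wrong and the same command on a real disk would have overwritten part of the partition table.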