Saturday, October 24, 2009

Scared Silly by Ubuntu and jfs

So I have a second internal HD, previously mounted and working, where all the pictures and data are stored; you know, the important stuff. One morning I find the PC unresponsive: it doesn't answer pings, the screen is blank (it was running GNOME), and there is nothing on the virtual terminals (Ctrl-Alt-F1, Ctrl-Alt-F2). I reluctantly power it down and reboot.

When it comes up, those partitions (the important ones) on the HD won't mount! I believe the drive is functional, since /dev/sdb4 is mounted as swap and /dev/sdb3 is my old Ubuntu partition.

My worst fear is that I pulled the plug on a "non-responsive PC" that was in the middle of something useful, and caused more damage by corrupting the drive.

Is the superblock corrupted?

Are the block IDs (output from blkid) cached, or stored on the disk?

First let's just calm down, and gather some data using the tools I have at my disposal:

# mount /dev/sdb1
mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

# blkid | grep sdb1
/dev/sdb1: UUID="3e4a975a-e771-4bd5-975b-bf27c019cac0" TYPE="jfs"

# grep 3e4a /etc/fstab
UUID=3e4a975a-e771-4bd5-975b-bf27c019cac0 /z jfs relatime 0 2

# dmesg | grep sdb
[ 1.937133] sd 0:0:1:0: [sdb] 312581808 512-byte hardware sectors: (160 GB/149 GiB)
[ 1.937154] sd 0:0:1:0: [sdb] Write Protect is off
[ 1.937157] sd 0:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[ 1.937192] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.937263] sd 0:0:1:0: [sdb] 312581808 512-byte hardware sectors: (160 GB/149 GiB)
[ 1.937282] sd 0:0:1:0: [sdb] Write Protect is off
[ 1.937285] sd 0:0:1:0: [sdb] Mode Sense: 00 3a 00 00
[ 1.937319] sd 0:0:1:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.937323] sdb: sdb1 sdb2 sdb3 sdb4
[ 1.985901] sd 0:0:1:0: [sdb] Attached SCSI disk
[ 11.461397] Adding 1959920k swap on /dev/sdb4. Priority:-1 extents:1 across:1959920k
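
The blkid and fstab lookups above can be chained instead of eyeballed. Here is a minimal sketch, assuming blkid prints KEY="value" pairs as shown in the transcript; the function names blkid_field and fstab_mountpoint are my own hypothetical helpers, not part of any package:

```shell
# Hypothetical helper: pull one KEY="value" field out of a line of blkid output.
blkid_field() {
    # $1 = field name (e.g. UUID or TYPE), $2 = one line of blkid output
    printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1=\"\([^\"]*\)\".*/\1/p"
}

# Hypothetical helper: read fstab-format lines on stdin and print the
# mount point of the entry whose first column is UUID=<uuid>.
fstab_mountpoint() {
    # $1 = UUID to look for
    awk -v want="UUID=$1" '$1 == want { print $2 }'
}

# Sample data copied from the transcript above, so the sketch runs anywhere.
line='/dev/sdb1: UUID="3e4a975a-e771-4bd5-975b-bf27c019cac0" TYPE="jfs"'
uuid=$(blkid_field UUID "$line")
fstype=$(blkid_field TYPE "$line")

echo "filesystem type: $fstype"
printf '%s\n' "UUID=$uuid /z jfs relatime 0 2" | fstab_mountpoint "$uuid"
```

On the real machine the sample strings would be replaced by live data, for example line=$(blkid /dev/sdb1) and fstab_mountpoint "$uuid" < /etc/fstab.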


Next, let's try fsck again. In the old days, if the superblock was hosed, this was an act of desperation:

# fsck /dev/sdb1
fsck 1.41.4 (27-Jan-2009)
fsck: fsck.jfs: not found
fsck: Error 2 while executing fsck.jfs for /dev/sdb1


I guess I need jfsutils ...
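
The "fsck.jfs: not found" error is fsck failing to find a per-filesystem helper named fsck.<type>. A quick check before running fsck could look like the sketch below. The function name have_fsck_helper is hypothetical, and the search order is my assumption, based on my recollection that the fsck man page of that era describes looking in /sbin, /etc/fs and /etc before PATH:

```shell
# Sketch: is there an fsck helper for this filesystem type?
# Assumption: helpers are named fsck.<type> and live in /sbin, /etc/fs,
# /etc, or somewhere on PATH.
have_fsck_helper() {
    # $1 = filesystem type as reported by blkid (e.g. jfs)
    ( PATH="/sbin:/etc/fs:/etc:$PATH"; command -v "fsck.$1" ) >/dev/null 2>&1
}

fstype=jfs
if have_fsck_helper "$fstype"; then
    echo "fsck.$fstype is installed"
else
    echo "fsck.$fstype is missing; for jfs on Ubuntu, install jfsutils"
fi
```

The subshell keeps the widened PATH from leaking into the rest of the session.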

# sudo apt-get install jfsutils
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed
... [snip] ...
Unpacking jfsutils (from .../jfsutils_1.1.12-2_i386.deb) ...


Rerun fsck... Et voilà! Success!

Holy crap! Success in 3 minutes? The stars are aligned. This has got to be a new record for:

cost_of_data_loss * (fear && anxiety)
-----------------------------------------------------
speed_of_solution


My only complaint: when a JFS partition is corrupted, why is it reported as a bad superblock?