hfs_fs-0.95: hfs_hfind: corrupted b-tree 4

Pat Dirks hfs-interest@ccs.neu.edu
Thu, 21 Jan 1999 17:08:45 -0800


Hi,

The internal format of HFS disks is described in considerable detail in 
Apple's "Inside Macintosh: Files", published by Addison-Wesley.  A 
must-read if you plan to modify a catalog B*-Tree by hand...

>Corresponding to the '..'  directory entry, each group of files
>belonging to the same directory is preceded by a special type of entry
>called a thread entry.  Its key is just the ID number of the
>directory, with an empty filename appended.  It holds the ID of the
>directory's parent.

Actually, it holds the complete key by which the directory is known: the 
parent's dirID and the name within the parent's directory.  It allows the 
filesystem to locate the catalog record describing the directory itself 
(its mod. date, etc.)
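
As a rough illustration of that relationship, here is a minimal Python sketch. It is a toy model, not the on-disk format: the catalog is a plain dict keyed by (parent dirID, name), and the field names (parID, name, dirID, fileID) and sample entries are assumptions made up for the example. Only the root IDs (parent ID 1, root directory ID 2) follow the usual HFS convention.

```python
# Toy catalog: every record is keyed by (parent dirID, name).
# A thread record is keyed by the directory's own ID plus an empty name,
# and its data is the full key under which the directory is filed.
catalog = {
    (1, "MyDisk"): {"type": "dir", "dirID": 2},
    (2, ""): {"type": "thread", "parID": 1, "name": "MyDisk"},
    (2, "Docs"): {"type": "dir", "dirID": 16},
    (16, ""): {"type": "thread", "parID": 2, "name": "Docs"},
    (16, "notes.txt"): {"type": "file", "fileID": 17},
}

def directory_record(catalog, dir_id):
    """Use the thread record to find the directory's own catalog record."""
    thread = catalog[(dir_id, "")]
    return catalog[(thread["parID"], thread["name"])]

def path_of(catalog, dir_id):
    """Walk thread records upward to rebuild a colon-separated path."""
    parts = []
    while (dir_id, "") in catalog:
        thread = catalog[(dir_id, "")]
        parts.append(thread["name"])
        dir_id = thread["parID"]
    return ":".join(reversed(parts))
```

Because the thread record carries both the parent's dirID and the name, a single lookup is enough to get from a bare dirID to the directory's record; the parent ID alone would not be.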

>What appears likely is that one or more "pointers" in the tree have
>become corrupted and so the thread records for the "missing"
>directories are inaccessible.  This surely means that the entries for
>at least some of the entries in those directories are also
>inaccessible, but not necessarily ALL of them.  It may still be
>possible to access some files or directories in the "missing" ones by
>name, but not by 'ls', 'find', 'du' or anything else which tries to do
>the equivalent of an 'ls' in the "missing" directories.  If you know
>the names of files under the "missing" directories, then give it a
>try.

The damage can be on two levels: in the B*-Tree itself (an index node 
pointing to two subnodes that aren't linked along their forward and 
backward links, for instance) or in the structure of the B*-Tree records 
(missing thread records for directories, for instance).
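
The first kind of damage can be checked mechanically. The following is a minimal sketch in Python, assuming the node descriptors have already been read into a dict; the function name and the dict layout are hypothetical, with fLink and bLink standing in for the forward and backward link fields (0 meaning "no sibling").

```python
def broken_sibling_links(children, nodes):
    """Find adjacent children of one index node whose links disagree.

    children: node numbers in the order the index node lists them.
    nodes:    node number -> {"fLink": int, "bLink": int} (assumed layout).
    Returns the (left, right) pairs that are not linked to each other.
    """
    bad = []
    for left, right in zip(children, children[1:]):
        if nodes[left]["fLink"] != right or nodes[right]["bLink"] != left:
            bad.append((left, right))
    return bad
```

An empty result only says the link chain under that one index node is self-consistent; it says nothing about whether the records inside the leaves are complete.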

Without knowing more I wouldn't speculate on the kind of damage your disk 
may have sustained.  Machine crashes do different damage than hard disk 
bad blocks, for instance.

Locating an object by DirID and Name involves doing a traversal of 
successive levels of index nodes in the B*-Tree in order to locate the 
leaf node that holds the actual record for the object in question.  
Opening a folder involves locating the thread record of that directory 
(because it always immediately precedes the record for the first object 
in the directory), followed by a traversal of successive forward links to 
enumerate the contents of the directory.  It is possible to end up in a 
situation as described above although the reverse is more common: the 
leaf links are intact allowing a directory to be enumerated but, even 
though the contents are plainly visible in the Finder window, trying to 
do anything to any of them results in a "File Not Found" because the 
index node superstructure is corrupted and the file cannot be found by 
its ID/Name key.
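
Both failure modes can be reproduced in a toy model. The Python sketch below assumes a tree with a single index node over two leaf nodes; the data layout, function names and plain tuple comparison of keys are simplifications invented for the example (real HFS compares names with its own case-insensitive ordering).

```python
import bisect

def find(index, leaves, key):
    """Descend from the index node to a leaf; return (leaf_no, slot) or None."""
    first_keys = [k for k, _ in index]
    pos = bisect.bisect_right(first_keys, key) - 1
    if pos < 0:
        return None
    leaf_no = index[pos][1]
    for slot, (k, _) in enumerate(leaves[leaf_no]["records"]):
        if k == key:
            return leaf_no, slot
    return None

def list_dir(index, leaves, dir_id):
    """Locate the thread record, then follow forward links between leaves."""
    hit = find(index, leaves, (dir_id, ""))
    if hit is None:
        raise LookupError("thread record not reachable")
    leaf_no, slot = hit
    slot += 1  # the first object follows the thread record
    names = []
    while leaf_no is not None:
        records = leaves[leaf_no]["records"]
        while slot < len(records):
            (parent, name), _ = records[slot]
            if parent != dir_id:
                return names
            names.append(name)
            slot += 1
        leaf_no, slot = leaves[leaf_no]["next"], 0
    return names

leaves = [
    {"records": [((16, ""), "thread"), ((16, "a"), "file a")], "next": 1},
    {"records": [((16, "b"), "file b"), ((16, "c"), "file c")], "next": None},
]
good_index = [((16, ""), 0), ((16, "b"), 1)]
bad_second = [((16, ""), 0), ((16, "b"), 0)]  # pointer to leaf 1 corrupted
bad_first = [((16, ""), 1), ((16, "b"), 1)]   # pointer to leaf 0 corrupted
```

With bad_second the directory still lists completely, because the leaf chain is intact, yet find() on (16, "b") comes back empty: the "File Not Found" case. With bad_first the thread record is unreachable, so listing fails, while "b" and "c" can still be found by name.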

>In principle the "extents tree", which holds the info on what blocks
>are used by what files, should still know about the files with
>inaccessible catalog entries.  If an fsck existed, it would do as
>Michael Knox suggests: find all catalog entries it COULD get to and
>then find all the extents that weren't accounted for by the accessible
>catalog entries and make them into files again, though their names,
>types, and other vital statistics would be lost.  This is what Norton
>tries to do too, but I imagine that the catalog tree is too badly
>damaged for it to be able to create any entries for the lost files.  I
>think you should assume that is the case and not try to create or
>delete any files or directories on the disk for fear of further
>corrupting the catalog tree.

Unfortunately (from a recovery standpoint) not all extent information is 
in the extents B*-Tree.  In fact, it's somewhat rare for a file to have 
records there at all: the first 3 extents of any file are stored in the 
catalog record for the file itself.  It's 
only when a file allocates a fourth extent that extent records are 
created in the extents B*-Tree (which is really an extents OVERFLOW 
tree).  If it helps any, the resource header (in the first block of a 
resource fork) contains the file's name and some of the Finder info.  You 
can tell resource fork extents apart because there's a flag byte in the extents 
B*-Tree key that's different for data fork blocks and resource fork 
blocks.
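
To make the split between the catalog and the overflow tree concrete, here is a minimal Python sketch. It assumes the extents key layout documented in "Inside Macintosh: Files" (key length, fork-type byte, file number, starting file-relative allocation block, with 0x00 for data forks and 0xFF for resource forks). The function names and the dict standing in for the overflow tree are hypothetical.

```python
import struct

DATA_FORK, RSRC_FORK = 0x00, 0xFF  # fork-type flag byte in the extents key

def parse_extent_key(raw):
    """Unpack an 8-byte extents key: length, fork type, file ID, start block."""
    key_len, fork_type, file_id, start_block = struct.unpack(">BBIH", raw[:8])
    return fork_type, file_id, start_block

def fork_extents(first_three, overflow, file_id, fork_type):
    """Collect every (start_block, block_count) extent of one fork.

    first_three: the extents held in the file's catalog record.
    overflow:    dict keyed by (fork_type, file_id, first file-relative block),
                 each value a list of up to three extents (assumed layout).
    """
    extents = [e for e in first_three if e[1] > 0]
    next_block = sum(count for _, count in extents)
    while (fork_type, file_id, next_block) in overflow:
        more = [e for e in overflow[(fork_type, file_id, next_block)] if e[1] > 0]
        if not more:
            break
        extents.extend(more)
        next_block += sum(count for _, count in more)
    return extents

def orphaned_keys(overflow, known_file_ids):
    """Overflow keys whose file ID no reachable catalog record accounts for."""
    return sorted(k for k in overflow if k[1] not in known_file_ids)
```

Note what this implies for recovery: a file whose catalog record is lost and which never grew past three extents per fork leaves nothing behind in the overflow tree, so orphaned_keys() can only ever surface the more fragmented files.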

>My only USEFUL suggestion is to be sure you have the newest Norton
>Utilities for Mac which you can find.  There is always the hope that a
>newer version can fix the problem.

That's certainly sound advice.  Looking over the remains of a corrupted 
hard disk can be educational and fun but the repairs are not for the 
faint of heart.  I wouldn't dive in without reading some source for 
information on the volume format, and the "Inside Macintosh" volume does 
a remarkably thorough job.

Hope that helps,
-Pat Dirks.