[hfs-user] Identification of deleted files in HFS plus
mday at apple.com
Wed Feb 8 09:39:23 PST 2006
On Feb 8, 2006, at 3:53 AM, Rashmi M wrote:
> I am writing an application on Mac to list out the files deleted
> from the system (even from the trash). I have written code to read
> the catalog file node by node. But I have no idea of how deleted
> files are represented and how can I access them. Any guesses?
First of all, "the trash" is just a separate directory on the disk.
Putting something in the trash merely moves the file or directory
into the trash directory. The file or directory doesn't actually get
deleted until the user "empties" the trash. The name and location of
the trash directory have changed in various versions of Mac OS. In
Mac OS 9, the directory is named "Trash" and is in the volume's root
directory; by convention, it has its "invisible" bit set in the
Finder Info. In Mac OS X, there is a directory named ".Trashes" in
the volume's root directory; inside there are directories whose names
are numeric: a user ID for each user who has a trash directory
(they're created on demand).
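
As a rough illustration of the two layouts just described, here is a minimal Python sketch that builds the expected trash path for a volume. The function name, the os_family values, and the example volume path and user ID are made up for illustration; only the directory names come from the description above.

```python
from pathlib import PurePosixPath


def trash_directory(volume_root: str, uid: int, os_family: str = "macosx") -> PurePosixPath:
    """Return the trash directory for a volume, per the layouts described above.

    `os_family` is a hypothetical switch used only in this sketch.
    """
    root = PurePosixPath(volume_root)
    if os_family == "macos9":
        # Mac OS 9: one "Trash" directory in the volume's root
        # (conventionally marked invisible in its Finder Info).
        return root / "Trash"
    if os_family == "macosx":
        # Mac OS X: ".Trashes/<numeric user ID>" in the volume's root,
        # created on demand for each user.
        return root / ".Trashes" / str(uid)
    raise ValueError(f"unknown os_family: {os_family!r}")


# Hypothetical volume mount point and user ID.
print(trash_directory("/Volumes/Data", 501))
```

Because the per-user directories are created on demand, a scanner should expect a given path to be missing for users who have never put anything in the trash on that volume.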
When a file or directory is actually deleted, its record(s) are
removed from the Catalog B-tree. And if it had overflow extents
(more than 3 extents for HFS, or more than 8 extents for HFS Plus)
then the overflow extent records are removed from the Extents
B-tree. In Mac OS X 10.4.0 and later, a file or directory can have
extended attributes stored in the Attributes B-tree; records for the
deleted item would be removed from the Attributes B-tree as well.
The space occupied by a file's forks is freed by clearing the
corresponding bits in the allocation bitmap.
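
To make the last step concrete, here is a small Python sketch of freeing one extent in an in-memory copy of the allocation bitmap. It assumes the HFS Plus bit order, where the most significant bit of each byte describes the lowest-numbered allocation block. The function names are hypothetical, and this models only the bitmap update, not the B-tree changes.

```python
def is_allocated(bitmap: bytes, block: int) -> bool:
    """Return True if the allocation block is marked in use."""
    # Block 0 is the most significant bit of byte 0 (assumed HFS Plus order).
    return bool(bitmap[block // 8] & (0x80 >> (block % 8)))


def free_extent(bitmap: bytearray, start_block: int, block_count: int) -> None:
    """Clear the bits for one extent, marking its allocation blocks free."""
    for block in range(start_block, start_block + block_count):
        bitmap[block // 8] &= ~(0x80 >> (block % 8)) & 0xFF


bitmap = bytearray([0xFF, 0xFF])   # 16 allocation blocks, all in use
free_extent(bitmap, 3, 6)          # free blocks 3 through 8
print(bitmap.hex())                # e07f
```

Note that only the bitmap changes; the data in the freed allocation blocks stays on disk until something else is written there.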
Trying to recover deleted files is problematic. Many file systems,
such as UFS, EXT, or FAT, will simply mark a directory entry as
"deleted" by overwriting a small number of bytes; you may be able to
restore those bytes to a non-deleted state and find some or all of
the original file's information. It's generally not that easy with
HFS or HFS Plus.
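
For contrast, here is a sketch of the "small number of bytes" case using FAT, where deleting a short (8.3) directory entry replaces the first byte of the name with 0xE5 and leaves the rest of the 32-byte entry alone. The helper names are invented for this example, and it ignores long file names and the cluster chain, which also has to be reconstructed in a real recovery.

```python
FAT_DELETED_MARKER = 0xE5   # first name byte of a deleted 8.3 entry


def fat_mark_deleted(entry: bytearray) -> None:
    """Simulate deletion: overwrite only the first byte of the name."""
    entry[0] = FAT_DELETED_MARKER


def fat_is_deleted(entry: bytes) -> bool:
    return entry[0] == FAT_DELETED_MARKER


def fat_undelete(entry: bytearray, first_char: str) -> None:
    """Restore the entry by supplying the lost first character of the name."""
    if not fat_is_deleted(entry):
        raise ValueError("entry is not marked deleted")
    entry[0] = ord(first_char.upper())


# A 32-byte entry: 11 bytes of 8.3 name, then 21 bytes of attributes,
# timestamps, starting cluster and size (all zero in this toy example).
entry = bytearray(b"REPORT  TXT" + bytes(21))
fat_mark_deleted(entry)
print(fat_is_deleted(entry))   # True
fat_undelete(entry, "R")
print(bytes(entry[:11]))       # b'REPORT  TXT'
```

Nothing comparable works on HFS or HFS Plus, because the catalog record itself is removed instead of being flagged.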
In the B-trees, there are typically several records in a single
node. They're essentially an array of records. If you delete a
record in the middle of the node, the records that follow it get
shuffled up to overwrite the original record, usually leaving no
remnants of the original record. If the record being deleted is the
last one in the node, it can be deleted by merely decrementing the
number of records, in which case it might be possible to recover the
original record. But with Mac OS X, we found that some non-Apple
disk repair utilities were too aggressive in trying to recover
valid-looking records in the unused portion of the node, so we began
explicitly overwriting the newly freed space with zeroes. So, if Mac
OS X deleted a file, the original record will always be overwritten
with other data (either other records, or zeroes).
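
The following Python sketch models that behavior on a simplified node. It assumes the HFS Plus node layout as I understand it: a 14-byte node descriptor with the record count at byte offset 10, records packed after the descriptor, and a table of big-endian 16-bit offsets growing backward from the end of the node, with one extra entry for the start of free space. The function names and the 128-byte node size are made up for the example; this is not the actual Mac OS X code.

```python
import struct

NODE_DESCRIPTOR_SIZE = 14   # fLink, bLink, kind, height, numRecords, reserved
NUM_RECORDS_OFFSET = 10     # big-endian UInt16 inside the descriptor


def build_node(records, node_size=128):
    """Pack records into a fresh node (helper for this sketch only)."""
    node = bytearray(node_size)
    struct.pack_into(">H", node, NUM_RECORDS_OFFSET, len(records))
    pos = NODE_DESCRIPTOR_SIZE
    for i, rec in enumerate(records):
        struct.pack_into(">H", node, node_size - 2 * (i + 1), pos)
        node[pos:pos + len(rec)] = rec
        pos += len(rec)
    # One extra offset marks where free space begins.
    struct.pack_into(">H", node, node_size - 2 * (len(records) + 1), pos)
    return node


def read_offsets(node):
    """Return the record offsets plus the trailing free-space offset."""
    n = struct.unpack_from(">H", node, NUM_RECORDS_OFFSET)[0]
    return [struct.unpack_from(">H", node, len(node) - 2 * (i + 1))[0]
            for i in range(n + 1)]


def delete_record(node, index):
    """Remove one record, shuffle later records up, zero the freed space."""
    offsets = read_offsets(node)
    n = len(offsets) - 1
    if not 0 <= index < n:
        raise IndexError("no such record")
    start, end, free = offsets[index], offsets[index + 1], offsets[n]
    size = end - start
    # Records after the deleted one slide up and overwrite it.
    node[start:free - size] = node[end:free]
    # Explicitly zero the newly freed tail so no stale record survives.
    node[free - size:free] = bytes(size)
    new_offsets = offsets[:index + 1] + [o - size for o in offsets[index + 2:]]
    for i, off in enumerate(new_offsets):
        struct.pack_into(">H", node, len(node) - 2 * (i + 1), off)
    # The offset table shrank by one entry; clear the stale slot too.
    struct.pack_into(">H", node, len(node) - 2 * (n + 1), 0)
    struct.pack_into(">H", node, NUM_RECORDS_OFFSET, n - 1)


node = build_node([b"AAAA", b"BBBB", b"CCCC"])
delete_record(node, 1)
print(read_offsets(node), bytes(node[14:26]))
```

Deleting the middle record leaves "CCCC" where "BBBB" used to be, and the old copy of "CCCC" is zeroed. Deleting the last record has nothing to shuffle, but the zero fill still erases it.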
So perhaps your best bet at recovery is if you can recognize the
content of a file. You could scan the volume's free allocation
blocks looking for recognizable content. But beware that a file may
not have been stored contiguously on the media. And the content may
have been moved over time (especially with Mac OS X's adaptive hot
file clustering), so you may see valid-looking content that is
actually from an older version of the file.
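
A very rough sketch of such a scan in Python follows. It assumes an HFS Plus volume image held in memory, where allocation block N starts at byte N times the block size, and an allocation bitmap whose most significant bit in each byte describes the lowest-numbered block. The signature table and function names are illustrative only, and the scan checks only the start of each free block.

```python
# A few well-known leading byte sequences ("magic numbers").
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
}


def is_allocated(bitmap, block):
    """Return True if the allocation block is marked in use."""
    return bool(bitmap[block // 8] & (0x80 >> (block % 8)))


def scan_free_blocks(image, bitmap, block_size, signatures=SIGNATURES):
    """Yield (block_number, kind) for free blocks that begin with a signature."""
    total_blocks = len(image) // block_size
    for block in range(total_blocks):
        if is_allocated(bitmap, block):
            continue   # still owned by a live file; not a deletion candidate
        start = block * block_size
        head = image[start:start + 16]
        for kind, magic in signatures.items():
            if head.startswith(magic):
                yield block, kind


# Toy 4-block image: blocks 0 and 2 allocated, blocks 1 and 3 free.
block_size = 512
image = bytearray(4 * block_size)
image[512:517] = b"%PDF-"            # free block 1 holds the start of a PDF
image[1024:1027] = b"\xff\xd8\xff"   # block 2 is allocated, so it is skipped
bitmap = bytearray([0b10100000])
print(list(scan_free_blocks(image, bitmap, block_size)))   # [(1, 'pdf')]
```

A hit only says where a file may once have started. The caveats above still apply: later blocks of the file may be elsewhere, and the match may be a stale copy.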