[hfs-user] Identification of deleted files in HFS plus

Mark Day mday at apple.com
Wed Feb 8 09:39:23 PST 2006


On Feb 8, 2006, at 3:53 AM, Rashmi M wrote:

> 	I am writing an application on Mac to list out the files deleted  
> from the system (even from the trash). I have written code to read  
> the catalog file node by node. But I have no idea of how deleted  
> files are represented and how can I access them. Any guesses?

First of all, "the trash" is just a separate directory on the disk.   
Putting something in the trash merely moves the file or directory  
into the trash directory.  The file or directory doesn't actually get  
deleted until the user "empties" the trash.  The name and location of  
the trash directory have changed across versions of Mac OS.  In  
Mac OS 9, the directory is named "Trash" and is in the volume's root  
directory; by convention, it has its "invisible" bit set in the  
Finder Info.  In Mac OS X, there is a directory named ".Trashes" in  
the volume's root directory; inside there are directories whose names  
are numeric: a user ID for each user who has a trash directory  
(they're created on demand).
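
To make the Mac OS X layout concrete, here is a minimal Python sketch that lists the per-user trash directories on a mounted volume. The function name per_user_trash_dirs is hypothetical, and the sketch assumes only what is described above: a ".Trashes" directory in the volume's root containing directories named by numeric user ID. Reading ".Trashes" may require elevated privileges on a real system.

```python
import os

def per_user_trash_dirs(volume_root):
    """Map user ID -> trash directory path for numeric entries under .Trashes.

    Hypothetical helper: assumes the ".Trashes/<uid>" layout described above.
    """
    trashes = os.path.join(volume_root, ".Trashes")
    found = {}
    if not os.path.isdir(trashes):
        return found  # no trash has been created on this volume yet
    for name in os.listdir(trashes):
        path = os.path.join(trashes, name)
        # Only plain ASCII-numeric directory names are treated as user IDs.
        if name.isascii() and name.isdigit() and os.path.isdir(path):
            found[int(name)] = path
    return found
```

Anything this finds is a file that has been moved, not deleted, so it can still be read through the normal file system APIs.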

When a file or directory is actually deleted, its record(s) are  
removed from the Catalog B-tree.  And if it had overflow extents  
(more than 3 extents for HFS, or more than 8 extents for HFS Plus)  
then the overflow extent records are removed from the Extents  
B-tree.  In Mac OS X 10.4.0 and later, a file or directory can have  
extended attributes stored in the Attributes B-tree; records for the  
deleted item would be removed from the Attributes B-tree as well.   
The space occupied by a file's forks is freed by clearing the  
corresponding bits in the allocation bitmap.
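
As a rough illustration of the bitmap step, the sketch below tests and clears bits in an in-memory copy of the allocation bitmap. The function names are hypothetical. It assumes the HFS Plus convention that one bit represents one allocation block and that the most significant bit of each byte corresponds to the lowest-numbered block in that byte; check the volume format documentation before relying on this for a real volume.

```python
def is_allocated(bitmap, block):
    """Return True if the allocation block's bit is set (block in use)."""
    byte, bit = divmod(block, 8)
    # Assumed ordering: bit 7 (0x80) of each byte is the lowest-numbered block.
    return bool(bitmap[byte] & (0x80 >> bit))

def free_blocks(bitmap, start, count):
    """Clear the bits for `count` allocation blocks beginning at `start`.

    `bitmap` must be mutable (for example a bytearray).
    """
    for block in range(start, start + count):
        byte, bit = divmod(block, 8)
        bitmap[byte] &= ~(0x80 >> bit) & 0xFF
```

Note that clearing the bits does not touch the blocks themselves, which is why file content can survive on disk after the catalog record is gone.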

Trying to recover deleted files is problematic.  Many file systems,  
such as UFS, EXT, or FAT, will simply mark a directory entry as  
"deleted" by overwriting a small number of bytes; you may be able to  
restore those bytes to a non-deleted state and find some or all of  
the original file's information.  It's generally not that easy with  
HFS or HFS Plus.

In the B-trees, there are typically several records in a single  
node.  They're essentially an array of records.  If you delete a  
record in the middle of the node, the records that follow it get  
shuffled up to overwrite the original record, usually leaving no  
remnants of the original record.  If the record being deleted is the  
last one in the node, it can be deleted by merely decrementing the  
number of records, in which case it might be possible to recover the  
original record.  But with Mac OS X, we found that some non-Apple  
disk repair utilities were too aggressive in trying to recover  
valid-looking records in the unused portion of the node, so we began  
explicitly overwriting the newly freed space with zeroes.  So, if Mac  
OS X deleted a file, the original record will always be overwritten  
with other data (either other records, or zeroes).
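
The effect on a node can be seen in a simplified model. The Python sketch below is not the on-disk HFS Plus node format and not Apple's implementation: it treats a node as a flat bytearray of packed records plus a separate list of offsets (one per record, with a final entry marking the start of free space). The function name delete_record and the zero_fill flag are hypothetical.

```python
def delete_record(node, offsets, index, zero_fill=True):
    """Delete record `index` from a simplified node; return the new offsets.

    `node` is a bytearray holding records packed back to back.
    `offsets` has one entry per record plus a final entry for the start
    of free space (a simplification of the real node layout).
    """
    start, end = offsets[index], offsets[index + 1]
    size = end - start
    free_start = offsets[-1]
    # Shuffle the following records up so they overwrite the deleted one.
    node[start:free_start - size] = node[end:free_start]
    if zero_fill:
        # Models the Mac OS X behavior described above: zero the freed space.
        node[free_start - size:free_start] = bytes(size)
    # Drop the deleted record's entry and shift the later offsets down.
    return offsets[:index] + [o - size for o in offsets[index + 1:]]
```

With zero_fill=False, deleting the last record leaves its bytes in place in the free space, which is the only case where the original record might be recoverable. Deleting a middle record overwrites it with its neighbors either way.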

So your best bet at recovery is probably to recognize the  
content of a file.  You could scan the volume's free allocation  
blocks looking for recognizable content.  But beware that a file may  
not have been stored contiguously on the media.  And the content may  
have been moved over time (especially with Mac OS X's adaptive hot  
file clustering), so you may see valid-looking content that is  
actually from an older version of the file.
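
A minimal sketch of such a scan, in Python, might look like the following. Everything here is illustrative: the function name, the small signature table, and the parameters are assumptions, as is the idea that allocation block 0 begins at offset 0 of the image (true for HFS Plus as far as I know, but not for HFS, where allocation blocks begin after the volume bitmap). It only checks the first bytes of each free block, so it can at best find the start of a file.

```python
# Illustrative signature table: leading "magic" bytes of a few file types.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
}

def scan_free_blocks(image, bitmap, block_size, total_blocks):
    """Return (block, kind) for free blocks that start with a known signature.

    `image` is a seekable binary file object for the raw volume.
    `bitmap` is the allocation bitmap, assumed MSB-first within each byte.
    """
    hits = []
    for block in range(total_blocks):
        byte, bit = divmod(block, 8)
        if bitmap[byte] & (0x80 >> bit):
            continue  # block is still allocated to a live file
        image.seek(block * block_size)
        head = image.read(16)
        for magic, kind in SIGNATURES.items():
            if head.startswith(magic):
                hits.append((block, kind))
    return hits
```

A hit only tells you where something that looks like a file begins. Because of the fragmentation and relocation issues mentioned above, the blocks that follow may belong to a different file, or the data may be a stale copy.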

-Mark


