[hfs-user] Difference in Data types??
Fri, 29 Mar 2002 09:07:55 -0800
On Friday, March 29, 2002, at 03:40 AM, Biswaroop(External) wrote:
> Well in the MDB structure for an HFS volume the
> vol.drXTClpSiz /* clump size for extents overflow file */
> is 4 bytes long.
> Again in the Catalog Data Record structure the member
> filClpSize; /* file clump size */
> takes 2 bytes.
> Therefore, when I assign the value of the first variable to
> the second, I lose information.
I'm not sure why you're copying from one to the other. The drXTClpSiz
is the clump size for the extents B-tree only. Since the B-tree is used
in a very different way from typical user files, I don't see a reason to
try and set an ordinary file's clump size to be the same as one of the
B-trees'.
I believe Apple's code sets the clump size in a catalog record to zero;
I think you can do the same. It turns out that having different clump
sizes for different files wasn't very useful. If an application really
wanted to make sure that a file was allocated in large contiguous
pieces, it was generally better to try and pre-allocate it in one giant
contiguous piece (or when allocating additional space, make the entire
allocation contiguous). At runtime, Apple's code just uses a
volume-wide default for ordinary files (i.e. ones with a catalog record).
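Here's a minimal sketch of what I mean (the record type and function are
made up for illustration; filClpSize is the field from the catalog record
you quoted):

    #include <stdint.h>

    /* Assumed stand-in for the real catalog file record; only the
       clump size field matters here. */
    struct CatalogFileRec {
        uint16_t filClpSize;    /* file clump size (2 bytes) */
        /* ... other fields ... */
    };

    /* Instead of narrowing the 4-byte drXTClpSiz into this 2-byte
       field, just store zero; at runtime the volume-wide default
       clump size is used for ordinary files. */
    void initCatalogFileClump(struct CatalogFileRec *rec)
    {
        rec->filClpSize = 0;
    }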
> Please, is there any simple formula to find out the
> extents file size and the catalog file size for a volume
> when we know beforehand how many files have to be
> in that volume.
> For example, if I know I have to write "X" files contained in
> "Y" number of directories,
> then can I calculate what the volume's clump size for the
> extents overflow file and the catalog file should be?
Certainly no simple formula for the catalog B-tree. That is partly
because the size of the catalog is determined by the lengths of
the file and directory names (even more so on HFS Plus, where the keys
in index nodes are variable length). And for volumes that are modified
over time, the order of operations will affect the size of the B-tree in
complex ways. I'm sure you could come up with a statistical guess based
on average name lengths and the average density of nodes (i.e. how "full"
they are).
Your particular case of creating a CD is actually a much simpler
problem, and you can compute an exact answer if you want. Since the
files won't be modified over time, you can guarantee that they will not
be fragmented. That means you can get by with a minimal extents B-tree
containing no leaf records, which comes to a single allocation block (for
the header node; the other nodes are unused and should be filled with
zeroes).
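The arithmetic is trivial; here's a sketch (the 512-byte node size is the
standard HFS B-tree node size, and allocBlockSize is an assumption --
whatever you chose for the volume):

    #include <stdint.h>

    /* Size of a minimal extents overflow file: just the header node,
       rounded up to one whole allocation block. HFS allocation blocks
       are multiples of 512 bytes, so one block always holds it. */
    uint32_t minimalExtentsFileSize(uint32_t allocBlockSize)
    {
        const uint32_t kNodeSize = 512;   /* HFS B-tree node size */
        return (allocBlockSize >= kNodeSize) ? allocBlockSize : kNodeSize;
    }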
Since you know the complete set of files and directories in advance, you
can build an optimal tree by packing as many leaf records in a node as
possible, and then moving to the next node. All it requires is knowing
the order in which you will assign directory IDs to directories, and then
being able to sort the file and directory names for the items in a single
directory. That way you can predict the entire leaf sequence. Once you
know the number of leaf nodes, you can calculate the number of index
nodes that will be parents of the leaf nodes, and so on up the tree
until you get to a level containing exactly one node (the root). This
should be relatively easy for HFS because the records in index nodes are
constant size, so the calculation for each level should just be a simple
divide and round up. For HFS Plus, you would have to keep track of the
actual file or directory names since the length of the keys in index
nodes vary based on the name lengths.
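In code, the divide-and-round-up loop might look something like this (a
sketch for the HFS case; recordsPerIndexNode is an assumption you would
derive from the 512-byte node size and the fixed index record size):

    #include <stdint.h>

    /* Given the number of leaf nodes, count the total nodes in the
       tree by adding each level of index nodes until we reach a
       single root node. (The header node is not counted here.) */
    uint32_t countCatalogNodes(uint32_t leafNodes,
                               uint32_t recordsPerIndexNode)
    {
        uint32_t total = leafNodes;
        uint32_t level = leafNodes;
        while (level > 1) {
            /* each index node holds up to recordsPerIndexNode
               pointer records, so divide and round up */
            level = (level + recordsPerIndexNode - 1) / recordsPerIndexNode;
            total += level;
        }
        return total;   /* leaf nodes plus all index levels */
    }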
If that's too complicated, you could always fall back to assuming a
constant size (maximum or average) for all of the records. Don't forget
that for thread records, the key is of fixed size but the data is
variable (since it contains a variable-length string).
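As a sketch of that fallback (all the sizes here are assumptions you would
fill in from the record layouts; the estimate is pessimistic since it uses
maximum sizes and ignores per-node slop):

    #include <stdint.h>

    /* Every file and directory has one catalog record plus one thread
       record, so a worst-case leaf node count is just total bytes
       divided by the usable space per node, rounded up. */
    uint32_t estimateLeafNodes(uint32_t files, uint32_t dirs,
                               uint32_t maxCatalogRecSize,
                               uint32_t maxThreadRecSize,
                               uint32_t usableBytesPerNode)
    {
        uint32_t items = files + dirs;
        uint32_t bytes = items * (maxCatalogRecSize + maxThreadRecSize);
        return (bytes + usableBytesPerNode - 1) / usableBytesPerNode;
    }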