Tuesday, October 21, 2014

Why directory sizes are different

The size that ls reports for a directory is the space needed to store the metadata about its entries (including the names of the files contained in that directory), not the size of the files themselves. The number of files and sub-directories at a given time might not map directly to the size reported, because once space has been allocated to the directory, it is not freed when the number of files drops.
This behavior makes sense for most use cases (disk space is cheap, and a directory that once held a lot of files will probably hold that many again), and it helps to reduce fragmentation.


[root@pgvmdc ~]# mkdir test1

[root@pgvmdc ~]# ls -ltr | grep test1
drwxr-xr-x 2 root root       4096 Oct 21 16:38 test1
4096 bytes is the default size of a newly created, empty directory
[root@pgvmdc ~]# cd test1

[root@pgvmdc test1]# pwd
/root/test1

[root@pgvmdc test1]# for i in {0..1000}; do echo "hello, this is a test only " > $i; done;
created 1001 files (named 0 through 1000) under test1
[root@pgvmdc test1]# ls -ltr ../ | grep test1
drwxr-xr-x 2 root root      20480 Oct 21 16:58 test1
the directory size has grown from 4096 to 20480 bytes

[root@pgvmdc test1]# rm -rf *

[root@pgvmdc test1]# ls -ltr ../ | grep test1
drwxr-xr-x 2 root root      20480 Oct 21 16:58 test1


Here you see that when files are created, even very small ones, the directory size increases: the file names and other metadata must be stored somewhere, and that is in the directory object itself. In this case the entries for these files fit in five 4 KiB blocks (20480 = 4096 * 5).
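
The block arithmetic can be sketched as a small shell function. This is only an illustration: blocks_for is a hypothetical helper name, and the 4096-byte default is an assumption taken from the listing above, not something queried from the filesystem.

```shell
# blocks_for SIZE [BLOCK_SIZE]
# Print how many blocks of BLOCK_SIZE bytes are needed to hold SIZE bytes.
# BLOCK_SIZE defaults to 4096 (assumed from the listing above).
blocks_for() {
    size=$1
    block=${2:-4096}
    # Integer ceiling division: round up to a whole number of blocks.
    echo $(( (size + block - 1) / block ))
}

blocks_for 4096     # prints 1 (the freshly created directory)
blocks_for 20480    # prints 5 (the directory after the files were created)
```

With GNU coreutils, something like blocks_for "$(stat -c %s test1)" would feed in the directory's current size instead of a hard-coded number.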

Deleting the files does not reduce the space used by the directory object: the last listing above still shows 20480 bytes after the rm. Removing the directory and re-creating it is what frees the space.
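
When the directory still holds files that must be kept, one way to do the remove-and-re-create step is to move the contents into a fresh directory and swap the two. The following is a minimal sketch, not a hardened tool: compact_dir is a hypothetical helper name, it assumes GNU coreutils and findutils (mv -t, chmod --reference, find -mindepth), and it assumes nothing else is writing to the directory while it runs.

```shell
# compact_dir DIR
# Rebuild DIR so that its directory object starts from a fresh allocation.
compact_dir() {
    dir=${1%/}                 # strip a trailing slash, if any
    tmp="${dir}.new.$$"        # sibling path, so it stays on the same filesystem
    mkdir "$tmp" || return 1
    # Carry over permissions and ownership from the old directory.
    chmod --reference="$dir" "$tmp"
    chown --reference="$dir" "$tmp" 2>/dev/null || true
    # Move every entry (dotfiles included) into the fresh directory.
    find "$dir" -mindepth 1 -maxdepth 1 -exec mv -t "$tmp" {} +
    # rmdir only succeeds if the old directory is now empty.
    rmdir "$dir" && mv "$tmp" "$dir"
}
```

Run it from outside the directory, for example compact_dir /root/test1 from /root, and compare the ls -ld output before and after. A shell whose working directory is inside the old directory will be left pointing at the removed one.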