NTFS disks have the option to compress files. This saves disk space, but makes recovery of corrupted disks a bit more complex. Fortunately, CnW Recovery Software has tools that make this a straightforward task, and it can often restore compressed files that other data recovery packages miss altogether. This can make CnW a very useful forensic tool when searching disks that have a history of being reformatted.
When a disk is being recovered using the existing directory structure, or directory stubs (stored in the MFT blocks), the details of compression are known. For a raw read of a disk, each cluster group is analysed to determine whether the data is compressed.
How NTFS compresses files
NTFS compresses data on a file by file basis. The compression is a slightly modified LZ77 that is fast both to compress and to expand, but is not as efficient as slower routines such as the Deflate method used by PKZIP. As files have to be random access, even when compressed, the method chosen is to compress in 4K blocks only. Again, this is a trade-off between maximum compression and speed of random access to a file. It is generally true that the longer the compression block, the higher the rate of compression, as it is more likely to come across a string that has already been seen. Looking at a sector that has been compressed, the first few words of text often do not look compressed, because no pattern has yet been repeated. Towards the end of the block, the data is largely unreadable by eye.
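As an illustration of the modified LZ77 scheme, the following is a minimal sketch of a block expander. It is based on public descriptions of the NTFS compression format (usually called LZNT1) and is not CnW's implementation. The function name `lznt1_decompress` is hypothetical, and the sketch assumes each block starts with a 2 byte little-endian header holding the block length and a compressed flag.

```python
import struct

def lznt1_decompress(data: bytes) -> bytes:
    """Expand a run of LZNT1 blocks (illustrative sketch, not CnW code)."""
    out = bytearray()
    pos = 0
    while pos + 2 <= len(data):
        (header,) = struct.unpack_from("<H", data, pos)
        if header == 0:                        # a zero header ends the chain
            break
        if (header >> 12) & 0x7 != 3:          # bits 12-14 hold a fixed signature
            raise ValueError("bad block signature")
        block_end = pos + 2 + (header & 0x0FFF) + 1
        if block_end > len(data):
            raise ValueError("truncated block")
        pos += 2
        if not header & 0x8000:                # block was stored uncompressed
            out += data[pos:block_end]
            pos = block_end
            continue
        block_start = len(out)                 # back-references stay inside a block
        while pos < block_end:
            flags = data[pos]                  # one flag bit per following item
            pos += 1
            for bit in range(8):
                if pos >= block_end:
                    break
                if not (flags >> bit) & 1:     # 0 = literal byte
                    out.append(data[pos])
                    pos += 1
                    continue
                if pos + 2 > block_end:
                    raise ValueError("truncated back-reference")
                (token,) = struct.unpack_from("<H", data, pos)
                pos += 2
                written = len(out) - block_start
                if written == 0:
                    raise ValueError("back-reference before any data")
                # The 16-bit token is split into offset and length; the
                # offset field widens as more of the block has been written.
                length_bits = 12
                span = written - 1
                while span >= 0x10:
                    span >>= 1
                    length_bits -= 1
                offset = (token >> length_bits) + 1
                length = (token & ((1 << length_bits) - 1)) + 3
                if offset > written:
                    raise ValueError("back-reference outside block")
                for _ in range(length):        # byte-wise copy allows overlap
                    out.append(out[-offset])
    return bytes(out)

# Three literals "abc" followed by one back-reference (offset 3, length 6).
sample = bytes.fromhex("05b0086162630320")
print(lznt1_decompress(sample))  # b'abcabcabc'
```

The sample shows why the start of a compressed block is still readable: the first bytes are stored as literals, and only repeated patterns become back-references.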
NTFS has several features to ensure that compression does not actually expand the data being written. This can be a problem when attempting to compress data that has already been compressed by an efficient routine. On an NTFS disk, data is stored in clusters, and a cluster is typically 2K or 4K in size (512 bytes, 1K, 8K and 16K are all possible, but not common). NTFS will attempt to compress a stream of 16 clusters, up to a maximum of 64K. Because 16 clusters of more than 4K each would exceed this 64K maximum, compression is not possible if the cluster size is greater than 4K, or 8 sectors.
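The arithmetic behind these limits can be sketched as follows. This is only a worked example of the figures quoted above; the function and constant names are hypothetical.

```python
CLUSTERS_PER_UNIT = 16            # NTFS compresses 16 clusters at a time
MAX_COMPRESSIBLE_CLUSTER = 4096   # 4K, i.e. 8 sectors of 512 bytes

def compression_unit_bytes(cluster_size: int):
    """Return the size of a 16 cluster group in bytes, or None if the
    cluster size is too large for NTFS compression to be used."""
    if cluster_size > MAX_COMPRESSIBLE_CLUSTER:
        return None
    return CLUSTERS_PER_UNIT * cluster_size

for size in (512, 2048, 4096, 8192):
    print(size, compression_unit_bytes(size))
```

For a 4K cluster this gives the 64K maximum, for a 2K cluster 32K, and for an 8K cluster no compression at all.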
With a 4K cluster, NTFS attempts to compress each 4K block. If a 4K block does not compress, it is marked with a 2 byte header and the data is left uncompressed. If, when all 16 clusters are added up, they still take up 16 clusters, the data is left totally uncompressed. However, if there can be a saving of at least one cluster, this saving is made. Text files, Word files etc. often reduce to about 50% of their size, while JPEGs achieve only a very small compression, and this is normally only on the first cluster group. For a 4K cluster the compression group is 16 clusters, a total of 64K. For a 2K cluster, it is still 16 clusters, giving 32K.
The CnW View sector function has an expand button that will expand sectors, as long as the first sector is the start of a 16 cluster compression block. The full length of the expanded data is displayed, up to the 64K maximum.
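The "save at least one cluster" rule can be expressed as a short calculation. This is a minimal sketch of the rule as described above, with a hypothetical function name; it ignores special cases such as sparse (all zero) data.

```python
def plan_cluster_group(compressed_bytes: int, cluster_size: int = 4096,
                       group_clusters: int = 16):
    """Return (clusters_on_disk, stored_compressed) for one cluster group."""
    # Round the compressed length up to whole clusters.
    needed = -(-compressed_bytes // cluster_size)
    if needed >= group_clusters:
        # No whole cluster is saved, so the group is written uncompressed.
        return group_clusters, False
    return needed, True

print(plan_cluster_group(32 * 1024))   # text at about 50%: (8, True)
print(plan_cluster_group(61441))       # just over 15 clusters: (16, False)
```

A group that compresses to 61,440 bytes (exactly 15 clusters) is stored compressed, while one byte more pushes it back to the full 16 clusters, stored uncompressed.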
How sectors are expanded on a raw read
It is fairly easy to detect whether a cluster contains compressed data. Detection uses a mixture of checking the first few characters and following the chain of each cluster through the 16 cluster block. If the data is compressed, an expanded version is created and the normal raw recovery routines are run on it, helping to extract files.
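One way such a check could work is sketched below. This is an assumption based on public descriptions of the NTFS compressed block layout, not CnW's actual detection routine, and the function name is hypothetical. It reads the 2 byte header at the start of the data, checks its fixed signature bits, and follows the chain of block lengths until a zero header or the end of the buffer.

```python
import struct

def looks_like_compressed_group(group: bytes, max_blocks: int = 16) -> bool:
    """Heuristic test: does this buffer start with a valid chain of
    compressed block headers? (Illustrative sketch only.)"""
    pos = 0
    blocks = 0
    while pos + 2 <= len(group):
        (header,) = struct.unpack_from("<H", group, pos)
        if header == 0:                    # zero header terminates the chain
            break
        if (header >> 12) & 0x7 != 3:      # signature bits must equal 3
            return False
        next_pos = pos + 2 + (header & 0x0FFF) + 1
        if next_pos > len(group):          # chain runs off the end of the data
            return False
        blocks += 1
        pos = next_pos
    return 0 < blocks <= max_blocks

compressed = bytes.fromhex("05b0086162630320") + b"\x00" * 8
print(looks_like_compressed_group(compressed))                      # True
print(looks_like_compressed_group(b"This is plain text, not LZ."))  # False
```

A header test like this can give false positives on ordinary data, so in practice a successful trial expansion of the whole chain would be a stronger confirmation.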
How to recover compressed NTFS files from unallocated disk space
There are two ways to extract data: as part of a normal restore, or by using the Image option and selecting file splitting with compression expansion.
Each restore option for files, e.g. FAT or NTFS, has an option for restoring unallocated space. This is normally performed after a standard restore so that sectors that are part of a defined file are not re-read.
The other mode is to select the Image Restore function. This is described fully in the raw recovery section.
In summary, the user can extract data from a disk holding a mixture of compressed and uncompressed files without needing to understand anything about compression routines. CnW Recovery Software is a powerful tool for carving NTFS compressed disks.