Data carving, file carving, or raw file recovery is a powerful way to extract files and data from corrupted or damaged media.
Data carving, or file carving, is the process of reading files without reference to a file system. This is required when the media has been corrupted or changed, or the disk is badly damaged. The technique can be applied to any type of disk that stores data on sector boundaries, which includes camera memory as well as hard drives. It is based on the fact that most files start with a recognisable data signature.
CnW Recovery software recovers raw files in two different ways:
- Unread sectors are scanned after a logical read of files, ie sectors are scanned that have not been used for files or seen within the file system
- As a complete disk scan, extracting all possible files. This will also recover all files pointed to by the file system, including those that can still be read through the operating system.
Extracting files from media while ignoring the file structure is performed by looking for unique patterns in the data. Typically, the first few bytes of a file give a good indication; for instance, all PK zip files start with the characters PK. The software has a growing list of such signatures built into the code. CnW Recovery goes further than most recovery programs in that it will often generate a file name from metadata within the file. The list at the bottom of this page grows on a regular basis, so please contact us if a particular file type is required. Download the free demo program now and see the files that can be recovered.
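As an illustration of signature recognition, the sketch below tests the first bytes of each sector against a small table of well-known signatures. It is a minimal example only: the table, the 512 byte sector size and the function names are assumptions made for this page, not the list or code used inside CnW Recovery.

```python
SECTOR_SIZE = 512  # assumed sector size; some media use 2048 or 4096

# A tiny illustrative table of well-known signatures (not the CnW list).
SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"\xff\xd8\xff": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
}


def identify(sector_start: bytes):
    """Return the file type whose signature begins these bytes, or None."""
    for magic, kind in SIGNATURES.items():
        if sector_start.startswith(magic):
            return kind
    return None


def find_file_starts(image: bytes, sector_size: int = SECTOR_SIZE):
    """Yield (sector_number, file_type) for sectors that begin with a known signature."""
    for offset in range(0, len(image), sector_size):
        kind = identify(image[offset:offset + 16])
        if kind is not None:
            yield offset // sector_size, kind
```

Only sector-aligned offsets are tested, which is what makes the scan fast; a short signature such as the JPEG one will also produce false positives, which is why further verification is useful.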
Advanced data carving
CnW has some advanced features that go beyond simple signature recognition, listed below. Each feature depends on the actual file format.
- File verification
- Generation of filenames based on content
- File length verification
- Reconstructing fragmented files - eg fragmented video recovery
How to read unallocated space?
With each logical file system, eg NTFS or CD-ROM, there is an option to read the unallocated space. The disk is first read logically, so all used sectors are known. After this, all previously unread sectors are read, and each sector, or cluster, is tested to see if it is a possible start of a file. This is based on the first few characters of a file, and then sometimes on more extensive tests looking for a certain type of data. For NTFS disks, any file compression will be detected and automatically expanded. It does not matter whether just one file or all files are compressed; they will be detected. It will even detect NTFS compressed files left behind on a FAT, Mac, or HPOFS disk. (This is more than many data recovery programs will achieve.)
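The bookkeeping behind this sequence can be sketched as follows: once the logical read has recorded which sector runs belong to files, the unread runs are simply the gaps between them. The function name and the (start, length) representation are assumptions for illustration, not CnW's internal format.

```python
def unread_runs(total_sectors, used_runs):
    """Return (start, length) runs of sectors not covered by used_runs.

    used_runs is an iterable of (start, length) pairs, which may be
    unsorted and may overlap.
    """
    gaps = []
    cursor = 0  # first sector not yet accounted for
    for start, length in sorted(used_runs):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < total_sectors:
        gaps.append((cursor, total_sectors - cursor))
    return gaps
```

Each run returned would then be read and tested sector by sector, or cluster by cluster, for a possible file start.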
One issue with raw recovery is that the start of the file can be easy to detect, but often the length is not clear. CnW Recovery software will often continue adding to a file until a new unique start is found. In these cases a file can be shown as many MBs, although the actual data is only 100K. Fortunately, in many cases, an application will read this file and ignore erroneous data at the end. As the CnW Recovery software develops, files will be stored at the correct length where possible. With contiguous files, extraction rates can be extremely high; with fragmented files, there can be major problems. However, automatic file carving routines are being added to merge fragments of common types of files to produce a complete, readable file.
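The "continue until a new start is found" rule can be sketched as below. Given the sector numbers where file starts were detected, each file is provisionally given everything up to the next start, which is why the reported size is an upper bound rather than the true length. The function name is hypothetical.

```python
def provisional_extents(starts, total_sectors):
    """Return (start_sector, sector_count) for each detected file start.

    starts must be sorted sector numbers; each file is assumed to run
    until the next start, or to the end of the media for the last one.
    """
    extents = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else total_sectors
        extents.append((start, end - start))
    return extents
```

For example, starts at sectors 8, 10 and 200 on a 1000 sector disk give the last file 800 sectors, however small the original file really was.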
With some file types, it is possible to determine if the data is still a valid data stream, and if incorrect data is detected, no more data will be added to the potential output file.
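PNG is one format where this kind of check is straightforward, because the file is a sequence of chunks that each carry their own length and CRC. The sketch below walks the chunks and returns the exact file length, or gives up as soon as invalid data is seen. It is an example of the general idea under the assumption that the file is unfragmented; it is not a description of how CnW validates any particular format.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_valid_length(data: bytes):
    """Return the byte length of the PNG at the start of data.

    Returns None if the chunk stream is invalid or ends before IEND.
    """
    if not data.startswith(PNG_MAGIC):
        return None
    pos = len(PNG_MAGIC)
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length  # length field + type + body + CRC
        if end > len(data):
            return None  # chunk runs past the available data
        body = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        if zlib.crc32(ctype + body) != stored_crc:
            return None  # not valid PNG data, so stop extending the file
        pos = end
        if ctype == b"IEND":
            return pos
    return None
```

When a length is returned, the output file can be cut at exactly that point instead of running on to the next file start.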
The following file types are detected by CnW Recovery software. The list grows on a regular basis, and we are happy to add new file types if sent details and examples; please contact CnW with any requests. The number of file types being verified and corrected is also increasing on a regular basis.
Fragmented files that may give problems with data carving
Data carving should always be considered the last resort in data recovery as it has severe limitations. There is no metadata, ie no name, date or valid size. There is also a big issue with fragmentation. Many files are not fragmented, but some are almost always fragmented, in particular email files such as .EDB and .PST. These are large files that grow over months or years, often to multi-GB lengths. Disk defragmentation may not help, and recovering a complete file with data carving is unlikely. Video files and photos can also be fragmented, and CnW has several routines that will attempt to reconstruct files from fragments.
Some data carving programs search for terminating strings as well as starting strings. CnW Recovery software does not use this technique as the results do not help. CnW will detect file starts and continue until a new file start is found. This can occasionally lead to files that are very large as they are padded with blank data. However, as part of CnW's advanced carving techniques, many files are validated and the lengths determined by analysis after recovery.
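The simplest form of length analysis after recovery is to drop blank sectors from the end of a carved file. The sketch below does only that, assuming 512 byte sectors and treating "blank" as all zero bytes. It is deliberately naive: a file that legitimately ends in zeros would be cut short, so format-aware validation is preferable where it exists.

```python
def trim_blank_tail(data: bytes, sector_size: int = 512) -> bytes:
    """Return data with trailing all-zero sectors removed."""
    end = len(data)
    while end > 0:
        # Start of the sector that contains the last remaining byte.
        start = (end - 1) // sector_size * sector_size
        if any(data[start:end]):
            break  # this sector holds real data, so keep it
        end = start
    return data[:end]
```

The result is still rounded up to a whole sector, because the final sector is kept intact if it contains any data at all.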
Data Carving and Reiser FS
Data carving assumes that all files start at the beginning of a sector (actually a cluster). Reiser uses a very efficient method of storing data, which very often means that a file starts in the middle of a sector and uses only part of that sector. For this reason, knowledge of the file system is required to recover the files.
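The effect of this assumption can be shown with a small sketch. An aligned scan tests only offsets that are multiples of the cluster size, so a signature that sits in the middle of a sector is never examined. The function name and the 512 byte cluster size are assumptions for illustration.

```python
def aligned_starts(image: bytes, magic: bytes, cluster_size: int = 512):
    """Return the offsets of clusters that begin with magic."""
    return [
        offset
        for offset in range(0, len(image), cluster_size)
        if image.startswith(magic, offset)
    ]


# Usage: one file start on a cluster boundary, one packed mid-sector.
image = bytearray(2048)
image[512:515] = b"\xff\xd8\xff"
image[700:703] = b"\xff\xd8\xff"
found = aligned_starts(bytes(image), b"\xff\xd8\xff")
```

Here `found` contains only offset 512; the start at offset 700 is invisible to the aligned scan.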
Once the carving process has been completed, the log can be viewed and files that have the same hash value can be deduplicated. It is very common for unallocated space to contain multiple copies of the same file.
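Hash-based deduplication can be sketched as below: the first file seen with a given hash is kept, and later files with the same hash are recorded as duplicates of it. SHA-256 and the function name are assumptions here; the hash used in the CnW log is not stated on this page.

```python
import hashlib


def deduplicate(files):
    """Split carved files into unique names and duplicates.

    files maps a file name to its content as bytes. Returns
    (unique_names, duplicates), where duplicates maps each dropped
    name to the name of the identical file that was kept.
    """
    first_seen = {}  # digest -> name of the first file with that content
    unique, duplicates = [], {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates[name] = first_seen[digest]
        else:
            first_seen[digest] = name
            unique.append(name)
    return unique, duplicates
```

Note that two carved copies of the same file only hash identically if they were also cut to the same length, which is another reason length verification matters.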
Many applications describe this process as data carving or file carving. However, CnW takes it a few stages further. One unusual and very useful feature is the creation of meaningful file names, and many files are verified for content structure and length. This gives a much higher success rate than just a simple string match at the start of a file. For handling fragmented files, look at the sections on advanced data carving and manual data carving.
Searching for strings
The forensic and commercial options enable multiple strings to be searched for, either on their own or whilst doing disk carving. Multiple tables of strings can be saved for later use.
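A search for several strings at once can be sketched as follows. This minimal version works on data already held in memory and is case sensitive; the function name is hypothetical, and a real disk search would also need to handle strings that span read buffers.

```python
def search_strings(data: bytes, needles):
    """Return {needle: [offsets]} for every occurrence of each needle in data."""
    hits = {needle: [] for needle in needles}
    for needle in needles:
        pos = data.find(needle)
        while pos != -1:
            hits[needle].append(pos)
            pos = data.find(needle, pos + 1)  # allow overlapping matches
    return hits
```

Dividing an offset by the sector size gives the sector in which the string was found.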
In addition to the list below, various files are recognised by content rather than signature. An example is Macintosh emails. Again, this list will grow. For some files, such as JPEG, DOC, MP3 and ZIP files, it is often possible to add some file name details such as the date of the file.
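As an example of taking a date from within a file, a ZIP local file header stores the modification time and date of the first entry in MS-DOS format at byte offsets 10 and 12. The sketch below decodes that date so it could be used as part of a generated file name. Which fields CnW actually uses is not stated here, so treat the function and its name as an assumption.

```python
import struct


def zip_date_hint(header: bytes):
    """Return 'YYYY-MM-DD' from the first ZIP local file header, or None."""
    if len(header) < 14 or not header.startswith(b"PK\x03\x04"):
        return None
    # Little-endian 16-bit fields: mod time at offset 10, mod date at 12.
    _dos_time, dos_date = struct.unpack_from("<HH", header, 10)
    year = 1980 + (dos_date >> 9)   # bits 15-9: years since 1980
    month = (dos_date >> 5) & 0x0F  # bits 8-5
    day = dos_date & 0x1F           # bits 4-0
    if not (1 <= month <= 12 and 1 <= day <= 31):
        return None  # implausible date, so do not use it
    return f"{year:04d}-{month:02d}-{day:02d}"
```

A carved archive could then be saved with a name that includes this date rather than just a sequence number.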
By scanning the complete disk, it is very common for multiple instances of a single file to be recovered. This is due to the operating system moving files, or as a result of a defrag operation. Fortunately, these duplicates can be removed by using the Deduplication feature in the log.