Data carving

 

Data carving, file carving, or raw file recovery is a powerful way to extract files and data from corrupted or damaged media.

Data carving, or file carving, is the process of reading files without reference to a file system.  This is required when the media has been corrupted or changed, or the disk is badly damaged.  The technique can be applied to any type of disk that stores data on sector boundaries, which includes camera memory as well as hard drives.  It is based on the fact that most files start with a recognisable data signature.

CnW Recovery software recovers raw files in two different ways:

  • Unread sectors are scanned after a logical read of files, i.e. only sectors that have not been used for files or seen within the file system are scanned
  • Or as a complete disk scan, extracting all possible files. This will also recover files that are pointed to by the file system, including those that the operating system can still read normally.

Extracting files from media while ignoring the file structure is performed by looking for unique patterns in the data.  Typically, the first few bytes of a file give a good indication; for instance, all PK zip files start with the characters PK. The software has a (growing) list of such signatures built into the code.  CnW Recovery goes much further than most recovery programs in that it will often generate a file name from metadata within the file. The list at the bottom of this page grows on a regular basis, so please contact us if a particular file type is required. Download the free demo program now and see the files that can be recovered.
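The signature test described above can be sketched in a few lines of Python. This is only an illustration: the `SIGNATURES` table and the `identify` function are hypothetical names, and the table is a small subset of widely documented file signatures, not the list built into CnW Recovery.

```python
from typing import Optional

# Illustrative subset of well-known start-of-file signatures.
# This is NOT the CnW signature list, just a few common examples.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),        # PK zip local file header
    (b"\xff\xd8\xff", "jpeg"),     # JPEG start-of-image marker
    (b"%PDF", "pdf"),              # Adobe PDF
    (b"GIF8", "gif"),              # GIF87a / GIF89a
    (b"\xd0\xcf\x11\xe0", "doc"),  # compound document (shared by several formats)
]


def identify(block: bytes) -> Optional[str]:
    """Return a likely file type if the block starts with a known signature."""
    for magic, extension in SIGNATURES:
        if block.startswith(magic):
            return extension
    return None
```

For example, `identify(b"PK\x03\x04" + rest_of_sector)` returns `"zip"`, while a sector of zeros returns `None`. A short signature such as the compound document one matches several formats, which is why further verification is useful.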

Advanced data carving

    CnW has some advanced features that go beyond simple signature recognition.  These are listed below.  Each feature depends on the actual file format.

    • File verification
    • Generation of filenames based on content
    • File length verification
    • Reconstructing fragmented files - eg fragmented video recovery

How to read unallocated space?

    With each logical file system, e.g. NTFS or CD-ROM, there is an option to read the unallocated space. The disk is first read logically, so all used sectors are known. After this, all previously unread sectors are read, and each sector or cluster is tested to see if it is a possible start of a file.  The test is based on the first few characters of a file, sometimes followed by more extensive tests looking for a certain type of data. For NTFS disks, any file compression will be detected and automatically expanded. It does not matter whether just one file or all files are compressed; they will be detected.  It will even detect NTFS compressed files left behind on a FAT, Mac, or HPOFS disk. (This is more than many data recovery programs will achieve.)

    One issue with raw recovery is that the start of a file can be easy to detect, but often the length is not clear.  CnW Recovery software will often continue adding to a file until a new unique start is found.  In these cases a file can be shown as many MBs, although the actual data is only 100K. Fortunately, in many cases an application will read this file and ignore erroneous data at the end. As the CnW Recovery software develops, files will be stored at the correct length where possible. With contiguous files, extraction rates can be extremely high; with fragmented files, there can be major problems. However, automatic file carving routines are being added to merge fragments of common file types into a complete, readable file.
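As a rough sketch of the sequence described above, the following Python walks a disk image cluster by cluster, skips clusters already accounted for by the logical read, and tests the start of each remaining cluster against a signature table. The names `scan_unread_clusters`, `used` and `CLUSTER_SIZE` are assumptions made for this example, the image is held in memory for simplicity, and NTFS decompression is not attempted.

```python
from typing import Iterator, Set, Tuple

CLUSTER_SIZE = 4096  # assumed cluster size; real media may differ

# Small illustrative signature table (not the CnW list).
SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"\xff\xd8\xff": "jpeg",
    b"%PDF": "pdf",
}


def scan_unread_clusters(
    image: bytes, used: Set[int], cluster_size: int = CLUSTER_SIZE
) -> Iterator[Tuple[int, str]]:
    """Yield (cluster index, file type) for unread clusters that look like a file start.

    `used` holds the cluster numbers already read via the file system.
    """
    total_clusters = len(image) // cluster_size
    for index in range(total_clusters):
        if index in used:
            continue  # already recovered by the logical read
        offset = index * cluster_size
        head = image[offset:offset + 16]  # only the first few bytes are tested
        for magic, extension in SIGNATURES.items():
            if head.startswith(magic):
                yield index, extension
                break
```

A real scan would read the device in chunks rather than loading it all into memory, and would follow each signature match with the more extensive format tests mentioned above.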

    With some file types, it is possible to determine if the data is still a valid data stream, and if incorrect data is detected, no more data will be added to the potential output file.
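The default behaviour described in this section, where a carved file runs until the next detected file start, can be illustrated as follows. The function name `extents_until_next_start` is hypothetical, and the sketch works purely on cluster numbers rather than on real media.

```python
from typing import List, Tuple


def extents_until_next_start(
    starts: List[int], total_clusters: int
) -> List[Tuple[int, int]]:
    """Return (start cluster, length in clusters) for each detected file start.

    Each file is assumed to run until the next detected start, or to the end
    of the media for the last one, so the lengths are upper bounds only.
    """
    ordered = sorted(set(starts))
    ends = ordered[1:] + [total_clusters]
    return [(start, end - start) for start, end in zip(ordered, ends)]
```

With starts detected at clusters 2, 10 and 50 on a 100-cluster disk, this gives `[(2, 8), (10, 40), (50, 50)]`. The last file is reported as 50 clusters long even if its real data occupies only a few, which is exactly the over-length effect described above.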

    The following file types are detected by CnW Recovery software. The list grows on a regular basis, and we are happy to add new file types if sent details and examples. Please contact CnW with any requests. The number of file types being verified and corrected is also increasing on a regular basis.

Fragmented files that may give problems with data carving

    Data carving should always be considered the last resort in data recovery as it has severe limitations.  There is no metadata, i.e. no name, date or valid size.  There is also a big issue with fragmentation.  Many files are not fragmented, but some are almost always fragmented, in particular email files such as .EDB and .PST.  These are large files that grow over months or years, often to multi-GB lengths.  Disk defragmentation may not help, and recovering a complete file with data carving is unlikely.  Video files and photos can also be fragmented, and CnW has several routines that will attempt to reconstruct files from fragments.

Terminating strings

    Some data carving programs search for terminating strings as well as starting strings.  CnW Recovery software does not use this technique as the results do not help.  CnW will detect file starts and continue until a new file start is found.  This can occasionally lead to files that are very large as they are padded with blank data.  However, as part of CnW advanced carving techniques, many files are validated and their lengths determined by analysis after recovery.
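As one example of determining length by analysis rather than by a terminating string, a BMP file stores its total size as a 4-byte little-endian value at offset 2 of its header. The sketch below trims an over-long carved BMP to that size. The function names are hypothetical and the checks are deliberately minimal; this is not a description of how CnW validates files.

```python
import struct
from typing import Optional

BMP_FILE_HEADER_SIZE = 14  # "BM", file size, two reserved words, pixel data offset


def bmp_length(data: bytes) -> Optional[int]:
    """Return the file length recorded in a BMP header, or None if implausible."""
    if len(data) < BMP_FILE_HEADER_SIZE or data[:2] != b"BM":
        return None
    (size,) = struct.unpack_from("<I", data, 2)  # little-endian 32-bit size
    if size < BMP_FILE_HEADER_SIZE or size > len(data):
        return None  # header claims less than a header, or more data than was carved
    return size


def trim_bmp(data: bytes) -> bytes:
    """Cut a carved BMP back to its recorded length; leave other data untouched."""
    size = bmp_length(data)
    return data if size is None else data[:size]
```

If a carved file is 600 bytes long but its header records 100 bytes, `trim_bmp` returns only the first 100 bytes and discards the padding picked up before the next file start.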

Data Carving and Reiser FS

    Data carving assumes that all files start at the beginning of a sector (actually a cluster).  Reiser uses a very efficient method of storing data, which very often means that a file starts in the middle of a sector and uses only part of the sector.  For this reason, knowledge of the file system is required to recover the files.

Deduplication

    Once the carving process has been completed, the log can be viewed and files that have the same hash value can be deduplicated.  It is very common for unallocated space to contain multiple copies of the same file.
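The idea of removing files with the same hash value can be sketched as below. The hash algorithm used by CnW is not stated here, so SHA-256 is an assumption, and `deduplicate` is a hypothetical helper that works on in-memory contents for brevity.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate(files: Dict[str, bytes]) -> Tuple[List[str], List[str]]:
    """Split carved files into (names to keep, names that duplicate an earlier file).

    Files are compared by SHA-256 digest (an assumed choice of hash).
    The first file seen with a given digest is the one that is kept.
    """
    seen = set()
    keep: List[str] = []
    duplicates: List[str] = []
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen.add(digest)
            keep.append(name)
    return keep, duplicates
```

Note that two carved copies of the same file only hash identically if they were recovered at the same length, so length correction should come before deduplication.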

File carving

    Many applications describe this process as data carving or file carving.  However, CnW takes it a few stages further.  One unusual and very useful feature is the creation of meaningful file names, and many files are also verified for content structure and length.  This gives a much higher success rate than a simple string match at the start of a file.  For handling fragmented files, look at the sections on advanced data carving and manual data carving.

Searching for strings

    The forensic and commercial options enable multiple strings to be searched for, either on their own or while disk carving is in progress.  Multiple tables of strings can be saved for later use.
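A simple way to picture a multi-string search is shown below. The function `find_strings` is a hypothetical illustration that searches one in-memory buffer for every string in a table and reports each hit with its byte offset.

```python
from typing import Iterable, List, Tuple


def find_strings(data: bytes, needles: Iterable[bytes]) -> List[Tuple[int, bytes]]:
    """Return every (offset, string) hit in the data, sorted by offset."""
    hits: List[Tuple[int, bytes]] = []
    for needle in needles:
        if not needle:
            continue  # an empty string would match everywhere
        position = data.find(needle)
        while position != -1:
            hits.append((position, needle))
            position = data.find(needle, position + 1)
    return sorted(hits)
```

When a disk is read in chunks, consecutive chunks need to overlap by at least the longest search string minus one byte, otherwise a string that straddles a chunk boundary would be missed.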

File names

    In addition to the list below, various files are recognised by content rather than signature. An example is Macintosh emails.  Again, this list will grow.  For some files, such as JPEG, DOC, MP3 and Zip files, it is often possible to add some file name details such as the date of the file.

    When the complete disk is scanned, it is very common for multiple instances of a single file to be recovered.  This is due to the operating system moving files, or as a result of a defrag operation. Fortunately, these duplicates can be removed by using the Deduplication feature in the log.
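To show how a name and date might be taken from data inside a file, the sketch below reads the first local file header of a Zip archive, which holds the name and DOS-format modification date of the first file in the archive. The header layout is the standard Zip one, but the function `suggest_zip_name` and the naming pattern it produces are assumptions made for this example, not the names CnW generates.

```python
import struct
from typing import Optional

# Standard 30-byte Zip local file header, little-endian.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def suggest_zip_name(data: bytes) -> Optional[str]:
    """Build a name such as '2010-05-17_report.zip' from the first archive entry."""
    if len(data) < LOCAL_HEADER.size:
        return None
    (signature, _version, _flags, _method, _dos_time, dos_date,
     _crc, _packed, _unpacked, name_length, _extra_length) = LOCAL_HEADER.unpack_from(data, 0)
    if signature != b"PK\x03\x04" or name_length == 0:
        return None
    raw_name = data[LOCAL_HEADER.size:LOCAL_HEADER.size + name_length]
    if len(raw_name) < name_length:
        return None  # header is truncated
    entry = raw_name.decode("cp437", errors="replace")
    # Keep only the last path component, without its extension.
    base = entry.replace("\\", "/").rstrip("/").split("/")[-1]
    stem = base.rsplit(".", 1)[0] or "unnamed"
    # DOS date: bits 15-9 year since 1980, bits 8-5 month, bits 4-0 day.
    year = 1980 + (dos_date >> 9)
    month = (dos_date >> 5) & 0x0F
    day = dos_date & 0x1F
    return f"{year:04d}-{month:02d}-{day:02d}_{stem}.zip"
```

The same approach applies to other formats that carry a title, author or date in a fixed or easily located position.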

 

 

 

 

File extension | Format | Notes
3GP | Video format | Optional fragment processing
abc | Flow chart database file |
ai | Adobe Illustrator |
aiff | Audio format | File length is determined from header
ani | Animated pointer file |
atn | ATN files |
avi | Audio Visual movie files | File length is determined from header, and file validated. Fragmented files can be recovered
bmp | BMP Bitmap file | The recovered file is set to correct length, based on information in the header
cab | CAB files | Used for software distribution by Microsoft
cam | Casio camera |
cbd | |
cpi | Clip Inf files for AVCHD | The files will be set to correct length
crw | Canon Raw Files | Will add name and correct file size
csv | Comma separated files |
cur | Cursor file |
dbf | dBase file |
dbx | Outlook Express |
dcm | DICOM medical images |
doc | Microsoft Word | Many compound documents have the same signature, so there can be false matches. Files start D0 CF 11 E0 etc. Will extract date and title
doc | Word 2 | Many programs produce DOC files
docx | Office 2007 files | The files are part of a Zip file; author and date are added to the name. Can be defragmented after data carving
drw | Micrografx designer format |
emf | Enhanced meta file |
exe | Microsoft EXE |
fm3 | FM3 files |
fmt | Printer driver |
fon | Font file |
fp7 | Filemaker Pro 7 |
gif | GIF graphics file |
hlp | Help file |
hlp | Help files |
hqx | BinHex for Macintosh |
htm | HTML file |
ifo | IFO file from DVD video |
indx | INDX files from NTFS | These are index files from the NTFS directory
jpeg | JPEG, picture files | Checked for validity. Adds date and camera name to file name, and verifies file length. Joins fragments
lha | LHA compression program |
lnk | Link files from Windows |
macr | Macintosh resource fork |
mdb | Access database file |
mft | NTFS Master File table blocks | This is not a format, but system data
mmm | Multimedia Movie file |
mov | MOV movie files | File length of some versions is determined from header
mp3 | MP3 audio files | Name is created from title
mp4 | MP4 Audio files | Name is created from title
mpeg | Video MPEG files |
msc | MSC file |
msi | Microsoft installer |
mts | AVCHD data files | The video and audio data for AVCHD video
nb7 | NovaBack V 7 |
nsf | Lotus Notes |
odt | Open Office | File name extracted from title and date; defragmentation routines included
orf | Olympus Raw image files | Date extracted
pcx | Paintbrush file |
pdf | Adobe Portable Document File | File name sometimes extracted
ppd | Printer description file |
psd | Photoshop file |
pst | Outlook file | File length determined from header
qdf | Quicken accounts package |
qtch | QuickTime |
rmi | MIDI sequence file |
scf | Symphony spell checker |
tif | TIFF files | Both Intel and Motorola byte orders are detected. The file date is added to the name, and the length is verified
url | Internet Shortcuts | Name extracted from URL
wav | Audio sound file |
wb1 | Webshots image |
wdp | Windows media photo | New Microsoft graphics format
wk3 | Lotus spreadsheet |
wk4 | Lotus spreadsheet |
wks | Lotus 123 spreadsheet |
wmf | Windows Metafile - graphics |
wmv | Windows Media Video files |
wp5 | WordPerfect 5 |
wpl | WPL file |
xlb | Excel database extension file |
xls | Excel spreadsheet |
xlsx | Excel 2007 | 2007 Excel file
xml | XML File |
zip | PK Zip and Winzip files | Name determined from first file in archive. In a corrupted Zip archive, complete files are extracted and stored in a new Zip file. Can be defragmented after data carving
