For Office 2007, Microsoft changed the file structure. Each file is now a set of XML sections zipped into a single file. In the operating system, the files are recognised by a four-character file extension such as .docx or .xlsx. Office 2010 uses a compatible format and structure.
When it comes to raw recovery, or carving, of these files, they all share the same basic file signature, the bytes PK 0x03 0x04, which is the same as a Zip file. Cnw Recovery software goes a few stages further and analyses the file content to determine the type of file. The file types it will detect are listed below. They are identified by expanding the Zip file and examining the XML sections within it. Strings such as
indicate the file type, in this case a DOCX file.
For more details see http://msdn.microsoft.com/en-us/library/bb264572(office.12).aspx
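The detection steps described above can be sketched in Python. This is only an illustration, not the Cnw Recovery implementation: it assumes the type is read from the `ContentType` attributes in the package's `[Content_Types].xml` part, and the function name `classify_office_file` and the lookup table are hypothetical. The content type strings are the main-part types used by Office Open XML packages.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Local file header signature shared by Zip archives and Office 2007+ files.
ZIP_SIGNATURE = b"PK\x03\x04"

# Main-part content types mapped to extensions (table assumed for this sketch).
OOXML = "application/vnd.openxmlformats-officedocument."
MAIN_CONTENT_TYPES = {
    OOXML + "wordprocessingml.document.main+xml": "docx",
    "application/vnd.ms-word.document.macroEnabled.main+xml": "docm",
    OOXML + "wordprocessingml.template.main+xml": "dotx",
    "application/vnd.ms-word.template.macroEnabledTemplate.main+xml": "dotm",
    OOXML + "spreadsheetml.sheet.main+xml": "xlsx",
    OOXML + "spreadsheetml.template.main+xml": "xltx",
    "application/vnd.ms-excel.sheet.macroEnabled.main+xml": "xlsm",
    "application/vnd.ms-excel.template.macroEnabled.main+xml": "xltm",
    OOXML + "presentationml.presentation.main+xml": "pptx",
    "application/vnd.ms-powerpoint.presentation.macroEnabled.main+xml": "pptm",
    OOXML + "presentationml.slideshow.main+xml": "ppsx",
}


def classify_office_file(data):
    """Return the likely extension for carved bytes, or None if unrecognised."""
    # Stage 1: the basic signature check, identical for every Zip-based file.
    if not data.startswith(ZIP_SIGNATURE):
        return None
    # Stage 2: expand the archive and read the XML part that lists content types.
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            root = ET.fromstring(archive.read("[Content_Types].xml"))
    except (zipfile.BadZipFile, KeyError, ET.ParseError):
        return None  # a plain Zip, or a damaged or incomplete package
    # Stage 3: look for a content type string that identifies the file type.
    for element in root.iter():
        extension = MAIN_CONTENT_TYPES.get(element.get("ContentType", ""))
        if extension:
            return extension
    return None
```

A caller would pass the carved bytes, for example `classify_office_file(open("carved.bin", "rb").read())`, and fall back to a generic .zip extension when the result is `None`. A fragmented or truncated file will usually fail at the archive-opening stage, so a real carver would need extra handling that this sketch leaves out.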
Extension | File type
docx | Word document
docm | Macro enabled document
dotx | Word template
dotm | Word macro enabled template
xlsx | Excel worksheet
xltx | Excel template
xlsm | Excel macro enabled worksheet
xltm | Excel macro enabled template
pptx | PowerPoint presentation
pptm | PowerPoint presentation, macro enabled
ppsx | PowerPoint slide show
When possible, the name created on carving is constructed using the author's name and the name of the person who last edited the file. The date of the file is set to the last modified date, though this is not added to the file name.
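As a rough illustration of this naming step, the sketch below reads the author, last editor and modified date from the package's `docProps/core.xml` part, where Office Open XML stores them as `dc:creator`, `cp:lastModifiedBy` and `dcterms:modified`. The function `carved_name_and_date`, the numeric prefix and the exact name layout are assumptions made for the example; the source does not describe the format the software actually uses.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET
from datetime import datetime

# XML namespaces used by the core properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def _clean(text):
    """Reduce a metadata value to characters that are safe in a file name."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


def carved_name_and_date(data, extension, index):
    """Return (file name, last modified datetime or None) for carved bytes.

    The name layout (index, author, last editor) is hypothetical.
    """
    author = editor = ""
    modified = None
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            root = ET.fromstring(archive.read("docProps/core.xml"))
    except (zipfile.BadZipFile, KeyError, ET.ParseError):
        root = None  # no usable metadata, fall back to the index alone
    if root is not None:
        author = _clean(root.findtext("dc:creator", default="", namespaces=NS))
        editor = _clean(root.findtext("cp:lastModifiedBy", default="", namespaces=NS))
        stamp = root.findtext("dcterms:modified", default="", namespaces=NS).strip()
        try:
            # Assumes the common UTC form, e.g. 2009-03-14T10:20:30Z.
            modified = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        except ValueError:
            modified = None
    parts = [f"{index:06d}"] + [p for p in (author, editor) if p]
    return "_".join(parts) + "." + extension, modified
```

The returned date is kept separate from the name, matching the behaviour described above. A caller could apply it to the recovered file with `os.utime` after writing the data to disk.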