File System Recovery vs. RAW Recovery

This article compares File System Recovery with RAW Recovery. File System Recovery can preserve the directory tree, but sometimes RAW Recovery is the only option.

To recover data from a victim disk, data recovery software has two methods available. It can try to detect files using file system structures that are still available (file system recovery), or it can resort to file carving (RAW recovery).

File System Recovery.

File System Recovery using file system structures: The data recovery software scans the disk for things like boot sectors, directory structures, indexes, File Allocation Tables (FAT), and MFT entries. It then has to combine that information to work out the volume parameters: where the volume starts, the size of the volume, and the cluster size.

Image 1: A file record in the MFT

If it cannot figure out, for example, the cluster size, all references to clusters are meaningless. If the file system structures for file X point to cluster N, then the software needs to know the start of the volume and the cluster size to translate that cluster number into a location on the disk.
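
To illustrate why both parameters matter, here is a minimal sketch of the translation from a cluster number to a byte offset on the disk. The function name, the 512-byte sector size and the example values are assumptions for illustration, and the sketch assumes NTFS-style numbering in which cluster 0 is the first cluster of the volume (FAT numbers its data clusters differently).

```python
SECTOR_SIZE = 512  # assumed bytes per sector


def cluster_to_byte_offset(cluster, volume_start_sector, sectors_per_cluster,
                           sector_size=SECTOR_SIZE):
    """Translate a cluster number into an absolute byte offset on the disk.

    Assumes cluster 0 sits at the very start of the volume.
    """
    return (volume_start_sector + cluster * sectors_per_cluster) * sector_size


# Hypothetical volume starting at sector 2048.
right = cluster_to_byte_offset(100, 2048, 8)   # real cluster size: 8 sectors
wrong = cluster_to_byte_offset(100, 2048, 16)  # wrongly guessed: 16 sectors
print(right, wrong)  # 1458176 1867776
```

With the wrong cluster size the software ends up reading from a completely different place on the disk, even though the cluster number itself was recovered correctly.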

File recovery using file system structures often allows the software to also determine the original folder structure. In NTFS it is possible to reconstruct a directory tree even without indexes, purely by using information that can be found in MFT entries.
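
As a rough sketch of how this can work: the file name information in an NTFS MFT entry includes a reference to the parent directory's MFT record, and the root directory is MFT record 5. Following those parent references upward yields a full path. The `records` dictionary and `build_path` function below are hypothetical stand-ins for already parsed MFT entries, not the method of any particular recovery tool.

```python
ROOT_RECORD = 5  # the NTFS root directory is MFT record number 5


def build_path(record_no, records):
    """Rebuild a path by following parent references up to the root.

    records maps an MFT record number to a (name, parent_record_no) tuple.
    """
    parts = []
    seen = set()
    current = record_no
    while current != ROOT_RECORD:
        # A missing or looping parent means the chain to the root is broken.
        if current in seen or current not in records:
            parts.append("<orphan>")
            break
        seen.add(current)
        name, parent = records[current]
        parts.append(name)
        current = parent
    return "/" + "/".join(reversed(parts))


# Hypothetical parsed entries: record number -> (name, parent record number).
records = {
    64: ("Photos", 5),
    65: ("holiday.jpg", 64),
    70: ("lost.txt", 999),  # parent entry 999 was not found on disk
}
print(build_path(65, records))  # /Photos/holiday.jpg
print(build_path(70, records))  # /<orphan>/lost.txt
```

Files whose parent entry can no longer be found can still be recovered with their own name. They just cannot be placed in the tree.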

Image 2: File system based recovery: file names and directory structure are recovered too.

RAW Recovery.

RAW Recovery or file carving is possible even when the file system or its properties, such as cluster size, are unknown. It relies on knowledge of actual file properties. Many file types start with an easy to recognize sequence of bytes, also called ‘Magic Numbers’. For example, GIF image files all start with the ASCII codes for “GIF” (in hex 47 49 46).
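
A minimal sketch of such a check is shown below. The GIF signature is the one mentioned above; the other entries are well-known signatures added for illustration, and the `MAGIC_NUMBERS` table and `identify` function are hypothetical names rather than part of any real tool.

```python
# Signatures for a few common file types (illustrative, far from exhaustive).
MAGIC_NUMBERS = {
    "gif": bytes.fromhex("47 49 46"),                 # ASCII "GIF"
    "jpeg": bytes.fromhex("FF D8 FF"),
    "png": bytes.fromhex("89 50 4E 47 0D 0A 1A 0A"),
    "pdf": bytes.fromhex("25 50 44 46"),              # ASCII "%PDF"
}


def identify(data):
    """Return the file type whose magic number starts data, or None."""
    for file_type, magic in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return file_type
    return None


print(identify(b"GIF89a" + bytes(506)))  # gif
print(identify(bytes(512)))              # None
```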

Image 3: Magic number at the start of a GIF file

So all the software has to do is check each sector for the occurrence of this byte sequence to know it may have found the start of a GIF file. Easy to detect magic numbers like these exist for a lot of file types, but not all. These sequences are also not exclusive: it is quite possible, even likely, that the byte sequence 47 49 46 occurs many times on a hard disk without marking the start of a GIF file. So even for software performing RAW recovery or file carving, it is an advantage to know something about the file system.

Knowing where the file system starts and knowing the cluster size limits the places to look for magic numbers, as files start at the beginning of a cluster. Not knowing the start of the file system and the cluster size means the software has to read each sector and check whether it starts with a magic number.
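
The difference between the two scans can be sketched as follows. This is a simplified illustration on a tiny in-memory "disk"; the `find_candidates` function, the 512-byte sector size and the volume layout (volume starting at sector 8, 4 sectors per cluster) are all assumptions.

```python
import io

SECTOR_SIZE = 512  # assumed bytes per sector


def find_candidates(image, magic, step, start=0):
    """Return offsets start, start+step, ... at which magic is found."""
    image.seek(0, io.SEEK_END)
    size = image.tell()
    hits = []
    for offset in range(start, size, step):
        image.seek(offset)
        if image.read(len(magic)) == magic:
            hits.append(offset)
    return hits


# Build a 32-sector disk; assumed volume start: sector 8, cluster: 4 sectors.
disk = bytearray(32 * SECTOR_SIZE)
disk[8192:8195] = b"GIF"  # a real GIF header at the start of a cluster
disk[9728:9731] = b"GIF"  # stray bytes at a sector start, but mid-cluster

# File system unknown: every sector has to be checked.
print(find_candidates(io.BytesIO(disk), b"GIF", SECTOR_SIZE))
# prints [8192, 9728]

# Volume start and cluster size known: only cluster boundaries are checked.
print(find_candidates(io.BytesIO(disk), b"GIF", 4 * SECTOR_SIZE,
                      start=8 * SECTOR_SIZE))
# prints [8192]
```

The cluster-aligned scan reads fewer locations and also skips the false positive that sits in the middle of a cluster.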

Image 4: Same disk scanned as in Image 1. This software fell back to a RAW scan and is organizing files by file type.

Drawbacks of RAW File Recovery.

One major drawback of RAW recovery is that the files do not retain their original file name. The original folder structure is unknown. Instead the software organizes files by file type.

A second disadvantage of RAW file carving is that fragmented files will be corrupt to some degree after recovery. The software assumes that when it finds a magic number for a specific file type, the rest of the file will follow in one piece.
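
The effect of that assumption can be shown with a toy example. The 4-byte cluster size, the file contents and the `carve_contiguous` function are made up for illustration, and the file length is assumed to be known, which a real carver would have to derive from a header, a footer or a size limit.

```python
CLUSTER = 4  # unrealistically small cluster size, in bytes, for readability


def carve_contiguous(disk, start, length):
    """Carve length bytes from start, assuming the file is in one piece."""
    return disk[start:start + length]


# A three-cluster file whose first cluster starts with a magic number.
original = b"GIF1" + b"BBBB" + b"CCCC"
other_file = b"xxxx"

# On disk the file is fragmented: another file's cluster sits after cluster 1.
disk = original[0:4] + other_file + original[4:8] + original[8:12]

carved = carve_contiguous(disk, 0, len(original))
print(carved)              # b'GIF1xxxxBBBB'
print(carved == original)  # False
```

The carved file starts correctly, but it contains a cluster of foreign data and is missing its own last cluster.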

Digital image recovery software often uses the RAW file carving technique, which isn’t really a problem as file names have no relation to the image subject. They are more or less generic (image011, image012, etc.). The files are also often in a single directory on the memory card, so recovering a directory structure is not a requirement.

During my tests of data recovery software I found that the majority of the software is very weak at recovering using file system structures. As a kind of fallback, it then uses RAW file carving to produce results.


Originally published at www.disktuna.com on October 19, 2016.