Python and lost files

Dave Angel davea at ieee.org
Fri Oct 2 06:53:56 EDT 2009


Carl Banks wrote:
> On Sep 30, 11:35 pm, "Timothy W. Grove" <tim_gr... at sil.org> wrote:
>   
>> Recently I purchased some software to recover some files which I had
>> lost. (A python project, incidentally! Yes, I should have kept better
>> backups!) They were nowhere to be found in the file system, nor in the
>> recycle bin, but this software was able to locate them and restore them.
>>     
>
> I could have used that yesterday, if it were able to work for a
> network Samba drive.  (Yeah, not likely.)
>
>
>   
>> I was just wondering if there was a way using python to view and recover
>> files from the hard drive which would otherwise remain lost forever?
>>     
>
> Obviously, if that program was able to do it, it's possible.
>
> On Unix-like OSes, and probably others, it's possible to read the raw
> data on a disk the same way as you would read any file.  So Python can
> do it without any system-level programming.  Recent versions (I think
> 2.6+) can use mmap, too, now that it supports an offset parameter.
>
> I don't think you can do that in Windows, though.  I think you'd have
> to use special system calls (via ctypes, for example).
>
>
> Carl Banks
>
>   
To write such a program, you face two challenges.  The first is to get 
read access to the raw sectors of the partition; the second is to analyze 
them to discover which ones are interesting and how they need to be 
combined to reconstruct the lost data.

In Windows, the first challenge is pretty easy for drives other than the 
system drive (usually drive C:, but not necessarily).  You open one of 
the following device names:
    \\.\X:               where X: is the logical drive letter
    \\.\PhysicalDriveN   where N is the hard drive number (0, 1, 2...)
Normally you'd use the PhysicalDriveN form only if the data is on a 
"deleted" or "foreign" partition that Windows doesn't recognize.
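
As a rough sketch of the device names above in Python: the helper names 
(volume_path, physical_drive_path, read_sectors) are made up for 
illustration, and the 512-byte sector size is an assumption, since some 
drives use 4096-byte sectors.  Opening a raw device this way generally 
needs administrator rights, and the reads should start and end on 
sector boundaries.

```python
SECTOR_SIZE = 512  # assumed; check the real sector size of your drive


def volume_path(letter):
    """Return the raw-volume device name for a logical drive letter."""
    return "\\\\.\\" + letter.rstrip(":").upper() + ":"


def physical_drive_path(index):
    """Return the device name for physical drive number `index`."""
    return "\\\\.\\PhysicalDrive%d" % index


def read_sectors(path, first_sector, count, sector_size=SECTOR_SIZE):
    """Read `count` whole sectors starting at sector `first_sector`."""
    # buffering=0 opens the file unbuffered, so the seek and read are
    # passed through as issued, with sector-aligned offset and length.
    with open(path, "rb", buffering=0) as dev:
        dev.seek(first_sector * sector_size)
        # An unbuffered read may return fewer bytes than requested;
        # a real tool would loop until it has them all.
        return dev.read(count * sector_size)
```

Under those assumptions, read_sectors(volume_path("D"), 0, 1) would 
return the first sector of drive D:.  The same function works on an 
ordinary disk-image file, which is a safer place to experiment.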

Naturally, make sure the scratch files and result files you create are 
going to a different partition/drive.

The second challenge is the file system format.  If you go with the 
physical drive, you'll have to parse the partitioning information to find 
the individual partitions within the drive, and once you get to a 
partition, you have to parse its particular file system.  Most likely 
that is NTFS (which has had several versions), but it could be FAT32, 
FAT16, or one of a couple of less likely candidates.
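
To give a feel for that first parsing step, here is a minimal sketch 
that reads a classic MBR partition table from sector 0 of a physical 
drive and then guesses the file system from a partition's boot sector.  
The function names are hypothetical, it assumes an MBR-style disk (GPT 
disks are not handled), and the file system guess relies only on the 
identifying strings in the boot sector, which a real recovery tool 
should not trust blindly.

```python
import struct

MBR_SIGNATURE = b"\x55\xaa"      # last two bytes of a valid MBR sector
PARTITION_TABLE_OFFSET = 446     # four 16-byte entries start here
# status, CHS start, type, CHS end, first sector (LBA), sector count
ENTRY_FORMAT = "<B3sB3sII"


def parse_mbr(sector):
    """Return the non-empty primary partition entries of an MBR sector."""
    if len(sector) < 512 or sector[510:512] != MBR_SIGNATURE:
        raise ValueError("no MBR signature found")
    partitions = []
    for i in range(4):
        offset = PARTITION_TABLE_OFFSET + 16 * i
        _status, _chs0, ptype, _chs1, first, count = struct.unpack_from(
            ENTRY_FORMAT, sector, offset)
        if ptype != 0:           # type 0 marks an unused slot
            partitions.append({"index": i, "type": ptype,
                               "first_sector": first,
                               "sector_count": count})
    return partitions


def guess_filesystem(boot_sector):
    """Guess the file system from the strings in a partition boot sector."""
    if boot_sector[3:11] == b"NTFS    ":     # OEM ID field
        return "NTFS"
    if boot_sector[82:90] == b"FAT32   ":    # FAT32 type string
        return "FAT32"
    if boot_sector[54:59] == b"FAT16":       # FAT16 type string
        return "FAT16"
    return None
```

Each entry's first_sector tells you where on the physical drive to read 
that partition's boot sector, which is what guess_filesystem expects.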

While you can also do this for a system partition, there are some 
challenges that I've no relevant experience with.

DaveA



