Code to recognize MS-Word document files?

WP warrenpstma at _______.com.hotmail
Tue Mar 4 12:01:33 EST 2003


Grant Edwards wrote:
> I'm looking for a snippet of python that I can use to determine
> if a file is a MS-Word document.  People around here seem to
> have gotten into the habit of attaching MS-Word files without a
> ".doc" on the name.  
> 
> Even with the .doc, the mimetypes module doesn't seem to get it
> right.  :(
> 

Extremely Bogus Hack: (probably no help, sorry!)

	f = open(wordfilename, 'rb')
	data = f.read(10)  # arbitrary # of bytes
	f.close()
	hexstr = ''  # start empty; naming it 'hex' would shadow the builtin
	for b in bytearray(data):  # bytearray yields ints on Python 2 and 3
		hexstr = hexstr + ('%02x ' % b)
	if hexstr == 'd0 cf 11 e0 a1 b1 1a e1 00 00 ':
		print('match')
	# My Word files all seem to start like this
	# (MS Office 2000 Word files):
	#  d0 cf 11 e0 a1 b1 1a e1 00 00 00 00 00 00
	#  00 00 00 00 00 00 00 00 00 00
	# Your mileage may vary.
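
The same check can be done by comparing the raw bytes directly, without
building a hex string. This is only a minimal sketch, assuming the first
eight bytes listed above are the signature to look for; the names
OLE2_MAGIC and looks_like_ole2 are made up for illustration. Those eight
bytes mark the OLE2 compound-file container, which other Office documents
(Excel, PowerPoint) also use, so a match suggests "some Office file"
rather than "definitely Word".

```python
# Sketch only: OLE2_MAGIC and looks_like_ole2 are illustrative names,
# not part of any library.
OLE2_MAGIC = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'


def looks_like_ole2(path):
    """Return True if the file at `path` starts with the 8-byte signature."""
    with open(path, 'rb') as f:
        # Read exactly as many bytes as the signature and compare them.
        return f.read(len(OLE2_MAGIC)) == OLE2_MAGIC
```

A short or empty file simply fails the comparison, so no separate length
check is needed.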

Warren
-- 
--------------------------------------
warren.postma at adaptivenetworks.on.ca
Toronto Ontario Canada





