Code to recognize MS-Word document files?

Christos TZOTZIOY Georgiou tzot at sil-tec.gr
Tue Mar 4 18:46:06 EST 2003


On Tue, 04 Mar 2003 17:51:53 +0100, rumours say that "Martin v. Löwis"
<martin at v.loewis.de> might have written:

>The GNU file command can do this recognition, at least partially. I'm not
>aware of a Python wrapper around it, but it shouldn't be too difficult.

I have a module, magic.py, available from
<URL:http://www.sil-tec.gr/~tzot/python/>.  It provides a file_magic
function that uses a copy of /etc/magic (or /usr/share/magic, I think, on
Linux).  The code needs cleaning but is usable; the only functionality
I did not implement is peeking at offsets > 0.  I also have this file.py in
my win2k path:

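import os
import sys
from glob import glob

from tzot.magic import file_magic

# The win2k shell does not expand wildcards, so glob each argument here.
for arg in sys.argv[1:]:
    for filename in glob(arg):
        if os.path.isdir(filename):
            print("%s: folder" % filename)
        else:
            # file_magic returns a description based on the magic file.
            print("%s: %s" % (filename, file_magic(filename)))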

Usual disclaimers apply.
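
To make the idea concrete for the original question, here is a minimal
sketch of an offset-0 signature check.  It is not the code in magic.py;
the names SIGNATURES and sniff are hypothetical, and the table holds only
three well-known signatures, where a real magic file has many more.  Note
that the OLE2 signature is shared by other legacy Office files (.xls,
.ppt), so a match only means "possibly MS Word".

```python
# Hypothetical offset-0 signature table; a real magic file has far more entries.
SIGNATURES = [
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",
     "OLE2 compound document (possibly MS Word)"),
    (b"%PDF-", "PDF document"),
    (b"{\\rtf", "Rich Text Format"),
]


def sniff(filename):
    """Return the description of the first signature matching at offset 0."""
    longest = max(len(sig) for sig, _ in SIGNATURES)
    with open(filename, "rb") as f:
        head = f.read(longest)
    for sig, description in SIGNATURES:
        if head.startswith(sig):
            return description
    # Nothing matched: use the same fallback word the file command uses.
    return "data"
```

Telling a Word document apart from other OLE2 files would need a look
inside the compound file, which this sketch does not attempt.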
-- 
TZOTZIOY, I speak England very best,
bofh at sil-tec.gr
(I'm a postmaster luring spammers; please spam me!
...and my users won't ever see your messages again...)



