check if file is MS Word or PDF file

Sean DiZazzo half.italian at gmail.com
Sat Sep 27 19:37:57 EDT 2008


On Sep 27, 4:01 pm, "Chris Rebert" <c... at rebertia.com> wrote:
> On Sat, Sep 27, 2008 at 3:42 PM, Michael Crute <mcr... at gmail.com> wrote:
> > On Sat, Sep 27, 2008 at 5:43 PM, A. Joseph <joefa... at gmail.com> wrote:
> >> What should I look for in a file to determine whether it is an
> >> MS Word file, an Excel file, a PDF file, a Zip file, etc.?
>
> >> I don't want to check the file extension.
> >> os.path.splitext('Filename.jpg') will produce a tuple of filename and
> >> extension, but some files don't even have an extension and can still be
> >> read by MS Word or Notepad. I want to be 100% sure of the file type.
>
> > You could use the mimetypes module...
>
> > >>> import mimetypes
> > >>> mimetypes.guess_type("LegalNotices.pdf")
> > ('application/pdf', None)
>
> Looking at the docs for the mimetypes module, it just guesses based on
> the filename (and extension), not the actual contents of the file, so
> it doesn't really help the OP, who wants to make sure their program
> isn't misled by an inaccurate extension.
>
> Regards,
> Chris
> --
> Follow the path of the Iguana... http://rebertia.com
>
>
>
> > -mike
>
> > --
> > ________________________________
> > Michael E. Crute
> > http://mike.crute.org
>
> > God put me on this earth to accomplish a certain number of things.
> > Right now I am so far behind that I will never die. --Bill Watterson
> > --
> > http://mail.python.org/mailman/listinfo/python-list

The 'file' command identifies a file by inspecting its contents rather than
its name. A build of it for Windows is available at:

http://sourceforge.net/project/showfiles.php?group_id=23617
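
The same content-based check can be roughed out in pure Python by comparing
the first few bytes of a file against well-known signatures. This is only a
minimal sketch, not a replacement for 'file': the names SIGNATURES,
sniff_bytes and sniff_file are made up for illustration, the signature table
is far from complete, and the Zip refinement assumes the usual Office Open
XML layout (word/document.xml for .docx, xl/workbook.xml for .xlsx).

```python
import zipfile

# Leading byte signatures ("magic numbers") for a few common formats.
# Illustrative only; real tools such as 'file' know thousands of these.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    # OLE2 compound file: container used by legacy .doc/.xls/.ppt.
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),
    # Zip local file header; also the container for .docx/.xlsx.
    (b"PK\x03\x04", "zip"),
]


def sniff_bytes(head):
    """Return a format name for the given leading bytes, or None."""
    for magic, kind in SIGNATURES:
        if head.startswith(magic):
            return kind
    return None


def sniff_file(path):
    """Guess a file's type from its contents, ignoring its name."""
    with open(path, "rb") as f:
        kind = sniff_bytes(f.read(8))
    if kind == "zip":
        # Assumption: newer Office files are Zip archives with these members.
        with zipfile.ZipFile(path) as zf:
            names = set(zf.namelist())
        if "word/document.xml" in names:
            return "docx"
        if "xl/workbook.xml" in names:
            return "xlsx"
    return kind
```

Calling sniff_file("LegalNotices") on a PDF that has lost its extension
should still return "pdf". Telling a legacy .doc from an .xls means parsing
the OLE2 container itself, which this sketch does not attempt, and no
signature check can make you 100% sure: it only shows that the file starts
the way that format should.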

~Sean


