Determine file type (binary or text)

Michael Peuser mpeuser at web.de
Wed Aug 13 08:23:39 EDT 2003


Hi,
Yes, there is more than just Unix in the world ;-)
Windows directories have no way to record the content type of their files.
The accepted convention is the three-letter extension, though this rule is
not strictly followed (lots of files have no extension nowadays!).
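
For what it is worth, the extension convention can be checked from Python on both Unix and Windows with the standard mimetypes module. This is only a sketch: the function name guess_text_by_extension is made up for illustration, and it looks at the file name only, never at the contents.

```python
import mimetypes

def guess_text_by_extension(filename):
    """Return True/False based on the extension, or None if it tells us nothing.

    Hypothetical helper: it trusts the file name alone.
    """
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        # No extension, or one the module does not know about.
        return None
    return mime_type.startswith("text/")
```

A file without an extension gives None, so a content-based check is still needed as a fallback. Textual formats registered under a non-text type (application/... for instance) come back as False.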

When I had a similar problem I read 1000 characters, counted the number of
<32 and >255 characters, and classified the file as "binary" when this quota
exceeded 20%. I have no idea whether it will work well with Chinese Unicode
files or some funny repositories or project files that store uncompressed text....
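
A minimal sketch of that counting heuristic, under a few assumptions that are not in the description above: the file is opened in binary mode so the same code behaves identically on Unix and Windows, tab, LF and CR are not counted as suspicious, and an empty file is treated as text. A single byte can never exceed 255, so only the "<32" part is counted here. The name looks_binary and its parameters are made up for illustration.

```python
def looks_binary(path, sample_size=1000, threshold=0.20):
    """Guess whether a file is binary from a sample of its first bytes."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    if not sample:
        return False  # assumption: an empty file counts as text

    # Assumption: tab (9), LF (10) and CR (13) are normal in text files,
    # so they are excluded from the count of control bytes below 32.
    allowed = {9, 10, 13}
    suspicious = sum(1 for byte in sample if byte < 32 and byte not in allowed)

    # Classify as binary when the share of suspicious bytes exceeds the quota.
    return suspicious / len(sample) > threshold
```

Usage would be something like looks_binary("test.py"). The Unicode worry is real for this version: UTF-16 text that is mostly ASCII has a NUL byte in every other position, so about half the sample is counted as suspicious and the file is classified as binary.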

Kindly
Michael P

"Sami Viitanen" <none at none.net> wrote in message
news:v7p_a.1558$k4.32814 at news2.nokia.com...
> Works well in Unix but I'm making a script that works on both
> Unix and Windows.
>
> Win doesn't have that 'file -bi' command.
>
> "bromden" <bromden at gazeta.pl.no.spam> wrote in message
> news:bhd559$ku9$1 at absinth.dialog.net.pl...
> > > How can I check if a file is binary or text?
> >
> >  >>> import os
> >  >>> f = os.popen('file -bi test.py', 'r')
> >  >>> f.read().startswith('text')
> > 1
> >
> > (btw, f.read() returns 'text/x-java; charset=us-ascii\n')
> >
> > --
> > bromden[at]gazeta.pl
> >
>
>





