Determine file type (binary or text)

Graham Fawcett fawcett at teksavvy.com
Wed Aug 13 14:28:43 EDT 2003


Trent Mick wrote:

>[Sami Viitanen wrote]
>
>>Hello,
>>
>>How can I check if a file is binary or text?
>>
>>There was some easy way but I forgot it..
>
>Generally I define a text file as "it has no null bytes". I think this
>is a pretty safe definition (I would be interested to hear practical
>experience to the contrary). 
>

Dangerous assumption. Even if many or most binary files contain NULs, it 
doesn't mean that they all do.

It is trivial to create a non-text file that has no NULs.

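    # Open for writing in binary mode; 'rb' is read-only and write() would fail.
    f = open('no_zeroes.bin', 'wb')
    for x in range(1, 256):
        # One byte per value 1..255, so the file never contains a NUL (0).
        f.write(bytearray([x]))
    f.close()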
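
To make the point concrete, here is a minimal sketch of the "no null bytes" test applied to the same bytes. The helper name looks_like_text is made up for illustration, the data is built in memory rather than read back from no_zeroes.bin, and the code assumes a Python that has the bytes type (Python 3).

```python
def looks_like_text(data):
    """Hypothetical helper: the 'no NUL bytes' heuristic quoted above."""
    return b'\x00' not in data

# Same content as no_zeroes.bin: every byte value from 1 to 255.
no_zeroes = bytes(range(1, 256))

# The heuristic accepts this as "text", although it is clearly not.
print(looks_like_text(no_zeroes))

# It only rejects data that happens to contain a NUL somewhere.
print(looks_like_text(b'\x89PNG\x00\x01'))
```

The first call reports "text" for data that includes control characters and every high-bit byte, which is exactly the false positive described above.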

Sami, I would suggest that you need to stop thinking in terms of tools, 
and instead think in terms of the problem you're trying to solve. Why do 
you need to (or think you need to) determine whether a file is "binary" 
or "text"? Why would your application fail if it received a 
(binary/text) file when it expected a (text/binary) one?

My guess is that the trait you are trying to identify will prove not to 
be "binary or text", but something more application-specific.
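
As one possible illustration only: if the real requirement turned out to be "my program must be able to decode this file in the encoding it expects", then that is the thing to test. The function name decodes_as and the choice of encodings below are assumptions for the sake of the sketch (Python 3), not something Sami has told us.

```python
def decodes_as(data, encoding='utf-8'):
    """Hypothetical check: can these bytes be decoded as `encoding`?"""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

# The NUL-free bytes from the earlier example: values 1 through 255.
sample = bytes(range(1, 256))

# Same bytes, different answers, depending on what the application expects.
print(decodes_as(sample, 'utf-8'))
print(decodes_as(sample, 'latin-1'))
```

The same data fails as UTF-8 and passes as Latin-1, so the answer depends on the application's expectations rather than on any general notion of "binary" versus "text".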

-- Graham

P.S. Sami, it's very bad form to "make up" an e-mail address, such as 
<none at none.net>. I'm sure the owners of the none.net domain would agree. 
Can't you provide a real address?