Detecting Binary content in files

Josh Dukes josh.dukes at microvu.com
Tue Mar 31 13:35:56 EDT 2009


or rather:

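#!/usr/bin/env python
import string

def isbin(filename):
   # A file counts as binary if any byte falls outside string.printable,
   # which already includes the whitespace characters.
   printable = set(bytearray(string.printable.encode('ascii')))
   with open(filename, 'rb') as fd:
      # bytearray yields integers on both Python 2 and Python 3
      for b in bytearray(fd.read()):
         if b not in printable:
            return True
   return False

for f in ['/bin/bash', '/etc/passwd']:
   print("%s is binary: %s" % (f, isbin(f)))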


Whatever... basically it's what everyone else said: every file is
binary, so it all depends on your definition of binary.
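
To show how much the answer depends on the definition, here is a minimal
sketch of a different heuristic. It is not taken from any of the posts in
this thread. It assumes "binary" means "the first few kilobytes contain a
NUL byte or are not valid UTF-8", so UTF-8 text is not flagged. The names
looks_binary and sample_size are hypothetical, and the 8192-byte sample is
an arbitrary choice.

```python
import codecs

def looks_binary(filename, sample_size=8192):
    """Guess whether a file is binary by inspecting its first bytes.

    Assumed definition (one of many possible): the sample contains a
    NUL byte, or the sample is not valid UTF-8.
    """
    with open(filename, 'rb') as fd:
        sample = fd.read(sample_size)

    # NUL bytes almost never appear in plain text files.
    if b'\x00' in sample:
        return True

    # If the sample filled the buffer, it may end in the middle of a
    # multi-byte character, so only treat it as final when the whole
    # file was read.
    decoder = codecs.getincrementaldecoder('utf-8')()
    try:
        decoder.decode(sample, final=len(sample) < sample_size)
    except UnicodeDecodeError:
        return True
    return False
```

Calling looks_binary(path) on an executable and on a plain text file
should usually give True and False respectively, but files in other
encodings (UTF-16, Latin-1) can still be misclassified under this
definition.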

On Tue, 31 Mar 2009 10:23:51 -0700
Josh Dukes <josh.dukes at microvu.com> wrote:

> s/if ord(b) > 127/if ord(b) > 127 or ord(b) < 32/
> 
> 
> On Tue, 31 Mar 2009 10:19:44 -0700
> Josh Dukes <josh.dukes at microvu.com> wrote:
> 
> > There might be another way but off the top of my head:
> > 
> > #!/usr/bin/env python
> > 
> > def isbin(filename):
> >    fd=open(filename,'rb')
> >    for b in fd.read():
> >        if ord(b) > 127:
> >            fd.close()
> >            return True
> >    fd.close()
> >    return False
> > 
> > for f in ['/bin/bash', '/etc/passwd']:
> >    print "%s is binary: " % f, isbin(f)
> > 
> > 
> > Of course this would detect unicode files as being binary and maybe
> > that's not what you want. How are you thinking about doing it in
> > perl exactly? 
> > 
> > 
> > On Tue, 31 Mar 2009 09:23:05 -0700 (PDT)
> > ritu <ritu_bhandari27 at yahoo.com> wrote:
> > 
> > > Hi,
> > > 
> > > I'm wondering if Python has a utility to detect binary content in
> > > files? Or if anyone has any ideas on how that can be
> > > accomplished? I haven't been able to find any useful information
> > > to accomplish this (my other option is to fire off a perl script
> > > from within my python script that will tell me whether the file is
> > > binary), so any pointers will be appreciated.
> > > 
> > > Thanks,
> > > Ritu
> > > --
> > > http://mail.python.org/mailman/listinfo/python-list
> > 
> > 
> 
> 


-- 

Josh Dukes
MicroVu IT Department


