Which is more Pythonic? (was: Detecting Binary content in files)

John Machin sjmachin at lexicon.net
Wed Apr 1 18:22:23 EDT 2009


On Apr 2, 2:10 am, John Posner <jjpos... at snet.net> wrote:
> Dennis Lee Bieber presented a code snippet with two consecutive statements
> that made me think, "I'd code this differently". So just for fun ... is
> Dennis's original statement or my "_alt" statement more idiomatically
> Pythonic? Are there even more Pythonic alternative codings?
>
>    mrkrs = [b for b in block
>      if b > 127
>        or b in [ "\r", "\n", "\t" ]       ]

I'd worry about "correct" before "Pythonic" ... see my responses to
Dennis in the original thread.

>
>    mrkrs_alt1 = filter(lambda b: b > 127 or b in [ "\r", "\n", "\t" ],
>                        block)
>    mrkrs_alt2 = filter(lambda b: b > 127 or b in list("\r\n\t"), block)

Try this on and see if it fits:

num_bin_chars = sum(b > "\x7f" or (b < "\x20" and b not in "\r\n\t")
                    for b in block)
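
For anyone trying this on Python 3, where iterating over a bytes object yields integers rather than one-character strings, the same count might be sketched roughly as below. The function name count_binary_chars is made up for illustration and does not come from the thread; the input is assumed to be a bytes object read from a file opened in binary mode.

```python
def count_binary_chars(block: bytes) -> int:
    """Hypothetical helper: count bytes that look like binary content.

    A byte counts if it is above 0x7f, or if it is a control byte
    (below 0x20) other than CR, LF or TAB.
    """
    # "and" binds tighter than "or"; the parentheses only make that explicit.
    # Each comparison yields a bool, and sum() adds the True values as 1.
    return sum(b > 0x7F or (b < 0x20 and b not in b"\r\n\t") for b in block)


print(count_binary_chars(b"plain text\r\n"))       # 0
print(count_binary_chars(b"\x00\x01abc\xff\t"))    # 3
```

The second call counts \x00, \x01 and \xff; the trailing tab is allowed, so it is not counted.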

> (Note: Dennis's statement converts a string into a list; mine does not.)

What is list("\r\n\t") doing, if it's not (needlessly) converting a
string into a list?
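
A quick check makes the point, as a small sketch rather than anything from the original snippets: list() applied to a string builds a new list of one-character strings, while the membership test works on the string directly. Note that "in" on a string is a substring test, so this only matches the list version because each b is a single character.

```python
as_list = list("\r\n\t")

# list() has converted the string into a list of single characters.
print(as_list == ["\r", "\n", "\t"])    # True

# The same membership test works on the string itself, with no list built.
print("\n" in "\r\n\t")                 # True
print("x" in "\r\n\t")                  # False
```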

> ---
>
>    binary = (float(len(mrkrs)) / len(block)) > 0.30
>
>    binary_alt = 1.0 * len(mrkrs) / len(block) > 0.30
>

num_bin_chars > 0.30 * len(block)

(no mucking about with float() or 1.0, and it doesn't blow up on a
zero-length block)
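
To illustrate the zero-length case, here is a minimal sketch comparing the two styles. The names looks_binary and looks_binary_by_division are invented for this example, and the 0.30 threshold is simply the one used in the quoted code.

```python
def looks_binary(num_bin_chars, block_len, threshold=0.30):
    # Multiplying instead of dividing: no float() needed, and an empty
    # block gives 0 > 0.0, which is simply False.
    return num_bin_chars > threshold * block_len


def looks_binary_by_division(num_bin_chars, block_len, threshold=0.30):
    # The division style from the quoted code; fails when block_len is 0.
    return float(num_bin_chars) / block_len > threshold


print(looks_binary(4, 10))    # True
print(looks_binary(0, 0))     # False

try:
    looks_binary_by_division(0, 0)
except ZeroDivisionError as exc:
    print("division version failed:", exc)
```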

Cheers,
John


