hex dump w/ or w/out utf-8 chars

Chris Angelico rosuav at gmail.com
Sun Jul 7 21:17:17 EDT 2013


On Mon, Jul 8, 2013 at 10:22 AM, blatt <ferdy.blatsco at gmail.com> wrote:
> Hi all,
> but a particular hello to Chris Angelino which with their critics and
> suggestions pushed me to make a full revision of my application on
> hex dump in presence of utf-8 chars.

Hiya! Glad to have been of assistance :)

> As I already told to Chris... critics are welcome!

No problem.

> # -*- coding: utf-8 -*-
> # px.py vers. 11 (pxb.py)   # python 2.6.6
> # hex-dump w/ or w/out utf-8 chars
> # Using spaces as separators, this script shows
> # (better than tabnanny)  uncorrect  indentations.
>
> # to save output > python pxb.py hex.txt > px9_out_hex.txt
>
> nLenN=3          # n. of digits for lines
>
> # chomp heaps and heaps of comments

Little nitpick, since you did invite criticism :) When I went to copy
and paste your code, I skipped all the comments and started at the
line of hashes... and then didn't have the nLenN definition. Posting
code to a forum like this is a huge invitation to try the code (it's
the very easiest way to know what it does), so I would recommend
having all your comments at the top, and all the code in a block
underneath. It'd be that bit easier for us to help you. Not a big
deal, though, I did figure out what was going on :)

>     sLineHex  =lF[n].encode('hex').replace('20','  ')

Here's the problem. Your hex string ends with "220a", and the
replace() method doesn't concern itself with the divisions between
bytes. It finds the "20" formed by the second 2 of 22 and the leading
0 of 0a, and replaces that too, even though it isn't a space byte.
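
To see the misalignment concretely, here's a small sketch. It assumes
Python 3, so binascii.hexlify stands in for .encode('hex'), and the
sample line is made up for illustration (it isn't from your file):

```python
import binascii

# Hypothetical sample line: 'a', space, double quote, newline
line = b'a "\n'                                   # bytes 61 20 22 0a

hex_nd = binascii.hexlify(line).decode('ascii')   # '6120220a'

# Naive replace: also matches the "20" that straddles bytes 22 and 0a
naive = hex_nd.replace('20', '  ')

print(repr(hex_nd))   # '6120220a'
print(repr(naive))    # '61  2  a'
```

The real space (byte 2) is blanked as intended, but the quote and the
newline get mangled into "2  a".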

I think the best solution may be to avoid the .encode('hex') part,
since it's not available in Python 3 anyway. Alternatively (if Py3
migration isn't a concern), you could do something like this:

    sLineHexND=lF[n].encode('hex')     # ND = no delimiter (space)
    sLineHex  =sLineHexND # No reason to redo the encoding
    twentypos=0
    while True:
        twentypos=sLineHex.find("20",twentypos)
        if twentypos==-1: break # We've reached the end of the string
        if not twentypos%2: # It's at an even-numbered position, replace it
            sLineHex=sLineHex[:twentypos]+'  '+sLineHex[twentypos+2:]
        twentypos+=1
    # then continue on as before
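
And for the first option (avoiding .encode('hex') altogether), a
minimal sketch would be to format one byte at a time, so a space can
only ever be matched on a byte boundary. The function name here is my
own invention, and going through bytearray is an assumption that
should let the same code run on both 2.6 and 3.x:

```python
def hex_with_blank_spaces(data):
    """Two hex digits per byte, with byte 0x20 shown as two spaces.

    hex_with_blank_spaces is a hypothetical helper name.
    bytearray() yields integers on both Python 2.6+ and Python 3.
    """
    return ''.join('  ' if b == 0x20 else '%02x' % b
                   for b in bytearray(data))

# Hypothetical sample line: 'a', space, double quote, newline
sLineHex = hex_with_blank_spaces(b'a "\n')
print(repr(sLineHex))         # '61  220a'

# The high/low nibble slicing then works unchanged
print(repr(sLineHex[::2]))    # '6 20'
print(repr(sLineHex[1::2]))   # '1 2a'
```

Since every byte always produces exactly two characters, the even/odd
slicing that follows stays aligned.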

>     sLineHexH =sLineHex[::2]
>     sLineHexL =sLineHex[1::2]
> [ code continues ]

Hope that helps!

ChrisA


