Hex editor display - can this be more pythonic?

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Sun Jul 29 15:53:55 EDT 2007


On Sun, 29 Jul 2007 12:24:56 -0700, CC wrote:

> ln = '\x00\x01\xFF 456\x0889abcde~'
> import sys
> for c in ln:
>      sys.stdout.write( '%.2X ' % ord(c) )
> 
> or this:
> 
> sys.stdout.write( ' '.join( ['%.2X' % ord(c) for c in ln] ) + '  ' )
> 
> Either of these produces the desired output:
> 
> 00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E
> 
> I find the former more readable and simpler.  The latter however has a 
> slight advantage in not putting a space at the end unless I really want 
> it.  But which is more pythonic?

I would use the second, with fewer spaces, a longer name than `ln`, and,
in recent Python versions, a generator expression instead of the list
comprehension:

sys.stdout.write(' '.join('%02X' % ord(c) for c in line))
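
As a rough, self-contained sketch of that idea, here it is wrapped in a function and run on the sample data from the question. The name `hex_bytes` is made up for illustration; `%02X` zero-pads each value to two hex digits, which the desired output requires.

```python
import sys


def hex_bytes(line):
    # Format every character as a two-digit uppercase hex value and
    # separate the values with single spaces (no trailing space).
    return ' '.join('%02X' % ord(c) for c in line)


line = '\x00\x01\xFF 456\x0889abcde~'
sys.stdout.write(hex_bytes(line) + '\n')
# 00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E
```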

> The next step consists of printing out the ASCII printable characters. 
> I have devised the following silliness:
> 
> printable = ' 1!2@3#4$5%6^7&8*9(0)aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ\
> `~-_=+\\|[{]};:\'",<.>/?'

I'd use `string.printable` and remove the "invisible" characters like '\n'
or '\t'.
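
One possible way to do that, as a sketch: start from `string.printable`, drop everything in `string.whitespace`, and put the plain space back. The names `VISIBLE` and `ascii_column` are hypothetical, not from the original code.

```python
import string

# Printable characters minus the whitespace ones ('\t', '\n', '\r',
# '\x0b', '\x0c'), with the ordinary space added back in.
VISIBLE = (set(string.printable) - set(string.whitespace)) | set(' ')


def ascii_column(line):
    # Keep visible characters, show everything else as a period.
    return ''.join(c if c in VISIBLE else '.' for c in line)


print(ascii_column('\x00\x01\xFF 456\x0889abcde~'))
# ... 456.89abcde~
```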

> for c in ln:
>      if c in printable: sys.stdout.write(c)
>      else: sys.stdout.write('.')
> 
> print
> 
> Which when following the list comprehension based code above, produces 
> the desired output:
> 
> 00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E  ... 456.89abcde~
> 
> I had considered using the .translate() method of strings, however this 
> would require a larger translation table than my printable string.

The translation table can be created once and should be faster.
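
A sketch of the table approach, assuming the data only contains characters with ordinals below 256. The table is a 256-character string built once at import time and indexed by ordinal; `TABLE` and `ascii_column` are illustrative names. The printable range is taken to be 0x20 through 0x7E here, which is an assumption and not something the original code spells out.

```python
# Built once: position i holds the character to display for ordinal i.
TABLE = ''.join(chr(i) if 32 <= i < 127 else '.' for i in range(256))


def ascii_column(line):
    # translate() looks every character up in TABLE by its ordinal.
    return line.translate(TABLE)


print(ascii_column('\x00\x01\xFF 456\x0889abcde~'))
# ... 456.89abcde~
```

No speed claim is made here; timing both variants on real data is the only way to know whether the table is worth it.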

> I'd like to display the non-printable characters differently, since they 
> can't be distinguished from genuine period '.' characters.  Thus, I may 
> use ANSI escape sequences like:
> 
> for c in ln:
>      if c in printable: sys.stdout.write(c)
>      else:
>          sys.stdout.write('\x1B[31m.')
>          sys.stdout.write('\x1B[0m')
> 
> print

`re.sub()` might be an option here.
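
For illustration, a minimal sketch of how that could look: one compiled pattern matches every character outside 0x20 to 0x7E and replaces it with a red period followed by a reset. The names `NON_PRINTABLE` and `highlight` are invented for this example, and the escape sequences are the ones from the quoted code.

```python
import re

# Any single character outside the printable ASCII range.
NON_PRINTABLE = re.compile(r'[^\x20-\x7e]')

RED_DOT = '\x1b[31m.\x1b[0m'


def highlight(line):
    # Genuine periods pass through untouched; only substituted
    # characters get the colour codes.
    return NON_PRINTABLE.sub(RED_DOT, line)


print(highlight('\x00\x01\xFF 456\x0889abcde~'))
```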

> I'm also toying with the idea of showing hex bytes together with their
> ASCII representations, since I've often found it a chore to figure out
> which hex byte to change if I wanted to edit a certain ASCII char. Thus,
> I might display data something like this:
> 
> 00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b)
> 63(c) 64(d) 65(e) 7E(~)
> 
> Where printing chars are shown in parentheses, characters with Python
> escape sequences will be shown as their escapes in parens, while
> non-printing chars with no escapes will be shown with nothing in parens.

For escaping:

In [90]: '\n'.encode('string-escape')
Out[90]: '\\n'
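
The 'string-escape' codec only exists in Python 2. As a hedged sketch of the annotated display for Python 3, the 'unicode_escape' codec can stand in for it. Note that it only produces short escapes such as \n, \t and \r, so \0 and \b from the example above come out as \x00 and \x08 and are shown with empty parentheses here. The function name `annotate` is made up.

```python
def annotate(line):
    parts = []
    for c in line:
        if ' ' <= c <= '~':
            # Printable ASCII: show the character itself.
            label = c
        else:
            escaped = c.encode('unicode_escape').decode('ascii')
            # Keep two-character escapes like \n; drop \xNN forms.
            label = escaped if len(escaped) == 2 else ''
        parts.append('%02X(%s)' % (ord(c), label))
    return ' '.join(parts)


print(annotate('\x00\n4~'))
# 00() 0A(\n) 34(4) 7E(~)
```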

Ciao,
	Marc 'BlackJack' Rintsch


