Hex editor display - can this be more pythonic?

CC crobc at BOGUS.sbcglobal.net
Sun Jul 29 15:24:56 EDT 2007


Hi:

I'm building a hex line editor as a first real Python programming exercise.

Yesterday I posted about how to print the hex bytes of a string.  There 
are two decent options:

ln = '\x00\x01\xFF 456\x0889abcde~'
import sys
for c in ln:
     sys.stdout.write( '%.2X ' % ord(c) )

or this:

sys.stdout.write( ' '.join( ['%.2X' % ord(c) for c in ln] ) + '  ' )

Either of these produces the desired output:

00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E

I find the former more readable and simpler.  The latter however has a 
slight advantage in not putting a space at the end unless I really want 
it.  But which is more pythonic?
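
A minimal sketch of the join-based variant wrapped in a function, so the
caller decides whether to append trailing spaces.  The name hex_bytes is
only illustrative, and a generator expression stands in for the list
comprehension:

```python
def hex_bytes(data):
    # One two-digit uppercase hex value per character, separated by
    # single spaces, with no trailing space.
    return ' '.join('%.2X' % ord(c) for c in data)

ln = '\x00\x01\xFF 456\x0889abcde~'
print(hex_bytes(ln))
```

Returning the string instead of writing it directly keeps the formatting
separate from the output, which may be handy once the ASCII column is
added.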

The next step consists of printing out the ASCII printable characters. 
I have devised the following silliness:

printable = ' 1!2@3#4$5%6^7&8*9(0)aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ\
`~-_=+\\|[{]};:\'",<.>/?'
for c in ln:
     if c in printable: sys.stdout.write(c)
     else: sys.stdout.write('.')

print

When this follows the list-comprehension-based code above, it produces
the desired output:

00 01 FF 20 34 35 36 08 38 39 61 62 63 64 65 7E  ... 456.89abcde~

I had considered using the .translate() method of strings, however this 
would require a larger translation table than my printable string.  I 
was also using the .find() method of the printable string before 
realizing I could use 'in' here as well.
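
As far as I can tell, the printable string covers exactly the ASCII codes
0x20 through 0x7E, so a range test on ord(c) should select the same
characters without needing the string at all.  A sketch under that
assumption (ascii_column and its placeholder argument are made-up names):

```python
def ascii_column(data, placeholder='.'):
    # Codes 0x20..0x7E (32..126) are treated as printable ASCII;
    # anything else is replaced by the placeholder character.
    out = []
    for c in data:
        if 32 <= ord(c) <= 126:
            out.append(c)
        else:
            out.append(placeholder)
    return ''.join(out)

ln = '\x00\x01\xFF 456\x0889abcde~'
print(ascii_column(ln))
```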

I'd like to display the non-printable characters differently, since they 
can't be distinguished from genuine period '.' characters.  Thus, I may 
use ANSI escape sequences like:

for c in ln:
     if c in printable: sys.stdout.write(c)
     else:
         sys.stdout.write('\x1B[31m.')
         sys.stdout.write('\x1B[0m')

print
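
The same idea as a function that builds the whole column as a string.
This is only a sketch: it assumes the terminal honors ANSI SGR codes, and
the names RED, RESET and ascii_column_ansi are illustrative:

```python
RED = '\x1B[31m'    # ANSI SGR sequence: red foreground
RESET = '\x1B[0m'   # ANSI SGR sequence: reset all attributes

def ascii_column_ansi(data):
    # Printable ASCII passes through unchanged; every other character
    # becomes a red '.', so it cannot be mistaken for a real period.
    out = []
    for c in data:
        if 32 <= ord(c) <= 126:
            out.append(c)
        else:
            out.append(RED + '.' + RESET)
    return ''.join(out)

print(ascii_column_ansi('\x00\x01\xFF 456\x0889abcde~'))
```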


I'm also toying with the idea of showing hex bytes together with their 
ASCII representations, since I've often found it a chore to figure out 
which hex byte to change if I wanted to edit a certain ASCII char. 
Thus, I might display data something like this:

00(\0) 01() FF() 20( ) 34(4) 35(5) 36(6) 08(\b) 38(8) 39(9) 61(a) 62(b) 
63(c) 64(d) 65(e) 7E(~)

Here printing chars are shown in parentheses, characters with Python 
escape sequences are shown as their escapes, and non-printing chars 
with no escape are shown with empty parentheses.
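
A rough sketch of how that combined display might be produced.  The
ESCAPES table and the function names are assumptions of mine; the table
lists only the single-character escapes I know of for control codes:

```python
# Hypothetical table: control characters that have a short Python escape.
ESCAPES = {
    '\0': '\\0', '\a': '\\a', '\b': '\\b', '\t': '\\t',
    '\n': '\\n', '\v': '\\v', '\f': '\\f', '\r': '\\r',
}

def annotate(c):
    # Return 'HH(x)', where x is the character itself if printable,
    # its escape if it has one, or nothing otherwise.
    code = ord(c)
    if 32 <= code <= 126:
        label = c
    else:
        label = ESCAPES.get(c, '')
    return '%.2X(%s)' % (code, label)

def annotated_line(data):
    # Join the per-character annotations with single spaces.
    return ' '.join(annotate(c) for c in data)

print(annotated_line('\x00\x01\xFF 456\x0889abcde~'))
```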

Or perhaps a two-line output with offset addresses under the data.  So 
many possibilities!


Thanks for input!




-- 
_____________________
Christopher R. Carlen
crobc at bogus-remove-me.sbcglobal.net
SuSE 9.1 Linux 2.6.5


