[Python-Dev] Readability of hex strings (Was: Use of coding cookie in 3.x stdlib)

Mon Jul 26 20:42:07 CEST 2010

[+Python-ideas -Python-Dev]

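import binascii

def h(s):
    # Strip all whitespace, then decode the remaining hex digits.
    # Passing a str here needs Python 3.3+ (or 2.x); older 3.x wants bytes.
    return binascii.unhexlify("".join(s.split()))

h("DE AD BE EF CA FE BA BE")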
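
A possible variation, assuming Python 3 only: the built-in bytes.fromhex()
already skips the spaces between byte pairs, so the helper can shrink to a
one-liner. The reverse direction (bytes back to an "XX XX XX" dump) is
sketched as well. The names from_hex and to_hex are made up for this
example and are not part of any library.

```python
def from_hex(s):
    # bytes.fromhex ignores the spaces separating the two-digit pairs.
    return bytes.fromhex(s)

def to_hex(data):
    # Iterating over bytes yields ints in Python 3; format each as two
    # uppercase hex digits and join with single spaces.
    return " ".join("{:02X}".format(b) for b in data)

data = from_hex("DE AD BE EF CA FE BA BE")
print(data)          # b'\xde\xad\xbe\xef\xca\xfe\xba\xbe'
print(to_hex(data))  # DE AD BE EF CA FE BA BE
```

Round-tripping through to_hex gives back text that can be pasted into
from_hex again, which is roughly what the proposed h"XX XX XX" literal is
after, only spelled as a function call.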

-- Alexandre

On Mon, Jul 26, 2010 at 11:29 AM, anatoly techtonik <techtonik at gmail.com> wrote:
> I find "\xXX\xXX\xXX\xXX..." notation for binary data totally
> unreadable. Everybody who uses and analyses binary data is more
> familiar with plain hex dumps in the form of "XX XX XX XX...".
>
> I wonder if it is possible to introduce an effective binary string
> type that will be represented as h"XX XX XX" in language syntax? It
> will be much easier to analyze printed binary data and copy/paste such
> data as-is from hex editors/views.
>
> On Mon, Jul 19, 2010 at 9:45 AM, Guido van Rossum <guido at python.org> wrote:
>> Sounds like a good idea to try to remove redundant cookies *and* to
>> remove most occasional use of non-ASCII characters outside comments
>> (except for unittests specifically trying to test Unicode features).
>> Personally I would use \xXX escapes instead of spelling out the
>> characters in shlex.py, for example.
>>
>> With or without the coding cookies, many ways of displaying text
>> files garble characters outside the ASCII range, so it's better to
>> stick to ASCII as much as possible.
>>
>> --Guido
>>
>> On Mon, Jul 19, 2010 at 1:21 AM, Alexander Belopolsky
>> <alexander.belopolsky at gmail.com> wrote:
>>> I was looking at the inspect module and noticed that its source
>>> starts with "# -*- coding: iso-8859-1 -*-". I have checked and there
>>> are no non-ascii characters in the file. There are several other
>>> modules that still use the cookie:
>>>
>>> Lib/ast.py:# -*- coding: utf-8 -*-
>>> Lib/getopt.py:# -*- coding: utf-8 -*-
>>> Lib/inspect.py:# -*- coding: iso-8859-1 -*-
>>> Lib/pydoc.py:# -*- coding: latin-1 -*-
>>> Lib/shlex.py:# -*- coding: iso-8859-1 -*-
>>> Lib/encodings/punycode.py:# -*- coding: utf-8 -*-
>>> Lib/msilib/__init__.py:# -*- coding: utf-8 -*-
>>> Lib/sqlite3/__init__.py:#-*- coding: ISO-8859-1 -*-
>>> Lib/sqlite3/dbapi2.py:#-*- coding: ISO-8859-1 -*-
>>> Lib/test/bad_coding.py:# -*- coding: uft-8 -*-
>>> Lib/test/badsyntax_3131.py:# -*- coding: utf-8 -*-
>>>
>>> I understand that coding: utf-8 is strictly redundant in 3.x. There
>>> are cases such as Lib/shlex.py where using an encoding other than utf-8
>>> is justified. (See
>>> http://svn.python.org/view?view=rev&revision=82560). What are the
>>> guidelines for other cases? Should redundant cookies be removed?
>>> Since not all editors respect the -*- cookie, I think the answer
>>> should be "yes", particularly when the cookie sets an encoding other
>>> than utf-8.
>>> _______________________________________________
>>> Python-Dev mailing list
>>> Python-Dev at python.org
>>> http://mail.python.org/mailman/listinfo/python-dev
>>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>>>
>>
>>
>>
>> --
>> --Guido van Rossum (python.org/~guido)