[Tutor] Puzzled again

Wed Aug 3 19:48:09 CEST 2011

On Wed, Aug 3, 2011 at 10:11, Peter Otten <__peter__ at web.de> wrote:
> Richard D. Moores wrote:
>
>> I wrote before that I had pasted the function (convertPath()) from my
>> initial post into mycalc.py because I had accidentally deleted it from
>> mycalc.py. And that there was no problem importing it from mycalc.
>> Well, I was mistaken (for a reason too tedious to go into). There WAS
>> a problem, the same one as before.
>
> Dave was close, but Steven hit the nail on the head: the string
> r"C:\Users\Dick\..." is fine on its own, but inside the docstring it is no
> longer a raw string nested in another string; it is just a sequence of
> characters that is part of the outer string. As such, \U marks the start of
> an escape that specifies a Unicode code point with eight hex digits:
>
>>>> "\U00000041"
> 'A'
>
> Because "sers\Dic", the eight characters following the \U in your docstring,
> is not a valid hexadecimal number, you get an error message.
>
> The solution is standard procedure: escape the backslash or use a raw string:
>
> Wrong:
>
>>>> """yadda r"C:\Users\Dick\..." yadda"""
>  File "<stdin>", line 1
> SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in
> position 10-12: truncated \UXXXXXXXX escape
>
> Correct:
>
>>>> """yadda r"C:\\Users\Dick\..." yadda"""
> 'yadda r"C:\\Users\\Dick\\..." yadda'
>
> Also correct:
>
>>>> r"""yadda r"C:\Users\Dick\..." yadda"""
> 'yadda r"C:\\Users\\Dick\\..." yadda'
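
For anyone who wants to try the three variants without breaking a module at import time, here is a minimal sketch. It hands each literal to compile() as source text. The helper try_literal() and the variable names are made up for illustration and are not from Peter's post. The escaped variant doubles every backslash, since recent Python versions also warn about unrecognized escapes such as \D.

```python
# Each value below is Python *source text* for one string literal. They are
# written as raw strings here so the backslashes reach compile() untouched.
bad_source = r'''"""yadda r"C:\Users\Dick\..." yadda"""'''
escaped_source = r'''"""yadda r"C:\\Users\\Dick\\..." yadda"""'''
raw_source = r'''r"""yadda r"C:\Users\Dick\..." yadda"""'''


def try_literal(source):
    """Compile and evaluate one literal; return its value or the SyntaxError."""
    try:
        return eval(compile(source, "<example>", "eval"))
    except SyntaxError as exc:
        return exc


bad_result = try_literal(bad_source)          # a SyntaxError instance
escaped_value = try_literal(escaped_source)   # a str
raw_value = try_literal(raw_source)           # a str

print(type(bad_result).__name__, bad_result)
print(escaped_value)
print(raw_value)

# A well-formed \U escape needs exactly eight hex digits.
capital_a = "\U00000041"
print(capital_a)
```

The escaped and the raw variant should evaluate to the same text, while the first one fails before any code runs.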

Here's from my last post:

====================================
Now I edit it back to its original problem form:

def convertPath(path):
   """
   Given a path with backslashes, return that path with forward slashes.

   By Steven D'Aprano  07/31/2011 on Tutor list
   >>> path = r'C:\Users\Dick\Desktop\Documents\Notes\College Notes.rtf'
   >>> convertPath(path)
   'C:/Users/Dick/Desktop/Documents/Notes/College Notes.rtf'
   """
   import os.path
   separator = os.path.sep
   if separator != '/':
       path = path.replace(os.path.sep, '/')
   return path

and get

C:\Windows\System32>python
Python 3.2.1 (default, Jul 10 2011, 20:02:51) [MSC v.1500 64 bit
(AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mycalc2
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "C:\Python32\lib\site-packages\mycalc2.py", line 10
   """
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 144-146: truncated \UXXXXXXXX escape

Using HxD, I find that the bytes at offsets 144-146 are 20, 54, 75, or the
<space>, 'T', 'u' of " Tutor". A screenshot of HxD with this
version of mycalc2.py open in it is at
<http://www.rcblue.com/images/HxD.jpg>. You can see that I believe the
offset integers are base-10 ints. I do hope that's correct, or I've
done a lot of work for naught.
====================================

So have I used HxD incorrectly (this is my first time using a hex viewer)?
If I have used it correctly, why do the reported problem offsets of
144-146 correspond to such innocuous characters as <space>, 'T' and 'u',
which also come BEFORE the problems you and Steven point out?
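
One possible explanation, offered as a sketch rather than a confirmed answer from the thread: the positions in the 'unicodeescape' message appear to count from the start of the string literal being decoded, not from the start of the file. That would match the one-line example quoted above, where position 10 is the backslash counted from the first 'y' of "yadda". The module text and all names below are hypothetical stand-ins, not the real mycalc2.py.

```python
import re

# Hypothetical stand-in for a docstring: harmless text first, then a
# Windows path whose "\U" begins an (invalid) Unicode escape.
docstring_body = (
    "\n    By someone on a list\n    >>> path = r'C:\\Users\\Dick'\n    "
)

# A small module whose docstring does not start at the top of the file.
module_source = (
    "# filler comment so the docstring sits further down the file\n"
    "def f(path):\n"
    '    """' + docstring_body + '"""\n'
    "    return path\n"
)

# Where the backslash of "\U" sits, counted two different ways.
offset_in_literal = docstring_body.index("\\U")
offset_in_file = module_source.index("\\U")

try:
    compile(module_source, "<example>", "exec")
    error = None
except SyntaxError as exc:
    error = exc

# Pull the first number out of "position N-M", if the message has that form.
match = re.search(r"position (\d+)-(\d+)", str(error)) if error else None
reported_start = int(match.group(1)) if match else None

print("offset within the literal:", offset_in_literal)
print("offset within the file:   ", offset_in_file)
print("start reported by Python: ", reported_start)
```

If the reported start agrees with the offset within the literal, then looking up 144-146 as file offsets in a hex viewer would land on unrelated bytes earlier in the file.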

Dick