Python-list Digest, Vol 36, Issue 33

David J Birnbaum djbpitt+python at pitt.edu
Sun Sep 3 10:17:19 EDT 2006


Dear John (cc python-list),
>> You may find (1) that the file has formfeeds in
>> it or (2) it has r"\f" in it and you were mistaken about the
>> interpretation or (3) something else.
>>     
> ...
>   
>> Thank you for the quick response. Ultimately I need to remap the "f" in
>> "\f" to something else, so I worked around the problem by doing the
>> remapping first, and I'm now getting the desired result.
>>     
>
> Please reply on-list.
>
> How could you read the file to remap an "f" if you were getting '\0x0C'
> when you tried to read it? Are we to assume that it was case (2) i.e.
> not a Python problem?
>   
Possibly more than anyone else on-list cares to see, but it was case 
(3): I had misdiagnosed the input. The match was failing because I was 
reading the line improperly (when I remapped the "f" I was ... er ... 
inexplicably surprised when I couldn't find it, although it turned out 
to be there when I looked for the remapped value instead of for the 
original "f"). When I tried to troubleshoot it in an interpreter window, 
I misread the results, which is what prompted my inquiry on the list. 
Here's the interpreter diagnosis:

 >>> string1 = "blah \fR40\fC blah"
 >>> string1
'blah \x0cR40\x0cC blah'
 >>> string2 = "blah \\fR40\\fC blah"
 >>> string2
'blah \\fR40\\fC blah'
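
The same comparison can be spelled out as a short script. This is only a sketch: the length and membership checks are additions for illustration and were not part of the original interpreter session.

```python
# "\f" in an ordinary string literal is one character, the form feed (0x0C).
# "\\f" is two characters: a backslash followed by the letter "f".
string1 = "blah \fR40\fC blah"
string2 = "blah \\fR40\\fC blah"

print(len(string1))        # 16: each "\f" counts as one character
print(len(string2))        # 18: each "\\f" counts as two characters
print("\x0c" in string1)   # True: string1 contains form feeds
print("\\" in string2)     # True: string2 contains literal backslashes

# A raw string literal keeps the backslash, so it equals string2.
print(r"blah \fR40\fC blah" == string2)  # True
```

Text read from a file behaves like string2: the backslash and the "f" arrive as two separate characters, because escape processing applies only to string literals in source code.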

If I create a file that consists of:

<?xml version="1.0" encoding="UTF-8"?>
<test>
    <line>Hi, there</line>
    <line>blah \fR40\fC blah</line>
    <line>Hi, there</line>
</test>

And then a Python script that reads:

import re

# Match a literal backslash, the letter "f", and the one character after it.
nePat = re.compile(r"\\f.")
infile = open("test1.xml", "r")
for line in infile:
    print(line)
    print(nePat.sub("TEST", line))
infile.close()
   
the relevant line comes out as:

<line>blah TEST40TEST blah</line>

which is what I want. That is, the script was, indeed, reading the 
character string correctly, as you suggested, and the substitution that 
I ran into during my test in the interpreter window was a red herring. 
Thanks again for the advice to look more closely.
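
For anyone who wants to reproduce the red herring without creating a file, here is a minimal sketch. The names ne_pat, from_file and typed_literal are made up for illustration; from_file stands in for the line as it would be read from test1.xml.

```python
import re

# Literal backslash, the letter "f", then any one character.
ne_pat = re.compile(r"\\f.")

# What reading the line from the file yields: a real backslash plus "f".
from_file = "    <line>blah \\fR40\\fC blah</line>"

# What typing "\f" into the interpreter yields: a form feed, no backslash.
typed_literal = "    <line>blah \fR40\fC blah</line>"

print(ne_pat.sub("TEST", from_file))
# The pattern finds no backslash in typed_literal, so nothing is replaced.
print(ne_pat.sub("TEST", typed_literal) == typed_literal)
```

The first call replaces both "\fR" and "\fC", while the second leaves its input unchanged, which is why the test in the interpreter window appeared to fail.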

Best,

David


