unicode .replace not working - why?

Peter Otten __peter__ at web.de
Sun Oct 12 04:00:26 EDT 2008


Kurt Peters wrote:

> I had done that about 21 revisions ago.  

If you litter your module with code that is commented out, it is hard to keep
track of what works and what doesn't.

> Nevertheless, why would you think 
> that would work, when the code as shown doesn't?

Because he knows Python? Why don't /you/ try it before asking that question?

A good place to do "exploratory" programming is Python's interactive
interpreter. Here's a sample session:

Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:43)
[GCC 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyPdf import PdfFileReader as PFR
>>> doc = PFR(open("SUA.pdf"))
>>> text = doc.getPage(3).extractText()
>>> type(text)
<type 'unicode'>
>>> text[:200]
u'2/16/08                7400.8P Table of Contents - Continued  Section Page  \xa773.49  New Hampshire (NH) 50 \xa773.50  New Jersey (NJ) 50 \xa773.51  New Mexico (NM) 51 \xa773.52  New York (NY) 56 \xa773.53  North '
>>> print text[:200].replace(u"\xa7", u"\n")
2/16/08                7400.8P Table of Contents - Continued  Section Page
73.49  New Hampshire (NH) 50
73.50  New Jersey (NJ) 50
73.51  New Mexico (NM) 51
73.52  New York (NY) 56
73.53  North
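
The session above passes the result of replace() straight to print. A frequent
reason for "replace not working" is that the return value gets thrown away.
The failing code is not quoted in this message, so that cause is only an
assumption. The following is a minimal sketch of it, with a hard-coded sample
string standing in for the pyPdf output and illustrative names (unchanged,
cleaned, lines) that do not appear in the thread:

```python
# Hard-coded stand-in for the extracted page text (an assumption, so the
# sketch runs without pyPdf or the PDF file).
text = u"Section Page  \xa773.49  New Hampshire (NH) 50 \xa773.50  New Jersey (NJ) 50"

# Strings are immutable: replace() returns a new string and leaves
# the original untouched. Calling it without using the result does nothing.
text.replace(u"\xa7", u"\n")
unchanged = u"\xa7" in text  # the section sign is still in text

# Bind the result to a name to keep it.
cleaned = text.replace(u"\xa7", u"\n")
lines = cleaned.split(u"\n")
```

Both arguments are unicode literals here, matching the type of the text that
extractText() returned in the session.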

Peter
