Help with character encodings

Gary Herron gherron at islandtraining.com
Tue May 20 11:28:41 EDT 2008


A_H wrote:
> Help!
>
> I've scraped a PDF file for text and all the minus signs come back as
> u'\xad'.
>
> Is there any easy way I can change them all to plain old ASCII '-' ???
>
> str.replace complained about a missing codec.
>
>
>
> Hints?
>   

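Encoding it as 'latin1' seems to work, at least for printing, though the result is the single byte '\xad' (a soft hyphen in Latin-1) rather than a plain ASCII '-':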

  >>> print u'\xad'.encode('latin1')
  -
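
If a real ASCII '-' is what you need, one option is to do the replacement on the unicode string itself, with both arguments as unicode literals. This is only a minimal sketch: the helper name soft_hyphens_to_ascii and the sample text are made up for illustration, and I'm guessing the codec complaint came from mixing a byte string with a unicode string in the replace call.

```python
# U+00AD is SOFT HYPHEN, the character the PDF scrape returned.
SOFT_HYPHEN = u'\xad'


def soft_hyphens_to_ascii(text):
    """Return a copy of the unicode string `text` with every U+00AD
    replaced by a plain ASCII hyphen-minus.

    Hypothetical helper, not part of any library.
    """
    # Both arguments are unicode, so no implicit decoding is attempted.
    return text.replace(SOFT_HYPHEN, u'-')


# Made-up sample standing in for the scraped PDF text.
scraped = u'x = \xad42'
cleaned = soft_hyphens_to_ascii(scraped)

# The result now contains only ASCII, so this encode cannot fail.
ascii_bytes = cleaned.encode('ascii')
print(cleaned)
```

Note that this turns every soft hyphen into '-', including any that the PDF used as real hyphenation hints rather than as minus signs, so it is worth checking a sample of the output.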







