Why isn't my re.sub replacing the contents of my MS Word file?

Dave Angel davea at davea.name
Mon May 12 17:15:45 EDT 2014


On 05/12/2014 01:35 PM, scottcabit at gmail.com wrote:
> On Friday, May 9, 2014 8:12:57 PM UTC-4, Steven D'Aprano wrote:
>
>> Good:
>>
>>      # Untested
>>      fStr = re.sub(b'&#x(201[2-5])|(2E3[AB])|(00[2A]D)', b'-', fStr)
>
>    Still doesn't work.
>
>    Guess whatever the code is for endash and mdash are not the ones I am using....
>

More likely, your MSWord document isn't a simple text file.  Some
file formats and encodings don't resemble ASCII or Unicode text in
the least.
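
One way to check is to look at the first few bytes of the file before
running any regex over it.  The sketch below is untested against a real
Word file and rests on some assumptions: that a legacy .doc starts with
the OLE compound-file signature, that a .docx is a ZIP archive whose
main text lives in word/document.xml, and that this part is UTF-8.  The
function names (sniff_word_format, docx_body_xml, dashes_to_hyphens) are
made up for illustration; they are not part of any library.

```python
import io
import re
import zipfile

# Assumed signatures: OLE compound file (legacy .doc) and ZIP (.docx).
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"
ZIP_MAGIC = b"PK\x03\x04"

# U+2012..U+2015: figure dash, en dash, em dash, horizontal bar.
DASHES = re.compile("[\u2012-\u2015]")


def sniff_word_format(data):
    """Guess the container format from the leading bytes of the file."""
    if data.startswith(OLE_MAGIC):
        return "doc"
    if data.startswith(ZIP_MAGIC):
        return "docx"
    return "unknown"


def docx_body_xml(data):
    """Return the main document part of a .docx as text (assumes UTF-8)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read("word/document.xml").decode("utf-8")


def dashes_to_hyphens(text):
    """Replace the dash characters above with a plain ASCII hyphen."""
    return DASHES.sub("-", text)
```

If the file sniffs as "docx", the substitution has to run on the decoded
XML inside the archive, not on the raw bytes of the .docx itself, which
are compressed.  If it sniffs as "doc", the text sits inside a binary
container, and the dash is stored in whatever encoding Word chose: for
example an en dash is b'\x13\x20' in UTF-16-LE and b'\x96' in cp1252,
neither of which a pattern like b'&#x2013' will ever match.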

-- 
DaveA


