[Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)

Matej Cepl mcepl at redhat.com
Mon May 6 00:01:56 CEST 2013


----- Original Message -----
> From: "Armin Rigo" <arigo at tunes.org>
> To: "Matej Cepl" <mcepl at redhat.com>
> Cc: python-dev at python.org
> Sent: Saturday, May 4, 2013 11:59:42 AM
> Subject: Re: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
> 
> Hi Matej,
> 
> On Thu, Mar 7, 2013 at 11:08 AM, Matej Cepl <mcepl at redhat.com> wrote:
> >          if c is not ' ' and c is not '  ':
> >             if c != ' ' and c != ' ':
> 
> Sorry for the delay in answering, but I just noticed what is wrong in
> this "fix": it compares c with the same single-character ' ' twice,
> whereas the original compared it with ' ' and with the two-character '

Comments on https://github.com/mcepl/html2text/commit/f511f3c78e60d7734d677f8945580f52ef7ef742#L0R765 (perhaps in https://github.com/aaronsw/html2text/pull/77) are more than welcome. When using

SPACE_RE = re.compile(r'\s+')

for checking, the whole onlywhite function is no longer needed (though I still wonder what Aaron meant when he wrote it). It is odd that line.isspace() doesn't work, though.
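
A minimal sketch of the regex-based check, assuming the goal is to detect a line made up only of whitespace. The helper name is_blank and the LITERAL_PLUS_RE constant are hypothetical and are not taken from html2text. Note that Python's re module treats \+ as a literal plus sign (unlike some other regex dialects), so \s+ is the form that matches a run of whitespace:

```python
import re

# Assumed intent: one or more whitespace characters.
SPACE_RE = re.compile(r'\s+')

# For contrast only: in Python's re syntax, \+ is a literal "+",
# so this pattern means "one whitespace character followed by a plus sign".
LITERAL_PLUS_RE = re.compile(r'\s\+')


def is_blank(line):
    """Hypothetical helper: True if line is non-empty and all whitespace."""
    return SPACE_RE.fullmatch(line) is not None


if __name__ == "__main__":
    print(is_blank("   \t"))                                # True
    print(is_blank("  x "))                                 # False
    print(LITERAL_PLUS_RE.fullmatch(" +") is not None)      # True
    print(LITERAL_PLUS_RE.fullmatch("   ") is not None)     # False
```

Pattern.fullmatch requires Python 3.4 or later; on older versions, something like re.match(r'\s+\Z', line) should behave similarly.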

Best,

Matěj

