[Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)

Victor Stinner victor.stinner at gmail.com
Thu Mar 7 14:34:25 CET 2013


You should try to write a simple test that reproduces the issue without
using your library (just copy/paste the relevant code). If you can do
that, please file an issue on bugs.python.org.
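
A standalone test could look roughly like the sketch below. It is only
an illustration under stated assumptions: onlywhite_eq and
identity_report are hypothetical names, and the body of onlywhite_eq is
adapted from the function quoted further down, not taken from html2text
itself. The sketch reports, for each character, whether it is equal to
' ' and whether it is the same object as ' '. The second answer is an
implementation detail and may change between interpreters and versions,
so nothing should depend on it.

```python
import sys


def onlywhite_eq(line):
    """Equality-based variant of the quoted onlywhite(), for comparison only."""
    for c in line:
        # A single character never equals the two-space string, so this
        # condition reduces to "c is not a space".
        if c != ' ' and c != '  ':
            return c == ' '
    return line


def identity_report(text):
    """Return (char, equal, identical) for every character of text.

    'equal' uses ==, which is well defined. 'identical' uses 'is', whose
    result depends on whether the interpreter happens to reuse one
    object for one-character strings.
    """
    space = ' '
    return [(c, c == space, c is space) for c in text]


if __name__ == "__main__":
    print(sys.version)
    # Build the sample at runtime so it is not a compile-time constant.
    sample = "".join([" ", "a", " "])
    for row in identity_report(sample):
        print(row)
    print(repr(onlywhite_eq("   ")), repr(onlywhite_eq(" a")))
```

Running the same file under each Python version and comparing the
printed rows shows whether the identity column is what differs.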

Victor

2013/3/7 Matej Cepl <mcepl at redhat.com>:
> On 2013-03-06, 18:34 GMT, Victor Stinner wrote:
>> In short, Unicode was rewritten in Python 3.3 for PEP 393. It's
>> not surprising that minor details like singletons differ. You should
>> not use "is" to compare strings in Python, or your program will fail
>> on other Python implementations (like PyPy, IronPython, or Jython) or
>> even on a different CPython version.
>
> I am sorry, I don't understand what you are saying. Even though
> this has been changed to
> https://github.com/mcepl/html2text/blob/fix_tests/html2text.py#L90
> the tests still fail.
>
> But, Amaury is right: the function doesn't make much sense.
> However, ...
>
> when I have “fixed it” from
> https://github.com/mcepl/html2text/blob/master/html2text.py#L95
>
> def onlywhite(line):
>      """Return true if the line does only consist of whitespace characters."""
>      for c in line:
>          if c is not ' ' and c is not '  ':
>              return c is ' '
>      return line
>
> to
> https://github.com/mcepl/html2text/blob/fix_tests/html2text.py#L90
>
> def onlywhite(line):
>      """Return true if the line does only consist of whitespace
>      characters."""
>      for c in line:
>          if c != ' ' and c != ' ':
>              return c == ' '
>      return line
>
> tests on ALL versions of Python are suddenly failing ...
> https://travis-ci.org/mcepl/html2text/builds/5288190
>
> Curiouser and curiouser! At least I seem to have found the point
> where things are breaking, but I have to admit that the condition
> really doesn’t make any sense to me.
>
>> Anyway, you spotted a missed optimization: it's now "fixed" in
>> Python 3.3 and 3.4 by the following commits.
>
> Well, whatever the problem is, it is not fixed in Python 3.3.0
> (as you can see in
> https://travis-ci.org/mcepl/html2text/builds/4969045, and as I can
> see on my computer). Actually, the good news is that it seems to be
> fixed in the master branch of CPython (or the tip, as they say in
> the Mercurial world).
>
> Any thoughts?
>
> Matěj

