Problem with -3 switch

Carl Banks pavlovevidence at gmail.com
Mon Jan 12 19:00:19 EST 2009


On Jan 12, 5:03 pm, John Machin <sjmac... at lexicon.net> wrote:
> On Jan 13, 6:12 am, Christian Heimes <li... at cheimes.de> wrote:
>
> > > I say again, show me a case of working 2.5 code where prepending u to
> > > an ASCII string constant that is intended to be used in a text context
> > > is actually worth the keystrokes.
>
> > Eventually you'll learn it the hard way. *sigh*
>
> And the hard way involves fire and brimstone, together with weeping,
> wailing and gnashing of teeth, correct? Hmmm, let's see. Let's take
> Carl's example of the sinner who didn't decode the input: """ Someone
> found that urllib.open() returns a bytes object in Python 3.0, which
> messed him up since in 2.x he was running regexp searches on the
> output.  If he had been taking care to use only unicode objects in 2.x
> (in this case, by explicitly decoding the output) then it wouldn't
> have been an issue. """
>
> 3.0 says:
> | >>> re.search("foo", b"barfooble")
> | Traceback (most recent call last):
> |   File "<stdin>", line 1, in <module>
> |   File "C:\python30\lib\re.py", line 157, in search
> |     return _compile(pattern, flags).search(string)
> | TypeError: can't use a string pattern on a bytes-like object
>
> The problem is diagnosed at the point of occurrence with a 99%-OK
> exception message. Why only 99%? Because it only vaguely hints at this
> possibility:
> | >>> re.search(b"foo", b"barfooble")
> | <_sre.SRE_Match object at 0x00FACD78>
>
> Obvious solution (repent and decode):
> | >>> re.search("foo", b"barfooble".decode('ascii'))
> | <_sre.SRE_Match object at 0x00FD86B0>
>
> This is "messed him up"?

If you believe in waiting for bugs to occur and then fixing them,
rather than programming to avoid bugs, there is no helping you.

P.S. The "obvious" solution is wrong, although I'm not sure whether you
were making some kind of ironic point.
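
The message does not spell out why, so the following is only one plausible
reading: a hard-coded 'ascii' codec works for the toy input but fails as soon
as the fetched bytes contain anything outside ASCII. A minimal sketch under
that assumption, for Python 3; `search_text`, `payload`, and the default
"utf-8" encoding are hypothetical names and choices made for illustration, not
anything from the thread:

```python
import re


def search_text(pattern, raw, encoding="utf-8"):
    """Decode raw bytes with an explicitly chosen encoding, then search as text.

    The default encoding is an assumption for this sketch; real code should
    use whatever encoding the data source actually declares.
    """
    text = raw.decode(encoding)
    return re.search(pattern, text)


# Hypothetical payload: the same test string plus one non-ASCII character.
payload = "caf\u00e9 barfooble".encode("utf-8")

# Decoding as ASCII fails on this payload instead of producing text.
try:
    payload.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the encoding the bytes were written in lets the search run.
match = search_text("foo", payload)
```

The design point is that the decode happens once, at the boundary where the
bytes arrive, with an encoding chosen deliberately rather than guessed at the
point of failure.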


Carl Banks


