Problem with -3 switch

Mon Jan 12 06:26:58 EST 2009

On Jan 12, 7:29 pm, Carl Banks <pavlovevide... at gmail.com> wrote:
> On Jan 12, 12:32 am, John Machin <sjmac... at lexicon.net> wrote:
>
> > On Jan 12, 12:23 pm, Carl Banks <pavlovevide... at gmail.com> wrote:
>
> > > On Jan 9, 6:11 pm, John Machin <sjmac... at lexicon.net> wrote:
>
> > > > On Jan 10, 6:58 am, Carl Banks <pavlovevide... at gmail.com> wrote:

> > > > > I expect that it'd be a PITA in some cases to use the transitional
> > > > > dialect (like getting all your Us in place), but that doesn't mean the
> > > > > language is crippled.
>
> > > > What is this "transitional dialect"? What does "getting all your Us in
> > > > place" mean?
>
> > > Transitional dialect is the subset of Python 2.6 that can be
> > > translated to Python 3 with the 2to3 tool.
>
> > I'd never seen it called "transitional dialect" before.
>
> I had hoped the context would make it clear what I was talking about.

In vain.

>
> > >  Getting all your Us in place
> > > refers to prepending a u to strings to make them unicode objects,
> > > which is something 2to3 users are highly advised to do to keep hassles
> > > to a minimum.  (Getting Bs in place would be a good idea too.)
>
> > Ummm ... I'm not understanding something. 2to3 changes u"foo" to
> > "foo", doesn't it? What's the point of going through the code and
> > changing all non-binary "foo" to u"foo" only so that 2to3 can rip the
> > u off again?
>
> It does a bit more than that.

Like what?

>
> > What hassles? Who's doing the highly-advising where and
> > with what supporting argument?
>
> You add the u so that the constant will be the same data type in 2.6 as
> it becomes in 3.0 after applying 2to3.  str and unicode objects don't
> always mix smoothly with each other, and you have a much better chance
> of getting the same behavior in 2.6 and 3.0 if you use an actual
> unicode string in both.

(1) Why specifically 2.6? Do you mean 2.X, or is this related to the
"port to 2.6 first" theory?
(2) We do assume we are starting off with working 2.X code, don't we?
If we change "foo" to u"foo" and get a different answer from the 2.X
code, is that still "working"?

>
> An example of this, though not with string constants,

And therefore irrelevant.

I would like to hear from someone who has actually started with
working 2.x code and changed all their text-like "foo" to
u"foo" [except maybe unlikely suspects like open()'s mode arg]:
* how many places where the 2.x code broke and so did the 3.x code
[i.e. the problem would have been detected without prepending u]
* how many places where the 2.x code broke but the 3.x code didn't
[i.e. prepending u did find the problem]
* whether they thought it was worth the effort

In the meantime I would be interested to hear from anybody with a
made-up example of code where the problem would be detected (sooner |
better | only) by prepending u to text-like string constants.
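
One made-up candidate, offered as a sketch rather than a report from real
porting work: a text-like constant containing a non-ASCII character. The
sketch is written for Python 3 so that both behaviours can be shown side by
side. It assumes the 2.x source file is saved as UTF-8, so that an
unprefixed literal holds the encoded bytes; the helper names are
hypothetical and exist only for this illustration.

```python
# Sketch only. A 2.x "..." literal is modelled as UTF-8 bytes (assumption:
# UTF-8 source file); a 2.x u"..." literal and any 3.x literal are text.
TEXT = "caf\u00e9"  # "cafe" with an acute accent on the e


def as_py2_plain_literal(text):
    """Hypothetical helper: the bytes a 2.x unprefixed literal would hold."""
    return text.encode("utf-8")


def as_py2_u_literal(text):
    """Hypothetical helper: a 2.x u-prefixed literal is already text."""
    return text


plain = as_py2_plain_literal(TEXT)
prefixed = as_py2_u_literal(TEXT)

print(len(plain))     # 5: the accented character occupies two bytes
print(len(prefixed))  # 4: same answer as the 3.x literal
print(plain[:4])      # b'caf\xc3': the slice cuts the character in half
```

Under those assumptions, code that measures or slices the constant gives a
different answer with and without the u, and only the prefixed form agrees
with what 3.x will do. Whether such constants are common enough to justify
touching every literal is exactly the open question.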

> 2to3 can only do so
> much; it can't always guess whether your string usage is supposed to
> be character or binary.

AFAICT it *always* guesses text rather than binary; do you have any
examples where it guesses binary (rightly or wrongly)?
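
For reference, a minimal Python 3 sketch of why the guess matters. It does
not run 2to3; it only shows the 3.x result of an unprefixed literal (text)
next to a b-prefixed one (bytes). The function name is hypothetical.

```python
# Sketch only: the 3.x types that string literals end up with.
plain = "GET"    # an unprefixed literal is text (str) in 3.x
marked = b"GET"  # a literal marked with b is binary (bytes) in 3.x


def mixes_cleanly(a, b):
    """Hypothetical helper: True if a + b works, False on TypeError."""
    try:
        a + b
    except TypeError:
        return False
    return True


print(type(plain).__name__)           # str
print(type(marked).__name__)          # bytes
print(plain == marked)                # False: text never equals bytes
print(mixes_cleanly(plain, marked))   # False: 3.x refuses to concatenate
```

So if a literal that was really binary data is left unprefixed, it becomes
text in 3.x and stops comparing equal to, or concatenating with, bytes read
from a file or socket.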