[Python-3000] setup.py fails in the py3k-struni branch

Guido van Rossum guido at python.org
Fri Jun 15 01:57:28 CEST 2007


On 6/13/07, Ron Adam <rrr at ronadam.com> wrote:
> Well I can see where a str8() type with an __incoded_with__ attribute could
> be useful.  It would use a bit more memory, but it won't be the
> default/primary string type anymore so maybe it's ok.
>
> Then bytes can be bytes, and unicode can be unicode, and str8 can be
> encoded strings for interfacing with the outside non-unicode world.  Or
> something like that. <shrug>

Hm... Requiring each str8 instance to have an encoding might be a
problem -- it means you can't just create one from a bytes object.
What would be the use of this information? What would happen on
concatenation? On slicing? (Slicing can break the encoding!)
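
A rough sketch of the slicing and concatenation concern, written against
the present-day Python 3 bytes type rather than the str8 type under
discussion (so it is only an analogy). The helper name is_valid_utf8 and
the sample strings are made up for illustration:

```python
# Illustrative only: plain bytes stand in for "encoded strings" here.

def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

text = "caf\u00e9"                # 'cafe' with an accented e, 4 characters
encoded = text.encode("utf-8")    # 5 bytes: the accented e takes two bytes

# Slicing by byte offset can cut a multi-byte sequence in half.
head = encoded[:4]
slice_is_valid = is_valid_utf8(head)

# Concatenating data in two different encodings yields something that
# is no longer valid in the encoding the result claims to carry.
mixed = "\u00e9".encode("latin-1") + "\u00e9".encode("utf-8")
concat_is_valid = is_valid_utf8(mixed)

print(slice_is_valid, concat_is_valid)
```

Both checks come out false, so a per-instance encoding tag would stop
describing the data after either operation.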

> Attached both the str8 repr as s"..." and s'...', and the latest
> no_raw_escape patch which I think is complete now and should apply with no
> problems.

I like the str8 repr patch enough to check it in.

> I tracked the random fails I am having in test_tokenize.py down to it doing
> a round trip on random test_*.py files.  If one of those files has a
> problem it causes test_tokenize.py to fail also.  So I added a line to the
> test to output the file name it does the round trip on so those can be
> fixed as they are found.
>
> Let me know if it needs to be adjusted or something doesn't look right.

Well, I'm still philosophically uneasy with r'\' being a valid string
literal, for various reasons (one being that writing a string parser
becomes harder and harder). I definitely want r'\u1234' to be a
6-character string, however. Do you have a patch that does just that?
(We can argue over the rest later in a larger forum.)
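
For concreteness, a small check of the two raw-string cases, run against
present-day CPython 3 rather than any patch from this thread. The
variable names are made up, and the lone-backslash result reflects how
current Python behaves, not a decision taken here:

```python
# In a raw literal, \u is not an escape: backslash, 'u', and four digits
# are kept as six separate characters.
raw = r'\u1234'
cooked = '\u1234'   # non-raw: a single code point, U+1234

print(len(raw), len(cooked))

# A raw literal consisting of a lone backslash: compile the source text
# r'\' and see whether the parser accepts it.
try:
    compile("r'\\'", "<example>", "eval")
    lone_backslash_ok = True
except SyntaxError:
    lone_backslash_ok = False

print(lone_backslash_ok)
```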

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

