unicode string literals and "u" prefix

nico nicolas.riesch at genevoise.ch
Tue Nov 9 10:41:18 EST 2004


Thank you very much for your answer.

I understand better now.
Nevertheless, this whole unicode issue is quite confusing for beginners
(I started learning Python two months ago...).
And it seems that I am not the only one in this situation.
In fact, I just came across this discussion from April 2003, "[Zope3-dev]
i18n, unicode, and the underline":
http://mail.zope.org/pipermail/zope3-dev/2003-April/006410.html

Since I work for an insurance company, most of our data contains French
accented characters.
So we are condemned to work almost exclusively with unicode strings.
In fact, it is hard to find examples where plain ascii strings would
be useful in our case.
Even the data we retrieve from databases is returned to us as unicode
strings.

That's why I tried to find a way to get rid of all those "u" prefixes
instead of systematically putting one in front of each unicode string
literal, which is somewhat "noisy".
It is also because I am afraid that sooner or later someone will forget
the "u" prefix, and the error will only be detected at a much later
stage, or too late.
A way of making all string literals unicode by default would have been a
relief.
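
One way to catch a forgotten prefix earlier might be to normalise
strings at the point where they enter the program. This is only a
sketch under stated assumptions: `ensure_text` is a hypothetical helper
(it is not part of Python or of any library), the interpreter is
assumed to accept both the u"" and b"" prefixes (Python 2.6+ or 3.3+),
and the byte strings are assumed to be UTF-8 encoded.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper, not part of Python or of any library.
# Assumes an interpreter that accepts both u"" and b"" literals.

text_type = type(u"")    # unicode on Python 2, str on Python 3
bytes_type = type(b"")   # str on Python 2, bytes on Python 3


def ensure_text(value, encoding="utf-8"):
    """Return value as a unicode string, decoding byte strings explicitly."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, bytes_type):
        # Raises UnicodeDecodeError here, at the boundary, if the bytes
        # do not match the assumed encoding.
        return value.decode(encoding)
    raise TypeError("expected a string, got %s" % type(value).__name__)


# A unicode literal and its UTF-8 encoded bytes end up as the same text.
name = ensure_text(u"élément")
same = ensure_text(u"élément".encode("utf-8"))
assert name == same
```

With something like this, a byte string that cannot be decoded fails
where the data comes in, not somewhere deep inside the application.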

It would be good if we could just write a declaration at the beginning
of the source file like
  # strings_are_unicode_by_default
We would then write unicode strings without the "u" prefix, like this:
  s="élément"
and if we really needed plain ascii strings, we could explicitly
prefix them with "a", for instance s=a"my plain ascii string".
Thus everybody would be happy, and there would be no impact on
all the code and libraries already written.
But there must be issues I am not aware of, I suppose...

I think you have the same problem when you write strings in German.
But if it is no problem for you to prefix your strings with "u", as
in:
  s=u"Vielen Dank für Ihre Antwort"
then we can live with it too, for the next twenty years.

Sometimes I feel like an ethnic minority when I see in a
well-known book about Python that "Because Unicode is a relatively
advanced and rarely used tool, we will omit further details in this
introductory text."
Working in a language with accented characters is definitely bad
luck.

Freundliche Grüsse

Nicolas Riesch


