[issue29755] python3 gettext.lgettext sometimes returns bytes, not string

Serhiy Storchaka report at bugs.python.org
Sun Jun 18 07:35:52 EDT 2017


Serhiy Storchaka added the comment:

In Python 2, both gettext() and lgettext() are meant to return 8-bit strings. The only difference between them is that gettext() encodes the translation back to the encoding of the translation file if the output encoding is not explicitly specified, while lgettext() encodes it to the preferred locale encoding. ugettext() returns Unicode strings.

In Python 3, ugettext() is renamed to gettext() and always returns Unicode strings. lgettext() should return a byte string, as in Python 2. The problem is that if the translation is not found, the untranslated message is usually returned, and that is a Unicode string in Python 3. It should be encoded to a byte string, so that lgettext() always returns the same type -- bytes.
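To illustrate the type mismatch, here is a toy model only. It is not the gettext module's code and not the change in the PR; the class ToyTranslations, its dict catalog, and both method names are hypothetical and exist just to show the two return paths.

```python
import locale


class ToyTranslations:
    """Hypothetical stand-in for a translations object (not the stdlib class)."""

    def __init__(self, catalog):
        # Maps untranslated message (str) -> translated message (str).
        self._catalog = catalog

    def lgettext_buggy(self, message, encoding=None):
        # Models the reported behaviour: bytes on a hit, str on a miss.
        encoding = encoding or locale.getpreferredencoding(False)
        if message in self._catalog:
            return self._catalog[message].encode(encoding)
        return message  # untranslated message leaks out as str

    def lgettext_fixed(self, message, encoding=None):
        # Models the intended behaviour: the fallback is encoded too,
        # so the return type is always bytes.
        encoding = encoding or locale.getpreferredencoding(False)
        return self._catalog.get(message, message).encode(encoding)


toy = ToyTranslations({"Hello": "Bonjour"})
hit = toy.lgettext_buggy("Hello", "utf-8")    # bytes
miss = toy.lgettext_buggy("Bye", "utf-8")     # str
fixed = toy.lgettext_fixed("Bye", "utf-8")    # bytes
```

A caller of the buggy variant cannot safely concatenate or write the result without first checking its type, which is the inconsistency described above.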

PR 2266 fixes lgettext() and related functions, updates the documentation, and adds tests.

Frankly, the usefulness of lgettext() in Python 3 looks questionable to me. gettext() can be used instead, explicitly encoding the result to the desired charset.
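A minimal sketch of that alternative, assuming the caller wants bytes in the preferred locale encoding unless told otherwise. The helper name gettext_bytes is hypothetical; gettext.NullTranslations is used only so that the example runs without a message catalog, which means every lookup takes the "translation not found" path.

```python
import gettext
import locale

# NullTranslations has no catalog, so gettext() returns the message unchanged.
translations = gettext.NullTranslations()


def gettext_bytes(message, encoding=None):
    """Hypothetical helper: translate to str, then always encode to bytes."""
    if encoding is None:
        # Assumption: fall back to the preferred locale encoding.
        encoding = locale.getpreferredencoding(False)
    return translations.gettext(message).encode(encoding)


text = translations.gettext("Hello")     # str, even without a translation
data = gettext_bytes("Hello", "utf-8")   # bytes in the requested charset
```

Because the encoding step happens outside the translation lookup, the return type does not depend on whether a translation was found.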

----------
nosy: +barry
stage:  -> patch review

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue29755>
_______________________________________

