[Python-bugs-list] Re: [Bug #119755] sys.getdefaultencoding undocumented

M.-A. Lemburg mal@lemburg.com
Mon, 30 Oct 2000 17:42:00 +0100


Alex Martelli wrote:
> 
> > Summary: sys.getdefaultencoding undocumented
> >
> > Details: I can't find any documentation of the getdefaultencoding function
> > of module sys (though I can guess what it does...), nor any explanation
> > about how to set/change the default encoding (or assertion that it cannot be
> > changed).
>     [snip]
> > Since the default encoding is 'ascii' unless one edits site.py or
> > sitecustomize.py to use something else, there is not much to document
> 
> But what should be placed in site.py or sitecustomize.py to
> use a different default-encoding rather than 'ascii'...?  I
> can't see that documented anywhere, either (myopia...?)

Here's what I wrote for the 2.0 docs (this will in some form
appear in the next docs release AFAIK):

"""
locale.py:

    getdefaultlocale(envvars=('LANGUAGE', 'LC_ALL', 'LC_CTYPE', 'LANG')) :
        Tries to determine the default locale settings and returns
        them as a tuple (language code, encoding).
        
        According to POSIX, a program which has not called
        setlocale(LC_ALL, "") runs using the portable 'C' locale.
        Calling setlocale(LC_ALL, "") lets it use the default locale as
        defined by the LANG variable. Since we don't want to interfere
        with the current locale setting we thus emulate the behavior
        in the way described above.
        
        To maintain compatibility with other platforms, not only the
        LANG variable is tested, but a list of variables given as
        envvars parameter. The first found to be defined will be
        used. envvars defaults to the search path used in GNU gettext;
        it must always contain the variable name 'LANG'.
        
        Except for the code 'C', the language code corresponds to RFC
        1766.  code and encoding can be None in case the values cannot
        be determined.

    [new in 2.0b1]

    getlocale(category=0) :
        Returns the current setting for the given locale category as
        a tuple (language code, encoding).
        
        category may be one of the LC_* values except LC_ALL. It
        defaults to LC_CTYPE.
        
        Except for the code 'C', the language code corresponds to RFC
        1766.  code and encoding can be None in case the values cannot
        be determined.

    [new in 2.0b1]

    normalize(localename) :
        Returns a normalized locale code for the given locale
        name.
        
        The returned locale code is formatted for use with
        setlocale().
        
        If normalization fails, the original name is returned
        unchanged.
        
        If the given encoding is not known, the function defaults to
        the default encoding for the locale code just like setlocale()
        does.

    [new in 2.0b1]

    resetlocale(category=6) :
        Sets the locale for category to the default setting.
        
        The default setting is determined by calling
        getdefaultlocale(). category defaults to LC_ALL.

    [new in 2.0b1]

    setlocale(category, locale=None) :
        Set the locale for the given category.  The locale can be
        a string, a locale tuple (language code, encoding), or None.
        
        Locale tuples are converted to strings using the locale aliasing
        engine.  Locale strings are passed directly to the C lib.
        
        category may be given as one of the LC_* values.

    [changed in 2.0b1 to also accept a tuple as input to locale]

These APIs can be put to use in the site.py file. It already
contains template code which sets the default encoding depending
on the current default locale.

New in 2.0b1 are also the related sys module APIs:

        getdefaultencoding() -> string
        
        Return the current default string encoding used by the Unicode 
        implementation.

        setdefaultencoding(encoding)

        Set the current default string encoding used by the Unicode
        implementation. Only available in site.py.
"""

> > here ;-)
> >
> > Anyway, documentation was already written and will show up with
> > the next documentation release, I guess.... Fred ?
> 
> I, personally, can wait, but there's some guy over in c.l.p
> who's dying to get latin1 as the default, it seems.

He will only have to uncomment the relevant code in site.py:

"""
# Set the string encoding used by the Unicode implementation.  The
# default is 'ascii', but if you're willing to experiment, you can
# change this.

encoding = "ascii" # Default value set by _PyUnicode_Init()

if 0:
    # Enable to support locale aware default string encodings.
    import locale
    loc = locale.getdefaultlocale()
    if loc[1]:
        encoding = loc[1]

if 0:
    # Enable to switch off string to Unicode coercion and implicit
    # Unicode to string conversion.
    encoding = "undefined"

if encoding != "ascii":
    sys.setdefaultencoding(encoding)
"""

Feel free to forward the docs to c.l.p. It's really not that
hard to use .setdefaultencoding() et al. :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/