codec for UTF-8 with BOM

Peter Otten __peter__ at web.de
Mon May 2 07:42:55 EDT 2011


Ulrich Eckhardt wrote:

> Chris Rebert wrote:
>>> 3. The docs mention encodings.utf_8_sig, available since 2.5, but I
>>> can't locate that thing there either. What's going on here?
>> 
>> Works for me™:
>> Python 2.6.6 (r266:84292, Jan 12 2011, 13:35:00)
>> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
>> Type "help", "copyright", "credits" or "license" for more information.
>>>>> from encodings import utf_8_sig
>>>>>
> 
> This works for me, too. What I tried and what failed was
> 
>   import encodings
>   encodings.utf_8_sig
> 
> which raises an AttributeError or dir(encodings), which doesn't show the
> according element. If I do it your way, the encoding then shows up in the
> content of the module.
> 
> Apart from the encoding issue, I don't understand this behaviour. Is the
> module behaving badly or are my expectations simply flawed?

This is standard Python package behaviour; a submodule becomes an attribute 
of its package only once something has imported it:

>>> import logging
>>> logging.handlers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'handlers'
>>> import logging.handlers
>>> logging.handlers
<module 'logging.handlers' from '/usr/lib/python2.6/logging/handlers.pyc'>

The AttributeError would go away only if encodings/__init__.py contained 
a line like

from . import utf_8_sig

or similar. The best-known example of such an eager import is probably os 
(strictly a module rather than a package), which imports a suitable path 
module for the platform and makes it available as os.path.
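
To make the difference concrete, here is a minimal sketch that builds a 
throwaway package on disk and checks whether the submodule is visible as an 
attribute before and after an explicit import. The names demo_pkg, sub and 
submodule_visible are made up for illustration and are not part of any real 
library; the code assumes Python 3.

```python
import importlib
import os
import sys
import tempfile


def submodule_visible(eager):
    """Return (visible_before, visible_after) for a hypothetical package.

    eager=True writes 'from . import sub' into __init__.py, mimicking a
    package that imports its submodule itself; eager=False leaves
    __init__.py empty, as encodings does for utf_8_sig.
    """
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "demo_pkg")
        os.mkdir(pkg_dir)
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write("from . import sub\n" if eager else "")
        with open(os.path.join(pkg_dir, "sub.py"), "w") as f:
            f.write("VALUE = 42\n")

        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module("demo_pkg")
            # Is the submodule already bound as an attribute of the package?
            before = hasattr(pkg, "sub")
            # The explicit import binds it (if it was not bound already).
            importlib.import_module("demo_pkg.sub")
            after = hasattr(pkg, "sub")
            return before, after
        finally:
            # Clean up so repeated calls start from a fresh state.
            sys.path.remove(root)
            sys.modules.pop("demo_pkg.sub", None)
            sys.modules.pop("demo_pkg", None)


lazy_result = submodule_visible(eager=False)
eager_result = submodule_visible(eager=True)
```

With the empty __init__.py the attribute only shows up after the explicit 
import; with the eager variant it is there right after importing the package.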

As you cannot foresee which encodings a script will actually need, it 
makes sense to omit a just-in-case import.
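
In practice you rarely need to import the codec module yourself, because 
looking the codec up by name imports it on demand. A small sketch, assuming 
Python 3 and using only the standard codecs module (the variable names are 
just for illustration):

```python
import codecs
import sys

# Looking the codec up by name makes the encodings package import
# encodings.utf_8_sig on demand; no explicit import is needed.
info = codecs.lookup("utf-8-sig")
module_loaded = "encodings.utf_8_sig" in sys.modules

# Decoding with utf-8-sig strips a leading BOM if one is present ...
raw = codecs.BOM_UTF8 + b"hello"
text = raw.decode("utf-8-sig")

# ... and encoding with it writes the BOM in front of the data.
encoded = "hello".encode("utf-8-sig")
```

Here text ends up without the BOM, while encoded starts with 
codecs.BOM_UTF8.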


