[New-bugs-announce] [issue15462] UTF8 BOM incorrectly prepended to syslog messages when using rsyslog

Aimon Bustardo report at bugs.python.org
Thu Jul 26 23:06:21 CEST 2012


New submission from Aimon Bustardo <abustardo at morphlabs.com>:

Ubuntu 12.04 LTS 64bit
python2.7-minimal 2.7.3-0ubuntu3
rsyslog 5.8.6-1ubuntu8

Python converts all syslog messages to UTF-8 before sending them to syslog. It also prepends the Unicode Byte Order Mark (BOM). The prepended BOM shows up as garbage characters when using rsyslog (I have not verified this with standard syslog or syslog-ng).
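
To make the symptom concrete, here is a minimal sketch of what the prepended BOM looks like on the wire and how it renders when the receiver does not treat the bytes as UTF-8. The message text is a made-up placeholder, and the variable names are mine, not anything from logging/handlers.py.

```python
import codecs

# Hypothetical message text, standing in for a formatted log record.
text = u"2012-07-25 13:36:03 INFO example message"

# Roughly what the handler does for unicode messages:
# encode to UTF-8, then prepend the UTF-8 BOM.
payload = codecs.BOM_UTF8 + text.encode("utf-8")

# The first three bytes of the payload are the BOM.
bom_hex = " ".join("%02X" % b for b in bytearray(payload[:3]))

# A reader that assumes ISO 8859-1 turns the BOM into three visible characters.
as_latin1 = payload.decode("iso-8859-1")

# A reader that decodes UTF-8 and discards a leading BOM recovers the text.
as_utf8 = payload.decode("utf-8-sig")
```

In this sketch the three leading bytes are EF BB BF, and decoding them as ISO 8859-1 yields the characters 'ï»¿' in front of the message.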

Example log line:

Jul 25 13:36:03 mc ï»¿2012-07-25 13:36:03 INFO nova.api.openstack.wsgi [req-48a555a5-6d2a-4a38-8384-3b4684357e72 19f932a5b0b34655989f4cb761522bb3 2617e657fdf84569a6be7977318e46c8] http://MASKED:8774/v1.1/2617e657fdf84569a6be7977318e46c8/os-hosts/MASKED.json?ignore_awful_caching1343248563 returned with HTTP 200

Note the 'ï»¿' before the second date field (the start of the message body).

An interesting explanation of the issue, found on another site:

"Yes, "ï»¿" is the Byte Order Mark (BOM) of the Unicode Standard. Specifically it is the hex bytes EF BB BF, which form the UTF-8 representation of the BOM, misinterpreted as ISO 8859/1 text instead of UTF-8."

If I patch the code in /usr/lib/python2.7/logging/handlers.py:
------------------------------------------
@@ -797,9 +797,10 @@
                                             self.mapPriority(record.levelname))
         # Message is a string. Convert to bytes as required by RFC 5424
         if type(msg) is unicode:
             msg = msg.encode('utf-8')
-            if codecs:
-                msg = codecs.BOM_UTF8 + msg
+            #if codecs:
+            #    msg = codecs.BOM_UTF8 + msg
         msg = prio + msg
         try:
             if self.unixsocket:

----------------------------------------
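
For comparison, here is a rough standalone sketch of the region the patch touches, showing the payload with and without the BOM. `build_payload` and its `prepend_bom` flag are hypothetical names used only for illustration; the real logic lives inside SysLogHandler.emit(), which also appends a NUL terminator that is omitted here. The facility and level (LOG_USER, LOG_INFO) are assumed values.

```python
import codecs
from logging.handlers import SysLogHandler


def build_payload(text, prepend_bom):
    """Hypothetical helper mirroring the patched lines; not the stdlib API."""
    # <PRI> is (facility << 3) | severity, the same arithmetic the handler's
    # encodePriority() performs. LOG_USER and LOG_INFO are assumed here.
    pri = (SysLogHandler.LOG_USER << 3) | SysLogHandler.LOG_INFO
    prio = ("<%d>" % pri).encode("ascii")
    msg = text.encode("utf-8")
    if prepend_bom:
        # Unpatched behaviour: BOM goes between the priority and the text.
        msg = codecs.BOM_UTF8 + msg
    return prio + msg


unpatched = build_payload(u"hello", prepend_bom=True)
patched = build_payload(u"hello", prepend_bom=False)
```

The only difference between the two payloads is the three BOM bytes that sit right after the priority field, which is where rsyslog shows the stray characters.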

With this patch the logs appear normally. What is the purpose of the 'codecs' condition? Can this behavior be controlled through configuration? Or is this a bug in rsyslog?

Related tickets:

https://bugs.launchpad.net/openstack-common/+bug/1029116
https://bugs.launchpad.net/ubuntu/+source/python2.7/+bug/1029640
http://bugzilla.adiscon.com/show_bug.cgi?id=346

----------
components: IO, Library (Lib), Unicode
messages: 166520
nosy: Aimon.Bustardo, ezio.melotti
priority: normal
severity: normal
status: open
title: UTF8 BOM incorrectly prepended to syslog messages when using rsyslog
type: behavior
versions: Python 2.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15462>
_______________________________________

