encoding name mappings in codecs.py with email/charset.py

Stefanos Karasavvidis sk at isc.tuc.gr
Mon Dec 15 06:27:09 EST 2014


I played around with changing the names in the aliases.py and locale.py
files (from iso8859 to iso-8859), but this broke mailman.

I ended up changing the charset.py file instead:

            try:
                input_charset = codecs.lookup(input_charset).name
            except LookupError:
                pass
        # Added: put the dash back into the names codecs.lookup() returns.
        if input_charset == 'iso8859-7':
            input_charset = 'iso-8859-7'
        if input_charset == 'iso8859-15':
            input_charset = 'iso-8859-15'
        if input_charset == 'iso8859-1':
            input_charset = 'iso-8859-1'


This seems to work for now.
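
The three special cases above could probably be folded into one helper that
covers every ISO-8859 part. A minimal sketch, assuming the only problem is the
missing dash in the names returned by codecs.lookup(); mime_charset_name is a
made-up name for illustration, not something that exists in the standard
library or in Mailman:

```python
import codecs
import re

# Matches the dashless names Python's codec registry reports, e.g. 'iso8859-7'.
_ISO8859_CODEC = re.compile(r'^iso8859-(\d+)$')


def mime_charset_name(name):
    """Return a dashed ISO-8859 name for use in mail headers.

    Hypothetical helper: names that are not ISO-8859 codecs are returned
    as the codec registry reports them, unknown names are only lowercased.
    """
    name = name.lower()
    try:
        name = codecs.lookup(name).name
    except LookupError:
        pass
    match = _ISO8859_CODEC.match(name)
    if match:
        return 'iso-8859-' + match.group(1)
    return name
```

Calling mime_charset_name(input_charset) in place of the chain of if
statements would avoid adding a new branch for each charset that turns up.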

I really wonder why I'm the only one with this problem. This should affect
all Mailman users with members on MS Exchange 2010 (at least) servers.
Exchange rejects such messages with:
CAT.InvalidContent.Exception: InvalidCharsetException, Character set name
(iso8859-7) is invalid or not installed.; cannot handle content of message
with...
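
To see which name actually ends up in the Content-Type header before a message
reaches Exchange, something like the following could be run on the Mailman
host. It is only a sketch: declared_charset is a hypothetical helper, and the
result depends on the Python version and on any local patches to
email/charset.py.

```python
from email.mime.text import MIMEText


def declared_charset(charset):
    """Return the charset parameter a new text/plain part would declare.

    Hypothetical diagnostic helper; it only builds a message in memory.
    """
    # Three Greek letters, all representable in ISO-8859-7.
    message = MIMEText(u'\u03b1\u03b2\u03b3', 'plain', charset)
    return message.get_param('charset')


print(declared_charset('iso-8859-7'))
```

If this prints the name without the dash, the messages Mailman generates on
that host will carry the name Exchange complains about.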

Thanks gst for the input

sk

On Sun, Dec 14, 2014 at 9:53 PM, gst <g.starck at gmail.com> wrote:
>
> On Sunday, December 14, 2014 at 14:10:22 UTC-5, Stefanos Karasavvidis
> wrote:
> > thanks for replying gst.
> >
> > I've already thought of patching the Charset class, but hoped for a
> > cleaner solution.
> >
> > This ALIASES dict already has all the iso names *with* a dash, so the
> > dash must get stripped somewhere else.
>
>
> Not on my side. Modifying this dict with the missing key-value pairs
> apparently also does what you want:
>
> Python 2.7.6 (default, Mar 22 2014, 22:59:56)
> [GCC 4.8.2] on linux2
> Type "copyright", "credits" or "license()" for more information.
> >>>
> >>> import email.charset
> >>> email.charset.ALIASES
> {'latin-8': 'iso-8859-14', 'latin-9': 'iso-8859-15', 'latin-2':
> 'iso-8859-2', 'latin-3': 'iso-8859-3', 'latin-1': 'iso-8859-1', 'latin-6':
> 'iso-8859-10', 'latin-7': 'iso-8859-13', 'latin-4': 'iso-8859-4',
> 'latin-5': 'iso-8859-9', 'euc_jp': 'euc-jp', 'latin-10': 'iso-8859-16',
> 'ascii': 'us-ascii', 'latin_10': 'iso-8859-16', 'latin_1': 'iso-8859-1',
> 'latin_2': 'iso-8859-2', 'latin_3': 'iso-8859-3', 'latin_4': 'iso-8859-4',
> 'latin_5': 'iso-8859-9', 'latin_6': 'iso-8859-10', 'latin_7':
> 'iso-8859-13', 'latin_8': 'iso-8859-14', 'latin_9': 'iso-8859-15', 'cp949':
> 'ks_c_5601-1987', 'euc_kr': 'euc-kr'}
> >>>
> >>> for i in range(1, 16):
>         c = 'iso-8859-' + str(i)
>         email.charset.ALIASES[c] = c
>
>
> >>>
> >>> iso7 = email.charset.Charset('iso-8859-7')
> >>> iso7
> iso-8859-7
> >>> str(iso7)
> 'iso-8859-7'
> >>>
>
> regards,
>
> gst.
>
> >
> > sk
> >
> >
> >
> > On Sun, Dec 14, 2014 at 7:21 PM, gst <g.st... at gmail.com> wrote:
> > On Friday, December 12, 2014 at 04:21:14 UTC-5, Stefanos Karasavvidis
> > wrote:
> >
> > > I've hit a wall with mailman which seems to be caused by Python's
> > > character encoding names.
> > >
> > > I've narrowed the problem down to the email/charset.py file. Basically
> > > the following happens:
> >
> > Hi,
> >
> > it's all in the email.charset.ALIASES dict.
> >
> > you could also simply patch the __str__ method of Charset:
> >
> > Python 2.7.6 (default, Mar 22 2014, 22:59:56)
> > [GCC 4.8.2] on linux2
> > Type "copyright", "credits" or "license()" for more information.
> > >>>
> > >>> import email.charset
> > >>>
> > >>> c = email.charset.Charset('iso-8859-7')
> > >>> str(c)
> > 'iso8859-7'
> > >>>
> > >>> old = email.charset.Charset.__str__
> > >>>
> > >>> def patched(self):
> >         r = old(self)
> >         if r.startswith('iso'):
> >                 return 'iso-' + r[3:]
> >         return r
> >
> > >>>
> > >>> email.charset.Charset.__str__ = patched
> > >>>
> > >>> str(c)
> > 'iso-8859-7'
> > >>>
> >
> > regards,
> >
> > gst.
> >
> > --
> >
> > https://mail.python.org/mailman/listinfo/python-list
> >
> >
> >
> >
> > --
> >
> >
> > ======================================================================
> > Stefanos Karasavvidis,  Electronic & Computer Engineer, M.Sc.
> > e-mail: s... at isc.tuc.gr, Tel.: (+30) 2821037508, Fax: (+30) 2821037520
> > Technical University of Crete, Campus, Building A1
>
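
The ALIASES approach from the quoted session can be wrapped in a small
function that registers both spellings, so that a name coming back from
codecs.lookup() without the dash is mapped to the dashed form as well. This is
a sketch under assumptions: add_iso8859_aliases is an invented name, and it
changes email.charset.ALIASES for the whole process.

```python
import email.charset


def add_iso8859_aliases(aliases, parts=range(1, 17)):
    """Map 'iso-8859-N' and 'iso8859-N' to the dashed name for each part.

    Hypothetical helper; entries that already exist are left untouched.
    """
    for part in parts:
        if part == 12:
            # ISO-8859-12 was never published, so there is nothing to map.
            continue
        dashed = 'iso-8859-%d' % part
        aliases.setdefault(dashed, dashed)
        aliases.setdefault('iso8859-%d' % part, dashed)


add_iso8859_aliases(email.charset.ALIASES)
```

It would have to run before any Charset object is created, for example at
import time of a local site module, because Charset consults ALIASES only in
its constructor.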


-- 
======================================================================
Stefanos Karasavvidis,  Electronic & Computer Engineer, M.Sc.
e-mail: sk at isc.tuc.gr, Tel.: (+30) 2821037508, Fax: (+30) 2821037520
Technical University of Crete, Campus, Building A1