encoding name mappings in codecs.py with email/charset.py
gst
g.starck at gmail.com
Sun Dec 14 14:53:41 EST 2014
On Sunday, December 14, 2014 14:10:22 UTC-5, Stefanos Karasavvidis wrote:
> thanks for replying gst.
>
> I've thought already of patching the Charset class, but hoped for a cleaner solution.
>
>
> This ALIASES dict already has all the iso names *with* a dash. So it must get stripped somewhere else.
Not on my side. Adding the missing key-value pairs to this dict apparently also does what you want:
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "copyright", "credits" or "license()" for more information.
>>>
>>> import email.charset
>>> email.charset.ALIASES
{'latin-8': 'iso-8859-14', 'latin-9': 'iso-8859-15', 'latin-2': 'iso-8859-2', 'latin-3': 'iso-8859-3', 'latin-1': 'iso-8859-1', 'latin-6': 'iso-8859-10', 'latin-7': 'iso-8859-13', 'latin-4': 'iso-8859-4', 'latin-5': 'iso-8859-9', 'euc_jp': 'euc-jp', 'latin-10': 'iso-8859-16', 'ascii': 'us-ascii', 'latin_10': 'iso-8859-16', 'latin_1': 'iso-8859-1', 'latin_2': 'iso-8859-2', 'latin_3': 'iso-8859-3', 'latin_4': 'iso-8859-4', 'latin_5': 'iso-8859-9', 'latin_6': 'iso-8859-10', 'latin_7': 'iso-8859-13', 'latin_8': 'iso-8859-14', 'latin_9': 'iso-8859-15', 'cp949': 'ks_c_5601-1987', 'euc_kr': 'euc-kr'}
>>>
>>> for i in range(1, 16):
    c = 'iso-8859-' + str(i)
    email.charset.ALIASES[c] = c

>>>
>>> iso7 = email.charset.Charset('iso-8859-7')
>>> iso7
iso-8859-7
>>> str(iso7)
'iso-8859-7'
>>>
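
For what it's worth, the same idea can be written as a small script. This is only a sketch under a few assumptions: as far as I can tell, the dash is lost because Charset falls back to the codec registry's own spelling when a name is in neither ALIASES nor CHARSETS (that is my reading of the 2.7 source, not something verified against mailman), and I use the module's add_alias() helper instead of writing to the dict directly. Extending the range to part 16 and skipping part 12 (which was never published) are my additions, not part of the session above.

```python
import codecs
import email.charset

# The codec registry reports its own spelling of the encoding name,
# which is where the first dash appears to get lost.
print(codecs.lookup('iso-8859-7').name)

# Register every ISO-8859 part as an alias of itself, so that Charset
# finds the dashed name in ALIASES and keeps it unchanged.
for part in range(1, 17):
    if part == 12:
        # ISO-8859-12 was never published, so there is nothing to register.
        continue
    name = 'iso-8859-%d' % part
    email.charset.add_alias(name, name)

iso7 = email.charset.Charset('iso-8859-7')
print(str(iso7))
```

This has to run before any Charset objects are created, since the name is resolved in the constructor.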
regards,
gst.
>
> sk
>
>
>
> On Sun, Dec 14, 2014 at 7:21 PM, gst <g.st... at gmail.com> wrote:
> On Friday, December 12, 2014 04:21:14 UTC-5, Stefanos Karasavvidis wrote:
> > I've hit a wall with mailman which seems to be caused by Python's character encoding names.
> >
> > I've narrowed the problem down to the email/charset.py file. Basically the following happens:
>
> Hi,
>
> It's all in the email.charset.ALIASES dict.
>
> You could also simply patch the __str__ method of Charset:
>
> Python 2.7.6 (default, Mar 22 2014, 22:59:56)
> [GCC 4.8.2] on linux2
> Type "copyright", "credits" or "license()" for more information.
> >>>
> >>> import email.charset
> >>>
> >>> c = email.charset.Charset('iso-8859-7')
> >>> str(c)
> 'iso8859-7'
> >>>
> >>> old = email.charset.Charset.__str__
> >>>
> >>> def patched(self):
>     r = old(self)
>     # only touch names that lost their first dash, e.g. 'iso8859-7'
>     if r.startswith('iso8859'):
>         return 'iso-' + r[3:]
>     return r
>
> >>>
> >>> email.charset.Charset.__str__ = patched
> >>>
> >>> str(c)
> 'iso-8859-7'
> >>>
>
> regards,
>
> gst.
> --
> https://mail.python.org/mailman/listinfo/python-list
>
> --
>
>
> ======================================================================
> Stefanos Karasavvidis, Electronic & Computer Engineer, M.Sc.
> e-mail: s... at isc.tuc.gr, Tel.: (+30) 2821037508, Fax: (+30) 2821037520
> Technical University of Crete, Campus, Building A1