[Mailman-Users] Encoding issues when importing archives

Mark Sapiro mark at msapiro.net
Tue May 15 08:52:02 EDT 2018


On 5/14/18 2:16 PM, Eric Abrahamsen wrote:
> I'm recreating some old lists I had in a Mailman 2 installation, and
> trying to import the old mboxes into Hyperkitty.


This is not the appropriate list for Mailman 3. The appropriate lists
are mailman-users at mailman3.org
<https://lists.mailman3.org/mailman3/lists/mailman-users@mailman3.org/>
or possibly mailman-developers at python.org
<https://mail.python.org/mailman/listinfo/mailman-developers>.


> The lists were on Chinese-related subjects, and we've got both messages
> that contain Chinese characters, and attachments that have Chinese
> filenames and contents.
> 
> The import process is blowing up with a UnicodeEncodeError, in
> hyperkitty/lib/incoming.py#add_to_list, it looks like when the
> attachments are being processed:
> 
> content = content.encode(encoding)
> 
> UnicodeEncodeError: 'gb2312' codec can't encode character '\ufffd' in position 3131: illegal multibyte sequence
> 
> Apparently the offending attachments are specified as gb2312 (a common
> Chinese encoding).
> 
> Is there something I can do to somehow preprocess the archive mboxes, or
> otherwise re-encode the attachments?


Possibly there is, but this is a bug in the hyperkitty_import process.
It would help if you file an issue at
<https://gitlab.com/mailman/hyperkitty/issues/new> with enough
information for us to reproduce it.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

