After installing HyperKitty 1.3.3 (from source, with pip install -e . in a venv), I do indeed get further:
# python3 manage.py hyperkitty_import --list-address spip-dev@mailman.the.re --since 2000-01-01 --no-sync-mailman /home/debian/spip-dev.mbox/*.mbox
...
Importing from mbox file /home/debian/spip-dev.mbox/200211.mbox to spip-dev@mailman.the.re
Importing from mbox file /home/debian/spip-dev.mbox/200212.mbox to spip-dev@mailman.the.re
/Failed adding message <3DF814AB.4040802@free.fr>: unknown encoding: #charset
Importing from mbox file /home/debian/spip-dev.mbox/200301.mbox to spip-dev@mailman.the.re
\Traceback (most recent call last):
File "/usr/lib/python3.7/email/_header_value_parser.py", line 2398, in parse_mime_parameters
token, value = get_parameter(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 2255, in get_parameter
token, value = get_attribute(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 2143, in get_attribute
"expected token but found '{}'".format(value))
email.errors.HeaderParseError: expected token but found '=?ISO-8859-1?Q?coordonn=E9_pou?==?ISO-8859-1?Q?r_le_contr=F4le_officiel_des_denr=E9es_alimen?==?ISO-8859-1?Q?taires_pour_2003?="'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/home/debian/venv-hyperkitty/lib/python3.7/site-packages/django/core/management/__init__.py", line 401, in execute_from_command_line
utility.execute()
File "/home/debian/venv-hyperkitty/lib/python3.7/site-packages/django/core/management/__init__.py", line 395, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/debian/venv-hyperkitty/lib/python3.7/site-packages/django/core/management/base.py", line 328, in run_from_argv
self.execute(*args, **cmd_options)
File "/home/debian/venv-hyperkitty/lib/python3.7/site-packages/django/core/management/base.py", line 369, in execute
output = self.handle(*args, **options)
File "/home/debian/hyperkitty/hyperkitty/management/commands/hyperkitty_import.py", line 327, in handle
importer.from_mbox(mbfile)
File "/home/debian/hyperkitty/hyperkitty/management/commands/hyperkitty_import.py", line 158, in from_mbox
message = message_from_bytes(msg_raw, policy=policy.default)
File "/usr/lib/python3.7/email/__init__.py", line 46, in message_from_bytes
return BytesParser(*args, **kws).parsebytes(s)
File "/usr/lib/python3.7/email/parser.py", line 124, in parsebytes
return self.parser.parsestr(text, headersonly)
File "/usr/lib/python3.7/email/parser.py", line 68, in parsestr
return self.parse(StringIO(text), headersonly=headersonly)
File "/usr/lib/python3.7/email/parser.py", line 57, in parse
feedparser.feed(data)
File "/usr/lib/python3.7/email/feedparser.py", line 176, in feed
self._call_parse()
File "/usr/lib/python3.7/email/feedparser.py", line 180, in _call_parse
self._parse()
File "/usr/lib/python3.7/email/feedparser.py", line 385, in _parsegen
for retval in self._parsegen():
File "/usr/lib/python3.7/email/feedparser.py", line 256, in _parsegen
if self._cur.get_content_type() == 'message/delivery-status':
File "/usr/lib/python3.7/email/message.py", line 578, in get_content_type
value = self.get('content-type', missing)
File "/usr/lib/python3.7/email/message.py", line 471, in get
return self.policy.header_fetch_parse(k, v)
File "/usr/lib/python3.7/email/policy.py", line 162, in header_fetch_parse
return self.header_factory(name, value)
File "/usr/lib/python3.7/email/headerregistry.py", line 589, in __call__
return self[name](name, value)
File "/usr/lib/python3.7/email/headerregistry.py", line 197, in __new__
cls.parse(value, kwds)
File "/usr/lib/python3.7/email/headerregistry.py", line 446, in parse
kwds['parse_tree'] = parse_tree = cls.value_parser(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 2504, in parse_content_type_header
ctype.append(parse_mime_parameters(value[1:]))
File "/usr/lib/python3.7/email/_header_value_parser.py", line 2413, in parse_mime_parameters
token, value = get_invalid_parameter(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 2063, in get_invalid_parameter
token, value = get_phrase(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 1377, in get_phrase
token, value = get_word(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 1340, in get_word
token, value = get_quoted_string(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 1241, in get_quoted_string
token, value = get_bare_quoted_string(value)
File "/usr/lib/python3.7/email/_header_value_parser.py", line 1170, in get_bare_quoted_string
if value[0] == '"':
IndexError: string index out of range
Ideally, every import error would be handled this way:
- Display the error along with the ID of the message
- Append the message to an error mbox
That way everything that can be imported gets imported instead of the whole run failing, and the failed messages can still be dealt with by hand. Using the master branch is not possible, and it does not seem to contain a fix for the import anyway:
# git log --no-merges --oneline 1.3.3..master
5fbf98d Use Angle brackets for In-reply-to header since it is stripped.
538beee Add Python 3.9 for testing.
e8410d2 Replaced deprecated ugettext functions with gettext.
3a004aa Fix typo in example_project/settings.py.
0c73ab3 Warn about mailman_hyperkitty module in Mailman integration
418fe46 Add the ability to disable gravatars.
8dd1a2e Fix a bug where the reply buttons were missing from the replies.
da9c8e6 Fix wrong padding around the navigation buttons in overview
8feb164 Translated using Weblate (Swedish)
5985628 Translated using Weblate (Portuguese)
d04ceed Translated using Weblate (German)
f4f96a3 Add configuration to disable web posting.
e112877 overview.html: fix superuser typo
6791aff Removed isort workaround. isort 5.0.6 is fixed.
68f9325 Extend lock life for update_and_clean_index job and add some doc.
7e9cdbd Translated using Weblate (Russian)
d978014 Translated using Weblate (Portuguese (Brazil))
a3de739 Translated using Weblate (Russian)
a93f5f0 Sync owners and moderators from Mailman Core for MailingList model.
18f9722 Version bump post release.
Searching the issues that mention import, as well as those tagged import, the idea does not seem to have been proposed yet. It looks like a patch would be worthwhile.
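To make the idea concrete, here is a rough sketch of the handling described above (report the error with the message ID, append the message to an error mbox), written as a standalone pre-check rather than as a patch to hyperkitty_import. The names strict_parse, message_id and quarantine_failures are made up for this example and are not part of HyperKitty. The strict check only mirrors the message_from_bytes(..., policy=policy.default) call visible in the traceback, and no locking is done on the mbox files.

```python
import mailbox
from email import message_from_bytes, policy


def strict_parse(raw):
    """Parse with policy.default, as in the traceback, then touch every header."""
    msg = message_from_bytes(raw, policy=policy.default)
    for name in msg.keys():
        # Headers are parsed on access, so this is where malformed ones may raise.
        msg.get_all(name)
    return msg


def message_id(raw):
    """Best-effort Message-ID lookup using the lenient default (compat32) parser."""
    return message_from_bytes(raw).get("Message-ID", "<unknown>")


def quarantine_failures(src_path, error_path, check=strict_parse):
    """Append every message of src_path that fails `check` to the mbox at error_path.

    Returns a list of (message_id, exception) pairs, one per failed message.
    """
    src = mailbox.mbox(src_path, create=False)
    errors = mailbox.mbox(error_path)  # created if it does not exist yet
    failures = []
    try:
        for key in src.iterkeys():
            raw = src.get_bytes(key)
            try:
                check(raw)
            except Exception as exc:  # deliberately broad: any failure is set aside
                mid = message_id(raw)
                print(f"Failed checking message {mid}: {exc!r}")
                errors.add(raw)
                failures.append((mid, exc))
    finally:
        errors.close()
        src.close()
    return failures
```

Something like quarantine_failures("/home/debian/spip-dev.mbox/200301.mbox", "/home/debian/errors.mbox") would then list the messages to look at by hand. A message that passes this check can still fail later in the import (the "unknown encoding" case above, for instance), so a real patch would presumably wrap the per-message work inside from_mbox in the same kind of try/except.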
In the meantime, the mbox files can at least be imported one at a time, so that a single error does not crash the whole run:
# for mbox in /home/debian/spip-dev.mbox/*.mbox ; do echo "$mbox" ; echo -------------------------------------------- ; python3 manage.py hyperkitty_import --list-address spip-dev@mailman.the.re --since 2000-01-01 --no-sync-mailman "$mbox" ; done |& tee /var/log/hyperkitty_import.log