[New-bugs-announce] [issue6922] Interpreter hangs up while trying to decode invalid utf32 stream.

Alex report at bugs.python.org
Wed Sep 16 19:38:16 CEST 2009


New submission from Alex <malicious.wizard at gmail.com>:

*** Prerequisites:
Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit
(Intel)] on win32

*** Description:
The 'utf_32_le' and 'utf_32_be' codecs consume unbounded memory when the
input data is malformed and the 'errors' argument to str.decode is
anything other than 'strict'.

*** Steps:
1. Start interpreter
2. Type:
   '\x01'.decode('utf_32_le', 'replace')
or
   '\x01'.decode('utf32', 'ignore')
or
   ('something'.encode('utf32') + '\x00').decode('utf32', 'ignore')
3. Execute
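
Because the calls above can hang the interpreter, it may be safer to run
them in a child process with a timeout. The following is a minimal sketch,
not part of the original report: the helper name `probe`, the 10-second
timeout, and the use of `subprocess.run` (which needs Python 3.5+ for the
driver script) are all assumptions. The inputs are written as bytes literals
so the child can be either a 2.6+ or a 3.x interpreter.

```python
import subprocess
import sys

# The three inputs from the steps above, as source text for the child.
# Bytes literals (b'...') are accepted by Python 2.6+ and Python 3.
CASES = [
    "b'\\x01'.decode('utf_32_le', 'replace')",
    "b'\\x01'.decode('utf32', 'ignore')",
    "('something'.encode('utf32') + b'\\x00').decode('utf32', 'ignore')",
]


def probe(expression, interpreter=sys.executable, timeout=10):
    """Evaluate one expression in a child interpreter (hypothetical helper).

    Returns "ok" if the child exits cleanly, "error" if it exits with a
    non-zero status, and "hung" if it is still running after `timeout`
    seconds (the child is killed in that case).
    """
    try:
        proc = subprocess.run(
            [interpreter, "-c", "r = " + expression],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if proc.returncode == 0 else "error"


if __name__ == "__main__":
    for case in CASES:
        print(case, "->", probe(case))
```

To check a different installation, pass its path as `interpreter`, for
example the Python 2.6.2 binary named in the prerequisites.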

*** Notes:
1. It seems that any stream that raises UnicodeDecodeError in 'strict'
mode causes a hang in 'ignore' or 'replace' mode.
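
If the note above holds, a strict decode can serve as a safe pre-check for
inputs that are at risk, since 'strict' mode raises instead of hanging. This
is a sketch under that assumption; `fails_in_strict` is a hypothetical helper
name, and the last sample is a well-formed control added for contrast.

```python
def fails_in_strict(data, encoding):
    """Return True if `data` cannot be decoded with errors='strict'."""
    try:
        data.decode(encoding, "strict")
    except UnicodeDecodeError:
        return True
    return False


samples = [
    (b"\x01", "utf_32_le"),
    (b"\x01", "utf32"),
    ("something".encode("utf32") + b"\x00", "utf32"),
    ("something".encode("utf32"), "utf32"),  # well-formed control
]

for data, encoding in samples:
    # True marks an input that, per the note, would hang in lenient modes
    # on an affected interpreter.
    print(encoding, repr(data), fails_in_strict(data, encoding))
```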

*** Expected result:
1. An AssertionError is raised on "assert errors == 'strict'", just as
bz2_codec does, if utf32 cannot be partially decoded at all.
2. The behaviour that 'utf8' and 'utf16' implement for such cases.
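
For reference, the behaviour of 'utf8' and 'utf16' on comparable damaged
input can be sketched as follows. The sample byte strings are illustrative
choices, not taken from the report: a single byte is half of a UTF-16 code
unit, and 0xFF never occurs in well-formed UTF-8.

```python
truncated_utf16 = b"\x01"  # one byte: an incomplete UTF-16 code unit
invalid_utf8 = b"\xff"     # not a valid byte anywhere in UTF-8

# 'replace' substitutes U+FFFD for the undecodable bytes.
utf16_replaced = truncated_utf16.decode("utf_16_le", "replace")
utf8_replaced = invalid_utf8.decode("utf8", "replace")

# 'ignore' drops the undecodable bytes and keeps whatever did decode.
utf16_ignored = truncated_utf16.decode("utf_16_le", "ignore")
utf8_ignored = invalid_utf8.decode("utf8", "ignore")
utf16_tail_ignored = (
    "something".encode("utf16") + b"\x00"
).decode("utf16", "ignore")

for name, value in [
    ("utf16 replace", utf16_replaced),
    ("utf8 replace", utf8_replaced),
    ("utf16 ignore", utf16_ignored),
    ("utf8 ignore", utf8_ignored),
    ("utf16 trailing byte, ignore", utf16_tail_ignored),
]:
    print(name, repr(value))
```

In each case the call returns promptly, which is the outcome the report asks
for from the utf32 codecs as well.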

*** Received result:
1. The interpreter hangs, uses up to 100% of a CPU core, and starts to
consume RAM.
2. The process grows until it has consumed all the RAM it can get (this
takes up to several minutes on my machine).
3. It produces the following traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\encodings\utf_32_be.py", line 11, in decode
    return codecs.utf_32_be_decode(input, errors, True)
MemoryError
4. Sometimes the traceback is printed but the text "MemoryError" is not,
leaving a blank line in its place.

----------
components: Interpreter Core, Library (Lib), Unicode, Windows
messages: 92704
nosy: mwizard
severity: normal
status: open
title: Interpreter hangs up while trying to decode invalid utf32 stream.
versions: Python 2.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6922>
_______________________________________

