classification
Title: bad arg type to isspace in struct module
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.3

process
Status: closed
Resolution: accepted
Dependencies:
Superseder:
Assigned To: nnorwitz
Nosy List: edemaine, gregm, nnorwitz, rhettinger
Priority: normal
Keywords:

Created on 2004-11-23 23:35 by gregm, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (4)
msg23286 - (view) Author: Greg McFarlane (gregm) Date: 2004-11-23 23:35
For characters greater than 0x7f, the calls to
isspace() in Modules/structmodule.c can return arbitrary
values on platforms where char is signed.  For example,
on Solaris I got this (incorrect) output:

>>> import struct
>>> struct.calcsize('10d\xfed')
88
>>> 

After changing the three occurrences of
"isspace((int)c)" to "isspace((unsigned char)c)", this
was the (correct) output:

>>> import struct
>>> struct.calcsize('10d\xfed')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
struct.error: bad char in struct format
>>> 

Reason: the '\xfe' is taken as a signed char.  The code
(int)c converts this to a signed int (-2).  The system
isspace macro uses this as an index into the __ctype
array.  The array is only defined for the values 0 to
255 and so -2 is out-of-bounds.  The value returned by
isspace depends on what happens to be at that location
in memory.
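
The conversion described above can be sketched in C as follows. This is a minimal illustration, not the code from Modules/structmodule.c; the helper names ctype_arg and is_space_char are hypothetical and exist only for this example. It assumes 8-bit chars and the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical helper (not from structmodule.c): convert a plain char to
 * a value that is valid to pass to the <ctype.h> functions.  Going
 * through unsigned char maps a signed char such as -2 to 254 instead of
 * sign-extending it to a negative int. */
static int ctype_arg(char c)
{
    int value = (unsigned char)c;

    /* Documents the invariant: the result is never negative. */
    assert(value >= 0 && value <= UCHAR_MAX);
    return value;
}

/* Hypothetical wrapper: isspace() for a plain char, returning 0 or 1.
 * Safe for bytes above 0x7f even when char is signed. */
static int is_space_char(char c)
{
    return isspace(ctype_arg(c)) != 0;
}
```

With this wrapper, a byte such as 0xfe is looked up as 254 rather than as -2, so the result no longer depends on whatever lies before the start of the lookup table.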

NOTE: There may be other occurrences of this bug in
other parts of the Python code.  Please check.
msg23287 - (view) Author: Erik Demaine (edemaine) Date: 2004-11-24 04:52
Looking at other instances of isspace and friends, I think
this is the point of calling Py_CHARMASK, which is used to
"Convert a possibly signed character to a nonnegative int"
(depending on whether 'char' is signed or unsigned).

In other words, I believe the three instances of
isspace((int)c) in Modules/structmodule.c should be changed
to isspace(Py_CHARMASK(c)).
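
A rough sketch of the masking idea, for illustration only: CHARMASK_SKETCH below is a hypothetical stand-in, not the actual Py_CHARMASK definition (which lives in the CPython headers and may differ), and count_leading_space is likewise an invented example function. The sketch assumes 8-bit chars and the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for a Py_CHARMASK-style macro: keep only the low
 * 8 bits so that a possibly signed char becomes a nonnegative int. */
#define CHARMASK_SKETCH(c) ((int)((c) & 0xff))

/* Hypothetical example: count the whitespace bytes at the start of s.
 * The loop stops at the terminating NUL because isspace(0) is false. */
static int count_leading_space(const char *s)
{
    int n = 0;

    assert(s != NULL);
    while (isspace(CHARMASK_SKETCH(s[n])))
        n++;
    return n;
}
```

Because the mask is applied at the call site, the same expression works whether the platform's char is signed or unsigned.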

`grep isspace */*.c | grep -v CHARMASK` suggests some other
potential bugs:

- Modules/posixmodule.c:466 (os2_formatmsg): isspace(*lastc)
- Modules/socketmodule.c:504 (set_error): isspace(*lastc)

`egrep
'isalnum|isalpha|isascii|isblank|iscntrl|isdigit|isgraph|islower|isprint|ispunct|isupper|isxdigit'
*/*.c | grep -v Py_CHARMASK` suggests the following further bugs:

- Modules/_hotshot.c:1431 (get_version_string):
isdigit((int)*rev) [unlikely to cause trouble, but wrong in
the same way...]
- Modules/_tkinter.c:639 (Tkapp_New): isupper((int)argv0[0])
- Modules/pyexpat.c:1800  (get_version_string):
isdigit((int)*rev) [again unlikely a problem]
- Modules/stropmodule.c:760 (strop_atoi): isalnum((int)end[-1])
- Parser/grammar.c:183 (translabel):
isalpha((int)(lb->lb_str[1]))
- Parser/tokenizer.c:232 (get_coding_spec): isalnum((int)t[0])
- Parser/atof.c:16 (atof): (c = *s++) != '\0' && isdigit(c)
 [same problem appears three times in the same function]
- Python/compile.c:1721,1727 (parsestr): int quote = *s; ...
isalpha(quote)
- Python/dynload_aix.c:147 (aix_loaderror): isdigit(*message[i])
- Python/getargs.c:141 (vgetargs1): int c = *format++  (and
later, isalpha(c))
- Python/getargs.c:258 (vgetargs1): isalpha((int)(*format))
- Python/getargs.c:336 (converttuple): int c = *format++ 
(and later, isalpha(c))
- Python/getargs.c:1222 (vgetargskeyword): i = *format++ 
(and later, isalpha(i))

That's all that I could find.
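
Several of the entries above share the pattern "int c = *format++" followed by a ctype call on c. Storing the char in an int does not help, since the negative value is simply sign-extended. One possible fix, sketched here with a hypothetical function count_alpha that is not taken from getargs.c, is to read the bytes through an unsigned char pointer. The sketch assumes the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical scanner (not from getargs.c): count the alphabetic
 * characters in a format string.  Reading through an unsigned char
 * pointer means c is always in 0..UCHAR_MAX, so isalpha(c) is well
 * defined even for bytes above 0x7f. */
static int count_alpha(const char *format)
{
    const unsigned char *p;
    int count = 0;
    int c;

    assert(format != NULL);
    p = (const unsigned char *)format;
    while ((c = *p++) != '\0') {
        if (isalpha(c))
            count++;
    }
    return count;
}
```

The alternative is to keep the char pointer and mask each value at the point of use, as in isalpha(Py_CHARMASK(c)).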
msg23288 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2005-08-26 08:06
Fixed the ones in the struct module.  Leaving this report
open for someone with time to investigate other occurrences.
msg23289 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2005-12-19 06:06
Committed revision 41768 (for 2.5).
History
Date                 User   Action  Args
2022-04-11 14:56:08  admin  set     github: 41221
2004-11-23 23:35:52  gregm  create