[Patches] [ python-Patches-680146 ] _iconv_module type casting and spelling errors

SourceForge.net noreply@sourceforge.net
Tue, 04 Feb 2003 12:12:08 -0800


Patches item #680146, was opened at 2003-02-04 14:09
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=680146&group_id=5470

Category: Modules
Group: Python 2.3
Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Christos Georgiou (tzot)
Assigned to: Walter Dörwald (doerwalter)
Summary: _iconv_module type casting and spelling errors

Initial Comment:
Compiling under SGI Irix and MIPSPro, I corrected two 
type casts which under gcc are plain warnings.  They 
are both related to size_t and int comparisons, since 
size_t is unsigned.

I also corrected the spelling mistake of "choosen" 
instead of "chosen"

----------------------------------------------------------------------

>Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 22:12

Message:
Logged In: YES 
user_id=539787

It was a misspelling; ISO8859-7 is what I used in the module, 
the underscore came from my using it with str.decode
("iso8859_7").

o2k 348# ls -l python
-rwxr-xr-x    1 root     sys       1976320 Feb  4 21:21 python
o2k 349# ./python
Python 2.3a1 (#12, Feb  4 2003, 21:14:08) [C] on irix646
Type "help", "copyright", "credits" or "license" for more 
information.
>>> "a".decode("ascii")
Fatal Python error: can't initialize the _iconv_codec module: 
iconv_open() failed
Abort (core dumped)

I then changed it to:
    iconv_t hdl = iconv_open("UCS-2", "LATIN1");
(sys.maxunicode is 65535)
and it still fails at the same line (I ran "dbx python core").  I 
made sure that I use the freshly built _iconv_codec.so (ie no 
other exists on the system).
I added some code to show the errno, and it's 22 (EINVAL).
I ran iconv -l at the prompt, and used UCS-2 and ISO8859-1 
(LATIN1 didn't show up in the list).  Now, the code runs fine 
for str.decode and unicode.encode, and test.test_codecs and 
test.test_263 pass fine.

(BTW, if you're used to vi keystrokes, never press ESC while 
typing with IE in the "Add a comment:" text box... :( )

I then applied the diff-debug patch, and got:
>>> "a".decode("ascii")
init_iconv_codec: 0x0030
u'a'

Please note that on Irix, UCS-2-INTERNAL is not available, 
only UCS-2, so that had to be changed too.

It's an initialisation thing; I could provide a special case of 
defines for Irix, but how can you know which codecs are 
available on a platform during the configure process?  
Perhaps running iconv -l and trying to decode the output? 
iconv -l on GNU/linux shows a comma separated list,  while 
on Irix it shows pairs of available conversions, one on each 
line...

Thanks for the direction about encodings/__init__.py, Walter.

Thanks guys, now I must find if pymalloc has changed and 
occasionally dumps core in dictresize or listextend_internal...

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-04 21:25

Message:
Logged In: YES 
user_id=89016

Of course you can always comment out the import in
encodings/__init__.py

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 21:09

Message:
Logged In: YES 
user_id=539787

With patch applied:

Python 2.3a1 (#12, Feb  4 2003, 21:30:08) [C] on irix646
Type "help", "copyright", "credits" or "license" for more 
information.
>>> "aaa".decode("ascii")
Fatal Python error: can't initialize the _iconv_codec module: 
iconv_open() failed
Abort (core dumped)

It still fails after a clean untar + patch.
How can I disable it so that on Irix I get the old functionality?  
Is there an official way?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 21:09

Message:
Logged In: YES 
user_id=539787

With patch applied:

Python 2.3a1 (#12, Feb  4 2003, 21:30:08) [C] on irix646
Type "help", "copyright", "credits" or "license" for more 
information.
>>> "aaa".decode("ascii")
Fatal Python error: can't initialize the _iconv_codec module: 
iconv_open() failed
Abort (core dumped)

It still fails after a clean untar + patch.
How can I disable it so that on Irix I get the old functionality?  
Is there an official way?

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-04 21:00

Message:
Logged In: YES 
user_id=89016

The current code of course still uses 'ASCII'.

However, if the compiler warnings are gone we should
continue discussion on patch #670715.

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 20:52

Message:
Logged In: YES 
user_id=539787

With patch applied:

Python 2.3a1 (#12, Feb  4 2003, 21:30:08) [C] on irix646
Type "help", "copyright", "credits" or "license" for more 
information.
>>> "aaa".decode("ascii")
Fatal Python error: can't initialize the _iconv_codec module: 
iconv_open() failed
Abort (core dumped)

It still fails after a clean untar + patch.
How can I disable it so that on Irix I get the old functionality?  
Is there an official way?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 20:16

Message:
Logged In: YES 
user_id=539787

!         char res = iconv(self->dechdl, (char**)&inp, &inplen, 
&out, &outlen);

a size_t stored in a char? Ouch!!!  That was indeed awful (but 
obviously not for gcc, which does not store values to char 
variables masked with 0xff).
Core dumps disappeared too, but I will start from scratch (tar 
xvzf ...) and reapply the patch, because I don't trust the 
changes I did to setup.py.
I'll let you know soon, thanks Neil & Walter for your 
promptness.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-04 20:08

Message:
Logged In: YES 
user_id=89016

OK, I've checked in Neals patch as Modules/_iconv_codec.c 1.8

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-04 19:58

Message:
Logged In: YES 
user_id=89016

Argl, Neal was faster! ;)

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-04 19:58

Message:
Logged In: YES 
user_id=33168

Attaching a patch with more changes.

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2003-02-04 19:54

Message:
Logged In: YES 
user_id=89016

OK, I checked in a slightly different version. Can you check
whether the warnings are still gone?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 19:51

Message:
Logged In: YES 
user_id=539787

Yes, it makes sense (and no compilation errors or warnings).  
Now, if only I could manage to ignore completely iconv, 
because Irix iconv has no ASCII encoding and there are 
problems with the big-endianess (see my notes to patch 
670715).
I need to delve into the problem with big-endian architecture, 
but first I would like to compile ignoring completely the 
_iconv_codec module; I do not know how to do that, though. 
(I'm messing with setup.py now).
We can close this patch here, thank you.

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 19:32

Message:
Logged In: YES 
user_id=539787

The code is:

res = iconv(hdl, &inptr, &insize, &outptr, (size_t *)&outsize);
    if (res == (size_t)-1)

MIPSPro compiler 7.3 unfortunately throws an error instead of 
a warning, even if res is int, because iconv returns size_t, 
which is unsigned.  This is erroneous on MIPSPro's behalf, 
but it's harmless.

I'm gonna check your patch in a quarter.

----------------------------------------------------------------------

Comment By: Neal Norwitz (nnorwitz)
Date: 2003-02-04 19:26

Message:
Logged In: YES 
user_id=33168

I'm not sure why you've case res == (size_t)-1.  All other
changes make sense.  I've modified the patch a bit so that
insize and outsize are size_t instead of int.  Does this
make sense and work for you?

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 14:28

Message:
Logged In: YES 
user_id=539787

Third attempt: the checkbox is checked

----------------------------------------------------------------------

Comment By: Christos Georgiou (tzot)
Date: 2003-02-04 14:12

Message:
Logged In: YES 
user_id=539787

For some reason the attachment was not uploaded the first 
time.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=680146&group_id=5470