[ python-Bugs-1178484 ] Erroneous line number error in Py2.4.1

SourceForge.net noreply at sourceforge.net
Mon May 16 10:35:49 CEST 2005


Bugs item #1178484, was opened at 2005-04-07 14:33
Message generated for change (Comment added) made by doerwalter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1178484&group_id=5470

Category: Parser/Compiler
Group: Python 2.4
Status: Open
Resolution: Accepted
Priority: 5
Submitted By: Timo Linna (tilinna)
>Assigned to: Martin v. Löwis (loewis)
Summary: Erroneous line number error in Py2.4.1

Initial Comment:
For some reason Python 2.3.5 reports the error in the 
following program correctly: 

  File "C:\Temp\problem.py", line 7 
SyntaxError: unknown decode error 

..whereas Python 2.4.1 reports an invalid line number: 

  File "C:\Temp\problem.py", line 2 
SyntaxError: unknown decode error 

----- problem.py starts ----- 
# -*- coding: ascii -*- 

""" 
Foo bar 
""" 

# Ä is not allowed in ascii coding 
----- problem.py ends -----

Without the encoding declaration both Python versions 
report the usual deprecation warning (just like they 
should be doing). 

My environment: Windows 2000 + SP3. 


----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2005-05-16 10:35

Message:
Logged In: YES 
user_id=89016

OK, here is a patch. It adds an additional argument
firstline to read(). If this argument is true (i.e. if
called from readline()) and a decoding error happens, this
error will only be reported if it is in the first line.
Otherwise read() will decode up to the error position and
put the rest in the bytebuffer.
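
The behaviour described above can be sketched roughly as follows. This is
only an illustration in modern Python 3, not the patch itself: the class
LenientLineReader, the helper failing_line_with_patch, the hard-coded ascii
codec and the 72-byte chunk size are all assumptions made for the example.

```python
import io

# Byte-for-byte equivalent of problem.py (0xC4 is "Ä" in Latin-1).
SOURCE = (
    b"# -*- coding: ascii -*-\n"
    b"\n"
    b'"""\n'
    b"Foo bar\n"
    b'"""\n'
    b"\n"
    b"# \xc4 is not allowed in ascii coding\n"
)


class LenientLineReader:
    """Hypothetical reader: read() defers errors that lie beyond the first line."""

    def __init__(self, stream, chunk_size=72):
        self.stream = stream
        self.chunk_size = chunk_size
        self.bytebuffer = b""   # undecoded bytes kept for the next read()
        self.charbuffer = ""    # decoded text not yet returned as lines

    def read(self, firstline=False):
        data = self.bytebuffer + self.stream.read(self.chunk_size)
        try:
            text = data.decode("ascii")
            self.bytebuffer = b""
        except UnicodeDecodeError as exc:
            if not firstline:
                raise
            # Decode only up to the error position.
            text = data[:exc.start].decode("ascii")
            if "\n" not in text:
                # The error is in the line being read: report it now.
                raise
            # Otherwise keep the undecoded rest for a later call.
            self.bytebuffer = data[exc.start:]
        self.charbuffer += text
        return text

    def readline(self):
        while "\n" not in self.charbuffer:
            if not self.read(firstline=True):
                break
        line, sep, self.charbuffer = self.charbuffer.partition("\n")
        return line + sep


def failing_line_with_patch(source, chunk_size=72):
    """Return the 1-based number of the line whose read raises, or None."""
    reader = LenientLineReader(io.BytesIO(source), chunk_size)
    lineno = 0
    try:
        while True:
            lineno += 1
            if not reader.readline():
                return None
    except UnicodeDecodeError:
        return lineno
```

With this sketch the error for the problem.py bytes surfaces only when line 7
is requested, even though the offending byte was already read with the first
chunk.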

Unfortunately with this patch, I get a segfault with the
following stacktrace if I run the test. I don't know if this
is related to bug #1089395/patch #1101726. Martin, can you
take a look?

#0  0x08057ad1 in tok_nextc (tok=0x81ca7b0) at tokenizer.c:719
#1  0x08058558 in tok_get (tok=0x81ca7b0, p_start=0xbffff3d4,
    p_end=0xbffff3d0) at tokenizer.c:1075
#2  0x08059331 in PyTokenizer_Get (tok=0x81ca7b0, p_start=0xbffff3d4,
    p_end=0xbffff3d0) at tokenizer.c:1466
#3  0x080561b1 in parsetok (tok=0x81ca7b0, g=0x8167980, start=257,
    err_ret=0xbffff440, flags=0) at parsetok.c:125
#4  0x0805613c in PyParser_ParseFileFlags (fp=0x816bdb8,
    filename=0xbffff7b7 "./bug.py", g=0x8167980, start=257, ps1=0x0,
    ps2=0x0, err_ret=0xbffff440, flags=0) at parsetok.c:90
#5  0x080f3926 in PyParser_SimpleParseFileFlags (fp=0x816bdb8,
    filename=0xbffff7b7 "./bug.py", start=257, flags=0)
    at pythonrun.c:1345
#6  0x080f352b in PyRun_FileExFlags (fp=0x816bdb8,
    filename=0xbffff7b7 "./bug.py", start=257, globals=0xb7d62e94,
    locals=0xb7d62e94, closeit=1, flags=0xbffff544) at pythonrun.c:1239
#7  0x080f22f2 in PyRun_SimpleFileExFlags (fp=0x816bdb8,
    filename=0xbffff7b7 "./bug.py", closeit=1, flags=0xbffff544)
    at pythonrun.c:860
#8  0x080f1b16 in PyRun_AnyFileExFlags (fp=0x816bdb8,
    filename=0xbffff7b7 "./bug.py", closeit=1, flags=0xbffff544)
    at pythonrun.c:664
#9  0x08055e45 in Py_Main (argc=2, argv=0xbffff5f4) at main.c:484
#10 0x08055366 in main (argc=2, argv=0xbffff5f4) at python.c:23

----------------------------------------------------------------------

Comment By: Walter Dörwald (doerwalter)
Date: 2005-04-07 16:28

Message:
Logged In: YES 
user_id=89016

The reason for this is the new codec buffering code in 2.4:
The codec might read and decode more data from the byte
stream than is necessary for decoding one line. I.e. when
reading line n, the codec might decode bytes that belong to
line n+1, n+2 etc. too. If there's a decoding error in this
data, line n gets reported. I don't think there's a simple
fix for this.
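
A minimal sketch of this effect, written in modern Python 3 rather than the
2.4 codec code: a toy reader that decodes a whole chunk before splitting off
lines. The class ChunkedLineReader, the helper reported_line and the 72-byte
chunk size are assumptions made for illustration, not the actual codec
implementation.

```python
import io

# Byte-for-byte equivalent of problem.py (0xC4 is "Ä" in Latin-1).
SOURCE = (
    b"# -*- coding: ascii -*-\n"
    b"\n"
    b'"""\n'
    b"Foo bar\n"
    b'"""\n'
    b"\n"
    b"# \xc4 is not allowed in ascii coding\n"
)


class ChunkedLineReader:
    """Toy reader: decodes a whole chunk at once, then hands out lines."""

    def __init__(self, stream, chunk_size=72):
        self.stream = stream
        self.chunk_size = chunk_size
        self.charbuffer = ""
        self.lineno = 0  # number of the line currently being requested

    def readline(self):
        self.lineno += 1
        while "\n" not in self.charbuffer:
            data = self.stream.read(self.chunk_size)
            if not data:
                break
            # The chunk may contain bytes of later lines; a bad byte
            # anywhere in it makes this call fail.
            self.charbuffer += data.decode("ascii")
        line, sep, self.charbuffer = self.charbuffer.partition("\n")
        return line + sep


def reported_line(source, chunk_size):
    """Return the line number being read when decoding fails, or None."""
    reader = ChunkedLineReader(io.BytesIO(source), chunk_size)
    try:
        while reader.readline():
            pass
    except UnicodeDecodeError:
        return reader.lineno
    return None
```

Reading byte by byte (chunk_size=1) attributes the error to line 7, where the
bad byte actually is. With a 72-byte chunk the bad byte is decoded together
with the first line, so the toy reader blames line 1. The exact number
reported by a real interpreter depends on where codec-based reading starts,
which this sketch does not model.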

----------------------------------------------------------------------
