[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

Hirokazu Yamamoto report at bugs.python.org
Tue Mar 18 08:15:58 CET 2008


Hirokazu Yamamoto <ocean-city at users.sourceforge.net> added the comment:

> I tried to fix this problem, but I'm not sure how to fix this.

Quick observation...

///////////////////////////////////
// Possible Solution

1. Convert err->text to a console-compatible encoding (not to the source
encoding, as python2.x does) at the point where
PyTokenizer_RestoreEncoding is called.

2. err->text is UTF-8, and the actual output is done in
Python/pythonrun.c (print_error_text), so adjust the offset there.
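
To illustrate why the offset needs adjusting, here is a minimal sketch in
Python (not the C code in pythonrun.c). It assumes the reported offset is a
0-based byte index into the UTF-8 text; the function name is hypothetical
and is not part of CPython.

```python
def byte_offset_to_char_offset(text_utf8: bytes, byte_offset: int) -> int:
    """Count the characters contained in the first byte_offset bytes.

    Hypothetical helper: assumes text_utf8 is valid UTF-8 and that
    byte_offset is a 0-based index into it.
    """
    prefix = text_utf8[:byte_offset]
    # errors="ignore" drops a trailing partial sequence if the offset
    # happens to fall in the middle of a multibyte character.
    return len(prefix.decode("utf-8", errors="ignore"))


line = 'x = "日本語" $'
encoded = line.encode("utf-8")

char_offset = line.index("$")      # position counted in characters
byte_offset = encoded.index(b"$")  # position counted in UTF-8 bytes

# The two positions differ because each of the three CJK characters
# occupies three bytes in UTF-8.
print(char_offset, byte_offset)
print(byte_offset_to_char_offset(encoded, byte_offset))
```

If the byte offset were used directly as a column count, the cursor would be
printed too far to the right on any line that contains multibyte characters.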

///////////////////////////////////
// Solution requires...
1.
  - PyUnicode_DecodeUTF8 in Python/pythonrun.c(err_input) should
    be changed to some kind of "bytes" API.

  - A way to write "bytes" directly to a File object is needed.

2.
  - A way to know the actual byte length of a given unicode string
    in a given encoding.

////////////////////////////////////////////////////
// Experimental patch

Attached is an experimental patch for solution 2. It looks ugly, but
it seems to work in my environment.
 (I assumed get_length_in_bytes(f, " ", 1) == 1, but I'm not sure
  this is always true on other platforms. A nicer and more
  general solution probably exists.)
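
The idea of measuring the byte length of text in the output encoding can be
sketched in Python as follows. This is only an illustration, not the code in
the attached patch: the Python function below borrows the name
get_length_in_bytes but takes a different, hypothetical signature (a string
and an encoding name), and caret_padding is likewise an invented helper.

```python
def get_length_in_bytes(text: str, encoding: str) -> int:
    """Byte length of text once encoded (hypothetical Python analogue)."""
    # errors="replace" keeps the sketch from raising on characters the
    # target encoding cannot represent.
    return len(text.encode(encoding, errors="replace"))


def caret_padding(line: str, char_offset: int, encoding: str) -> int:
    """Number of spaces to print before the caret, under the assumption
    that one space occupies exactly one byte in the output encoding."""
    return get_length_in_bytes(line[:char_offset], encoding)


line = 'x = "日本語" $'
offset = line.index("$")

print(caret_padding(line, offset, "cp932"))   # CJK characters are 2 bytes here
print(caret_padding(line, offset, "utf-8"))   # and 3 bytes here

# The single-byte-space assumption holds for ASCII-compatible encodings
# but not for every encoding:
print(get_length_in_bytes(" ", "cp932"))
print(get_length_in_bytes(" ", "utf-16-le"))
```

In an encoding such as UTF-16 a space takes two bytes, so padding computed
this way would be too wide, which is the kind of case the caveat above is
concerned with.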

----------
keywords: +patch
Added file: http://bugs.python.org/file9723/experimental.patch

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2382>
__________________________________

