[Python-checkins] CVS: python/nondist/peps pep-0263.txt,1.10,1.11
M.-A. Lemburg
lemburg@users.sourceforge.net
Fri, 15 Mar 2002 09:07:14 -0800
Update of /cvsroot/python/python/nondist/peps
In directory usw-pr-cvs1:/tmp/cvs-serv21038
Modified Files:
pep-0263.txt
Log Message:
Changed Python's source code encoding default to ASCII.
Added note about handling of Unicode literals in phase 1.
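
For illustration only, the behaviour named in this log message (ASCII
as the default, a magic comment looked for in the first two lines, and
a check that the whole file decodes) could be sketched as below. This
is a minimal sketch in present-day Python, not the tokenizer code. The
names CODING_RE, detect_source_encoding and decodes_cleanly are
hypothetical, and the pattern is modeled on the published PEP 263
rather than taken from this diff.

```python
import codecs
import re

# Hypothetical pattern, modeled on the published PEP 263 (not part of this diff).
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def detect_source_encoding(source: bytes) -> str:
    """Return the encoding declared in the first two lines, else 'ascii'."""
    for line in source.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            name = match.group(1).decode("ascii")
            codecs.lookup(name)  # raises LookupError for an unknown encoding
            return name
    # No magic comment found: fall back to ASCII, as the log message states.
    return "ascii"


def decodes_cleanly(source: bytes) -> bool:
    """Return True if the file decodes with its declared or default encoding."""
    try:
        source.decode(detect_source_encoding(source))
    except UnicodeDecodeError:
        return False
    return True
```

A caller that gets False from decodes_cleanly would emit its single
per-file warning; how that warning is raised is left out here because
the diff does not specify it.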
Index: pep-0263.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0263.txt,v
retrieving revision 1.10
retrieving revision 1.11
diff -C2 -d -r1.10 -r1.11
*** pep-0263.txt 7 Mar 2002 11:14:26 -0000 1.10
--- pep-0263.txt 15 Mar 2002 17:07:12 -0000 1.11
***************
*** 41,48 ****
Defining the Encoding
! Just as in coercion of strings to Unicode, Python will default to
! the interpreter's default encoding (which is ASCII in standard
! Python installations) as standard encoding if no other encoding
! hints are given.
To define a source code encoding, a magic comment must
--- 41,46 ----
Defining the Encoding
! Python will default to ASCII as standard encoding if no other
! encoding hints are given.
To define a source code encoding, a magic comment must
***************
*** 77,86 ****
source code.
! Any encoding which allows processing the first two lines in
! the way indicated above is allowed as source code encoding,
! this includes ASCII compatible encodings as well as certain
multi-byte encodings such as Shift_JIS. It does not include
! encodings which use two or more bytes for all characters
! like e.g. UTF-16. The reason for this is to keep the encoding
detection algorithm in the tokenizer simple.
--- 75,84 ----
source code.
! Any encoding which allows processing the first two lines in the
! way indicated above is allowed as source code encoding, this
! includes ASCII compatible encodings as well as certain
multi-byte encodings such as Shift_JIS. It does not include
! encodings which use two or more bytes for all characters like
! e.g. UTF-16. The reason for this is to keep the encoding
detection algorithm in the tokenizer simple.
***************
*** 117,133 ****
require major changes in the internals of the interpreter and
enforcing the use of magic comments in source code files which
! place non-default encoding characters in string literals, comments
and Unicode literals, the proposed solution should be implemented
in two phases:
! 1. Implement the magic comment detection and default encoding
! handling, but only apply the detected encoding to Unicode
! literals in the source file.
In addition to this step and to aid in the transition to
explicit encoding declaration, the tokenizer must check the
! complete source file for compliance with the default encoding
! (which usually is ASCII). If the source file does not properly
! decode, a single warning is generated per file.
2. Change the tokenizer/compiler base string type from char* to
--- 115,134 ----
require major changes in the internals of the interpreter and
enforcing the use of magic comments in source code files which
! place non-ASCII characters in string literals, comments
and Unicode literals, the proposed solution should be implemented
in two phases:
! 1. Implement the magic comment detection, but only apply the
! detected encoding to Unicode literals in the source file.
!
! If no magic comment is used, Python should continue to
! use the standard [raw-]unicode-escape codecs for Unicode
! literals.
In addition to this step and to aid in the transition to
explicit encoding declaration, the tokenizer must check the
! complete source file for compliance with the declared
! encoding. If the source file does not properly decode, a single
! warning is generated per file.
2. Change the tokenizer/compiler base string type from char* to