[Python-checkins] python/dist/src/Doc/whatsnew whatsnew23.tex,1.55,1.56

Wed, 09 Oct 2002 05:11:13 -0700

Update of /cvsroot/python/python/dist/src/Doc/whatsnew
In directory usw-pr-cvs1:/tmp/cvs-serv5278

Modified Files:
	whatsnew23.tex 
Log Message:
Minor edits and markup fixes

Index: whatsnew23.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/whatsnew/whatsnew23.tex,v
retrieving revision 1.55
retrieving revision 1.56
diff -C2 -d -r1.55 -r1.56
*** whatsnew23.tex	7 Oct 2002 19:01:07 -0000	1.55
--- whatsnew23.tex	9 Oct 2002 12:11:10 -0000	1.56
***************
*** 317,338 ****

  On Windows NT, 2000, and XP, the system stores file names as Unicode
! strings. Traditionally, Python has represented file names are byte
! strings, which is inadequate since it renders some file names
  inaccessible.

! Python allows now to use arbitrary Unicode strings (within limitations
! of the file system) for all functions that expect file names, in
! particular \function{open}. If a Unicode string is passed to
! \function{os.listdir}, Python returns now a list of Unicode strings.
! A new function \function{getcwdu} returns the current directory as a
! Unicode string.

! Byte strings continue to work as file names, the system will
! transparently convert them to Unicode using the \code{mbcs} encoding.

! Other systems allow Unicode strings as file names as well, but convert
! them to byte strings before passing them to the system, which may
! cause UnicodeErrors. Applications can test whether arbitrary Unicode
! strings are supported as file names with \code{os.path.unicode_file_names}.

  \begin{seealso}
--- 317,339 ----

  On Windows NT, 2000, and XP, the system stores file names as Unicode
! strings. Traditionally, Python has represented file names as byte
! strings, which is inadequate because it renders some file names
  inaccessible.

! Python now allows using arbitrary Unicode strings (within the
! limitations of the file system) for all functions that expect file
! names, in particular the \function{open()} built-in. If a Unicode
! string is passed to \function{os.listdir}, Python now returns a list
! of Unicode strings.  A new function, \function{os.getcwdu()}, returns
! the current directory as a Unicode string.

! Byte strings still work as file names, and Python will transparently
! convert them to Unicode using the \code{mbcs} encoding.

! Other systems also allow Unicode strings as file names, but convert
! them to byte strings before passing them to the system which may cause
! a \exception{UnicodeError} to be raised. Applications can test whether
! arbitrary Unicode strings are supported as file names by checking
! \member{os.path.unicode_file_names}, a Boolean value.

  \begin{seealso}
***************
*** 494,515 ****

  When encoding a Unicode string into a byte string, unencodable
! characters may be encountered. So far, Python allowed to specify the
! error processing as either ``strict'' (raise \code{UnicodeError},
! default), ``ignore'' (skip the character), or ``replace'' (with
! question mark). It may be desirable to specify an alternative
! processing of the error, e.g. by inserting an XML character reference
! or HTML entity reference into the converted string.

  Python now has a flexible framework to add additional processing
! strategies; new error handlers can be added with
  \function{codecs.register_error}. Codecs then can access the error
! handler with \code{codecs.lookup_error}. An equivalent C API has been
! added for codecs written in C. The error handler gets various state
! information, such as the string being converted, the position in the
! string where the error was detected, and the target encoding. It can
! then either raise an exception, or return a replacement string.

  Two additional error handlers have been implemented using this
! framework: ``backslashreplace'' using Python backslash quoting to
  represent the unencodable character, and ``xmlcharrefreplace'' emits
  XML character references.
--- 495,518 ----

  When encoding a Unicode string into a byte string, unencodable
! characters may be encountered.  So far, Python has allowed specifying
! the error processing as either ``strict'' (raising
! \exception{UnicodeError}), ``ignore'' (skip the character), or
! ``replace'' (with question mark), defaulting to ``strict''. It may be
! desirable to specify an alternative processing of the error, e.g. by
! inserting an XML character reference or HTML entity reference into the
! converted string.

  Python now has a flexible framework to add additional processing
! strategies.  New error handlers can be added with
  \function{codecs.register_error}. Codecs then can access the error
! handler with \function{codecs.lookup_error}. An equivalent C API has
! been added for codecs written in C. The error handler gets the
! necessary state information, such as the string being converted, the
! position in the string where the error was detected, and the target
! encoding.  The handler can then either raise an exception, or return a
! replacement string.

  Two additional error handlers have been implemented using this
! framework: ``backslashreplace'' uses Python backslash quoting to
  represent the unencodable character, and ``xmlcharrefreplace'' emits
  XML character references.
***************
*** 518,522 ****

  \seepep{293}{Codec Error Handling Callbacks}{Written and implemented by 
! Walter Dörwald.}

  \end{seealso}
--- 521,525 ----

  \seepep{293}{Codec Error Handling Callbacks}{Written and implemented by 
! Walter D\"orwald.}

  \end{seealso}
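
Illustration only, not part of the checkin: the second hunk describes
registering error handlers with codecs.register_error and retrieving
them with codecs.lookup_error. A minimal sketch of that mechanism
follows. It assumes a current Python 3 interpreter rather than the 2.3
tree being patched, and the handler name "hexreplace" and the function
hex_replace are hypothetical names chosen for this example.

```python
import codecs


def hex_replace(exc):
    """Hypothetical handler: emit <U+XXXX> for each unencodable character."""
    # The handler receives the exception object, which carries the state
    # described in the text: the input string, the failing span, the encoding.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad)
    # Return the replacement text and the position where encoding resumes.
    return replacement, exc.end


# Make the handler available under a name usable as the `errors` argument.
codecs.register_error("hexreplace", hex_replace)

text = "D\u00f6rwald"

# The two handlers named in the diff.
builtin_backslash = text.encode("ascii", "backslashreplace")  # b'D\\xf6rwald'
builtin_xml = text.encode("ascii", "xmlcharrefreplace")       # b'D&#246;rwald'

# The hypothetical handler registered above.
custom = text.encode("ascii", "hexreplace")                   # b'D<U+00F6>rwald'

# A codec would fetch the callable by name like this.
looked_up = codecs.lookup_error("hexreplace")
```

The handler returns a replacement instead of raising, which is one of
the two outcomes the text allows; re-raising exc would make it behave
like "strict".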