[Python-checkins] python/dist/src/Lib/email Header.py,1.19,1.20

bwarsaw@users.sourceforge.net
Thu, 06 Mar 2003 08:10:34 -0800


Update of /cvsroot/python/python/dist/src/Lib/email
In directory sc8-pr-cvs1:/tmp/cvs-serv22423

Modified Files:
	Header.py 
Log Message:
__unicode__(): When converting to a unicode string, we need to
preserve the spaces at the boundaries between encoded and unencoded
words.  RFC 2047 is ambiguous here, but most people expect the space
to be preserved.  Really closes SF bug #640110.
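
The boundary rule described above can be sketched roughly as follows.
This is only an illustration in modern Python 3, not the Python 2 code
of the patch: join_chunks() and its helper are hypothetical names, the
chunks are assumed to be (bytes, charset-name) pairs, and the charset is
a plain string or None rather than a Charset instance.

```python
def join_chunks(chunks):
    """Join (bytes, charset) pairs into one str.

    A single space is inserted whenever we cross from an encoded chunk
    to an unencoded (None/us-ascii) chunk, or the other way round.
    Hypothetical helper for illustration; not part of Header.py.
    """
    def is_plain(charset):
        # None and us-ascii both count as "unencoded" text.
        return charset is None or charset.lower() == 'us-ascii'

    parts = []
    last_plain = None
    for raw, charset in chunks:
        plain = is_plain(charset)
        # Only the second and subsequent chunks can sit on a boundary.
        if parts and plain != last_plain:
            parts.append(' ')
        last_plain = plain
        parts.append(raw.decode(charset or 'us-ascii'))
    return ''.join(parts)


chunks = [
    (b'Hello', None),
    ('W\u00f6rld'.encode('iso-8859-1'), 'iso-8859-1'),
    (b'again', 'us-ascii'),
]
result = join_chunks(chunks)
```

With the sample chunks above, a space is added on each side of the
iso-8859-1 chunk, while two adjacent unencoded chunks would be joined
without one.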


Index: Header.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/email/Header.py,v
retrieving revision 1.19
retrieving revision 1.20
diff -C2 -d -r1.19 -r1.20
*** Header.py	6 Mar 2003 06:37:42 -0000	1.19
--- Header.py	6 Mar 2003 16:10:30 -0000	1.20
***************
*** 29,34 ****
--- 29,36 ----
  NL = '\n'
  SPACE = ' '
+ USPACE = u' '
  SPACE8 = ' ' * 8
  EMPTYSTRING = ''
+ UEMPTYSTRING = u''
  
  MAXLINELEN = 76
***************
*** 205,211 ****
      def __unicode__(self):
          """Helper for the built-in unicode function."""
!         # charset item is a Charset instance so we need to stringify it.
!         uchunks = [unicode(s, str(charset)) for s, charset in self._chunks]
!         return u''.join(uchunks)
  
      # Rich comparison operators for equality only.  BAW: does it make sense to
--- 207,228 ----
      def __unicode__(self):
          """Helper for the built-in unicode function."""
!         uchunks = []
!         lastcs = None
!         for s, charset in self._chunks:
!             # We must preserve spaces between encoded and non-encoded word
!             # boundaries, which means for us we need to add a space when we go
!             # from a charset to None/us-ascii, or from None/us-ascii to a
!             # charset.  Only do this for the second and subsequent chunks.
!             nextcs = charset
!             if uchunks:
!                 if lastcs is not None:
!                     if nextcs is None or nextcs == 'us-ascii':
!                         uchunks.append(USPACE)
!                         nextcs = None
!                 elif nextcs is not None and nextcs <> 'us-ascii':
!                     uchunks.append(USPACE)
!             lastcs = nextcs
!             uchunks.append(unicode(s, str(charset)))
!         return UEMPTYSTRING.join(uchunks)
  
      # Rich comparison operators for equality only.  BAW: does it make sense to