[Python-checkins] python/nondist/peps pep-0305.txt,1.4,1.5

davecole@users.sourceforge.net davecole@users.sourceforge.net
Thu, 30 Jan 2003 04:11:31 -0800


Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1:/tmp/cvs-serv25349

Modified Files:
	pep-0305.txt 
Log Message:
Trying to bring PEP up to date with discussions on mailing list.  I hope I
have not misinterpreted the conclusions.
* dialect argument is now either a string that identifies one of the
  internally defined parameter sets or an object whose attributes
  correspond to the parameter set.
* Altered set_dialect() to take dialect name and dialect object.
* Altered get_dialect() to take dialect name and return dialect object.
* Fleshed out formatting parameters, adding escapechar, lineterminator,
  quoting.


Index: pep-0305.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0305.txt,v
retrieving revision 1.4
retrieving revision 1.5
diff -C2 -d -r1.4 -r1.5
*** pep-0305.txt	29 Jan 2003 14:09:45 -0000	1.4
--- pep-0305.txt	30 Jan 2003 12:11:27 -0000	1.5
***************
*** 106,122 ****
  
  Readers and writers support a dialect argument which is just a
! convenient (string) handle on a group of lower level parameters.
  Dialects will generally be named after applications or organizations
  which define specific sets of format constraints.  The initial dialect
! is "excel2000", which describes the format constraints of Excel 2000's
  CSV format.  Another possible dialect (used here only as an example)
  might be "gnumeric".
  
  Two functions are defined in the API to set and retrieve dialects::
  
!     set_dialect(dialect, pdict)
!     pdict = get_dialect(dialect)
  
! The pdict parameter is a dictionary whose keys are the names the
  formatting parameters defined in the next section.
  
--- 106,145 ----
  
  Readers and writers support a dialect argument which is just a
! convenient handle on a group of lower level parameters.
! 
! When dialect is a string it identifies one of the dialects known to
! the module; otherwise it is processed as a dialect class as described
! below.
!  
  Dialects will generally be named after applications or organizations
  which define specific sets of format constraints.  The initial dialect
! is excel2000, which describes the format constraints of Excel 2000's
  CSV format.  Another possible dialect (used here only as an example)
  might be "gnumeric".
  
+ Dialects are implemented as attribute-only classes so that users can
+ construct variant dialects by subclassing.  The excel2000 dialect is
+ implemented as follows::
+ 
+     class excel2000:
+         quotechar = '"'
+         delimiter = ','
+         escapechar = None
+         skipinitialspace = False
+         lineterminator = '\r\n'
+         quoting = 'minimal'
+ 
+ An Excel tab-separated dialect can then be defined in user code as
+ follows::
+ 
+     class exceltsv(csv.excel2000):
+         delimiter = '\t'
+ 
  Two functions are defined in the API to set and retrieve dialects::
  
!     set_dialect(name, dialect)
!     dialect = get_dialect(name)
  
! The dialect parameter is a class or instance whose attributes are the
  formatting parameters defined in the next section.
  
***************
*** 136,139 ****
--- 159,165 ----
    separator.  It defaults to ','.
  
+ - escapechar specifies a one-character string used to escape the
+   delimiter when quotechar is set to None.
+ 
  - skipinitialspace specifies how to interpret whitespace which
    immediately follows a delimiter.  It defaults to False, which means
***************
*** 143,146 ****
--- 169,183 ----
  - lineterminator specifies the character sequence which should
    terminate rows.
+ 
+ - quoting controls when quotes should be generated by the
+   writer.
+ 
+     "minimal" means only when required, for example, when a field
+     contains either the quotechar or the delimiter
+ 
+     "always" means that quotes are always placed around fields.
+ 
+     "nonnumeric" means that quotes are always placed around fields
+     which contain characters other than [+-0-9.].
  
  ... XXX More to come XXX ...
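
The dialect handling described in revision 1.5 can be sketched in a few
lines of plain Python.  This is only an illustration of the draft text
above, not the module's implementation: the registry dict _dialects and
the helpers resolve_dialect() and needs_quotes() are hypothetical names
introduced here, and the "minimal" rule covers only the two cases the
PEP text mentions (quotechar or delimiter present in the field).

```python
# Hypothetical sketch of the draft API in this revision of PEP 305.
# Only excel2000, exceltsv, set_dialect and get_dialect come from the
# PEP text; everything else is an assumption made for illustration.

class excel2000:
    quotechar = '"'
    delimiter = ','
    escapechar = None
    skipinitialspace = False
    lineterminator = '\r\n'
    quoting = 'minimal'


class exceltsv(excel2000):
    # Variant dialect built by subclassing, as the PEP suggests.
    delimiter = '\t'


# Assumed registry mapping dialect names to dialect classes/instances.
_dialects = {'excel2000': excel2000}


def set_dialect(name, dialect):
    """Associate a dialect object with a name."""
    _dialects[name] = dialect


def get_dialect(name):
    """Return the dialect object registered under name."""
    return _dialects[name]


def resolve_dialect(dialect):
    """Hypothetical helper: a string is looked up by name, anything
    else is used directly as the dialect object."""
    if isinstance(dialect, str):
        return get_dialect(dialect)
    return dialect


# Characters the PEP lists informally as [+-0-9.].
NUMERIC_CHARS = '+-0123456789.'


def needs_quotes(field, dialect):
    """Hypothetical helper: decide whether a writer would quote field."""
    d = resolve_dialect(dialect)
    if d.quoting == 'always':
        return True
    if d.quoting == 'nonnumeric':
        # Quote when the field contains any character outside the set.
        return any(ch not in NUMERIC_CHARS for ch in field)
    # 'minimal': only the two cases named in the PEP text.
    return d.quotechar in field or d.delimiter in field
```

Under these assumptions, needs_quotes('a,b', 'excel2000') and
needs_quotes('a,b', excel2000) give the same answer, because the string
form is resolved through get_dialect() before any attribute is read.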