ANN: pyparsing-1.3.3 released

Paul McGuire ptmcg at austin.rr.com
Tue Sep 13 06:56:13 CEST 2005


Pyparsing 1.3.3 contains mostly bugfixes and minor enhancements over
previous releases, including some improvement in Unicode support.  Here
are the change notes:

Version 1.3.3 - September 12, 2005
----------------------------------
- Improved support for Unicode strings that would be returned using
  srange.  Added greetingInKorean.py example, for a Korean version of
  "Hello, World!" using Unicode. (Thanks, June Kim!)

- Added 'hexnums' string constant (nums+"ABCDEFabcdef") for defining
  hexadecimal value expressions.

- NOTE: ===THIS CHANGE MAY BREAK EXISTING CODE===
  Modified tag and results definitions returned by makeHTMLTags(),
  to better support the looseness of HTML parsing.  Tags to be
  parsed are now caseless, and keys generated for tag attributes are
  now converted to lower case.

  Formerly, makeHTMLTags("XYZ") would return a tag with a results
  name of "startXYZ"; this has been changed to "startXyz".  If this
  tag is matched against '<XYZ Abc="1" DEF="2" ghi="3">', the
  matched keys formerly would be "Abc", "DEF", and "ghi"; keys are
  now converted to lower case, giving keys of "abc", "def", and
  "ghi".  These changes were made to try to address the lax
  agreement in case between start and end tags in many HTML
  pages.

  No changes were made to makeXMLTags(), which assumes more rigorous
  parsing rules.

  Also, cleaned up case-sensitivity bugs in closing tags, and
  switched to using Keyword instead of Literal class for tags.
  (Thanks, Steve Young, for getting me to look at these in more
  detail!)

- Added two helper parse actions, upcaseTokens and downcaseTokens,
  which will convert matched text to all uppercase or lowercase,
  respectively.

- Deprecated Upcase class, to be replaced by upcaseTokens parse
  action.

- Converted messages sent to stderr to use the warnings module.  One
  such message is issued when constructing a Literal with an empty
  string; use the Empty() class or the empty helper instead.

- Added ' ' (space) as an escapable character within a quoted
  string.

- Added helper expressions for common comment types, in addition
  to the existing cStyleComment (/*...*/) and htmlStyleComment
  (<!-- ... -->):
  . dblSlashComment = // ... (to end of line)
  . cppStyleComment = cStyleComment or dblSlashComment
  . javaStyleComment = cppStyleComment
  . pythonStyleComment = # ... (to end of line)
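
A few of the additions above can be tried out with a short sketch like
the one below.  It assumes a pyparsing installation that exposes the
names used in these notes (hexnums, makeHTMLTags, cppStyleComment,
pythonStyleComment) under the same spellings; the sample inputs and
variable names are made up for illustration and are not taken from the
pyparsing examples.

```python
from pyparsing import (
    Combine,
    Literal,
    Word,
    cppStyleComment,
    hexnums,
    makeHTMLTags,
    pythonStyleComment,
)

# hexnums: build a "0x..." hexadecimal literal expression.
hex_literal = Combine(Literal("0x") + Word(hexnums))
hex_value = hex_literal.parseString("0x1F")[0]

# makeHTMLTags: tag matching is caseless, and attribute keys come back
# in lower case regardless of how they were written in the page.
start_tag, end_tag = makeHTMLTags("XYZ")
attrs = start_tag.parseString('<xyz Abc="1" DEF="2" ghi="3">')
attr_values = (attrs["abc"], attrs["def"], attrs["ghi"])

# Comment helpers: pull comments out of C++-style and Python-style text.
cpp_source = "int x = 1; // set x\n/* block */ int y;"
cpp_comments = [toks[0] for toks in cppStyleComment.searchString(cpp_source)]
py_comments = [
    toks[0] for toks in pythonStyleComment.searchString("x = 1  # note")
]
```

Since javaStyleComment is defined as cppStyleComment, it should pick up
the same two comments from cpp_source.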



Download pyparsing at http://pyparsing.sourceforge.net.

-- Paul


========================================

Pyparsing is a pure-Python class library for quickly developing
recursive-descent parsers.  Parser grammars are assembled directly in
the calling Python code, using classes such as Literal, Word,
OneOrMore, Optional, etc., combined with operators '+', '|', and '^'
for And, MatchFirst, and Or.  No separate code-generation or external
files are required.  Pyparsing can be used in many cases in place of
regular expressions, with a shorter learning curve and greater
readability and maintainability.  Pyparsing comes with a number of
parsing examples, including:
- "Hello, World!" (English and Korean)
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- Mozilla calendar file parser
- EBNF parser/compiler
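
As a rough illustration of assembling a grammar directly in Python
code, here is a minimal sketch in the spirit of the "Hello, World!"
example.  The grammar and variable names are chosen for this
illustration and are not copied from the shipped example files.

```python
from pyparsing import OneOrMore, Optional, Word, alphas, nums

# '+' chains expressions into an And: word, comma, word, exclamation mark.
greeting = Word(alphas) + "," + Word(alphas) + "!"
greeting_tokens = greeting.parseString("Hello, World!").asList()

# '|' builds a MatchFirst: the first alternative that matches is used.
item = Word(nums) | Word(alphas)
items = OneOrMore(item) + Optional("!")
item_tokens = items.parseString("abc 123 def !").asList()
```

Here greeting_tokens comes back as ['Hello', ',', 'World', '!'], with
whitespace between tokens skipped by default.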


