[XML-SIG] lxml 2.0alpha1 released

Gloria W strangest at comcast.net
Sun Sep 2 19:00:40 CEST 2007


Stefan, congratulations. This is definitely useful.
Please talk a bit about the API and how it differs from 
cElementTree, or link to some examples: for example, the node nesting and 
the use of a 'tail' for trailing text. I wonder whether lxml offers a more 
DOM-compliant node nesting, or whether it follows the 
conventions/oddities of ElementTree.
Also show us how it differs from BeautifulSoup, which has extremely 
robust Unicode handling and can complete mangled XML/HTML tags, but may 
benchmark a bit slower.
Thanks again, and good job!
Gloria
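
For readers unfamiliar with the 'tail' convention mentioned above, here is a
minimal sketch using the standard library's xml.etree.ElementTree, which
follows the ElementTree data model. The sample markup is made up for
illustration, and lxml.etree is only assumed (not confirmed in this thread)
to behave the same way for this simple case.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with mixed content.
root = ET.fromstring("<p>Hello <b>big</b> world</p>")
b = root.find("b")

# The ElementTree model has no separate text nodes:
# text before the first child is stored in parent.text, and
# text following an element's end tag is stored in that element's .tail.
head_text = root.text   # text before <b>
bold_text = b.text      # text inside <b>
tail_text = b.tail      # text after </b>, up to the next tag

print(repr(head_text), repr(bold_text), repr(tail_text))
```

A DOM would instead represent "Hello " and " world" as text-node children of
the p element, which is the difference the question is getting at.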



> Hi all,
>
> I'm proudly announcing the first alpha release of lxml 2.0.
>
> http://codespeak.net/lxml/dev/
> http://pypi.python.org/pypi/lxml/2.0alpha1
>
> ** What is lxml?
>
> """
> In short: lxml is the most feature-rich and easy-to-use library for working
> with XML and HTML in the Python language.
>
> lxml is a Pythonic binding for the libxml2 and libxslt libraries. It is unique
> in that it combines the speed and feature completeness of these libraries with
> the simplicity of a native Python API.
> """
>
> This release features a major cleanup both behind the scenes and at the
> surface, that improves the XML tool integration and makes the API clearer and
> more consistent in many places. The major new addition, however, is the
> lxml.html package, a new toolkit for HTML handling.
>
> The web site for the pre-2.0 series is online at
>
> http://codespeak.net/lxml/dev/
>
> The "what's new" page has a description of the major changes:
>
> http://codespeak.net/lxml/dev/lxml2.html
>
> and the ChangeLog has a more detailed list, see below.
>
> This being an alpha release means that not everything is stable, both in terms
> of crashes and the API. There will be a small number of alpha releases to make
> the advancements publicly available, before the beta releases focus on
> improving the stability.
>
>
> I warmly invite everyone to contribute to the final release by discussing the
> API changes and the new features on the mailing list. There is always space
> for improvements!
>
>
> There is currently a known problem with Microsoft's compilers, so Windows
> builds may not become available for 2.0alpha1. The next alpha will hopefully
> come with prebuilt binaries for that platform. Building with the more
> standards-compliant MinGW compilers should work.
>
> Note that working on the code now requires Cython (version 0.9.6.5), an
> enhanced fork of Pyrex.  lxml therefore no longer ships with a copy of Pyrex
> or Cython, but as usual, building from the distribution sources does not
> require Cython.  It can be installed with "easy_install Cython" or from here:
>
> http://www.cython.org/
>
> I hope that lxml 2.0 will become a straight continuation of the success story
> that lxml 1.x was already.
>
> Have fun,
> Stefan
>
>
> 2.0alpha1 (2007-09-02)
> Features added
>
>     * Reimplemented objectify.E for better performance and improved
>       integration with objectify. Provides extended type support based on
>       registered PyTypes.
>     * XSLT objects now support deep copying
>     * New makeSubElement() C-API function that allows creating a new
>       subelement straight with text, tail and attributes.
>     * XPath extension functions can now access the current context node
>       (context.context_node) and use a context dictionary
>       (context.eval_context) from the context provided in their first
>       parameter
>     * HTML tag soup parser based on BeautifulSoup in lxml.html.ElementSoup
>     * New module lxml.doctestcompare by Ian Bicking for writing simplified
>       doctests based on XML/HTML output. Use by importing lxml.usedoctest or
>       lxml.html.usedoctest from within a doctest.
>     * New module lxml.cssselect by Ian Bicking for selecting Elements with
>       CSS selectors.
>     * New package lxml.html written by Ian Bicking for advanced HTML
>       treatment.
>     * Namespace class setup is now local to the ElementNamespaceClassLookup
>       instance and no longer global.
>     * Schematron validation (incomplete in libxml2)
>     * Additional stringify argument to objectify.PyType() takes a conversion
>       function to strings to support setting text values from arbitrary types.
>     * Entity support through an Entity factory and element classes. XML
>       parsers now have a resolve_entities keyword argument that can be set to
>       False to keep entities in the document.
>     * column field on error log entries to accompany the line field
>     * Error specific messages in XPath parsing and evaluation
>       NOTE: for evaluation errors, you will now get an XPathEvalError instead
>       of an XPathSyntaxError. To catch both, you can except on XPathError.
>     * The regular expression functions in XPath now support passing a node-set
>       instead of a string
>     * Extended type annotation in objectify: new xsiannotate() function
>     * EXSLT RegExp support in standard XPath (not only XSLT)
>
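
The note above about XPathEvalError, XPathSyntaxError and XPathError could be
used roughly as sketched below. This assumes lxml is installed; safe_xpath is
a hypothetical helper name and the sample document is made up for
illustration.

```python
from lxml import etree

# Hypothetical sample document.
root = etree.fromstring("<root><a/></root>")


def safe_xpath(element, expression):
    """Return the XPath result, or None if the expression fails.

    Catching the common base class XPathError covers both syntax
    errors and evaluation errors.
    """
    try:
        return element.xpath(expression)
    except etree.XPathError:
        return None


good = safe_xpath(root, "//a")   # valid expression
bad = safe_xpath(root, "//a[")   # malformed predicate

print(good, bad)
```

Catching the base class keeps calling code independent of which of the two
more specific errors a given failure raises.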
> Bugs fixed
>
>     * lxml.etree did not check tag/attribute names
>     * The XML parser did not report undefined entities as an error
>     * The text in exceptions raised by XML parsers, validators and XPath
>       evaluators now reports the first error that occurred instead of the last
>     * Passing '' as XPath namespace prefix did not raise an error
>     * Thread safety in XPath evaluators
>
> Other changes
>
>     * objectify.PyType for None is now called "NoneType"
>     * el.getiterator() renamed to el.iter(), following ElementTree 1.3 -
>       original name is still available as alias
>     * In the public C-API, findOrBuildNodeNs() was replaced by the more
>       generic findOrBuildNodeNsPrefix()
>     * Major refactoring in XPath/XSLT extension function code
>     * Network access in parsers disabled by default
>
>
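
The el.getiterator() to el.iter() rename listed under "Other changes" can be
sketched as follows. The import fallback and the sample document are
assumptions for illustration; the standard library's ElementTree (1.3 and
later) also provides iter(), so the same code runs with or without lxml
installed.

```python
try:
    from lxml import etree  # used when lxml is installed
except ImportError:
    import xml.etree.ElementTree as etree  # standard library fallback

# Hypothetical sample document.
root = etree.fromstring("<root><a><b/></a><c/></root>")

# iter() walks the element itself and all descendants in document order.
all_tags = [el.tag for el in root.iter()]

# Passing a tag name restricts the iteration to matching elements.
b_tags = [el.tag for el in root.iter("b")]

print(all_tags, b_tags)
```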