[XML-SIG] prepare_input_source and relative path

Mike Brown mike at skew.org
Thu Feb 10 00:06:31 CET 2005


Sylvain Thénault wrote:
> thanks a lot. Actually almost all the work is already done right there. 
> Here is what I've worked on. Once we reach a consensus, I'll add that
> to pyxml. So I've attached to this mail:
> 
> - a light version of 4Suite Uri.py including the following functions:
>   SplitUriRef, UnsplitUriRef (it was really less annoying to use those
>   two functions than the equivalent urllib's ones), Absolutize,
>   MakeUrllibSafe, _RemoveDotSegments, BaseJoin, GetScheme and
>   IsAbsolute. With the presented solution, the 3 last ones are not used
>   and could be removed, but I've kept them in for now. 

Doc strings will need to be updated to reflect the promotion from
"rfc2396bis" to RFC 3986. Also there's one place where I have "RFC
(newline)2396bis" which should also be fixed.

In MakeUrllibSafe, you should catch the UnicodeError that could result
from the attempt to force unicode to a byte string:

    if isinstance(uri, unicode):
        try:
            uri = uri.encode('us-ascii')
        except UnicodeError:
            raise ValueError("uri %r must consist of ASCII characters." % uri)
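
On a Python where str is already a Unicode type, the same guard could be sketched roughly as follows. The name ensure_ascii_uri is hypothetical, chosen only for illustration; it is not the 4Suite MakeUrllibSafe function, and returning bytes is an assumption about what the caller wants.

```python
def ensure_ascii_uri(uri):
    """Return uri as ASCII bytes; raise ValueError on non-ASCII characters.

    Hypothetical helper for illustration only, not part of 4Suite or PyXML.
    """
    if isinstance(uri, str):
        try:
            # Encoding fails with a UnicodeError subclass on non-ASCII input.
            return uri.encode('us-ascii')
        except UnicodeError:
            # Report the problem as a ValueError about the argument instead.
            raise ValueError("uri %r must consist of ASCII characters." % uri)
    # Anything that is not str (for example bytes) is passed through untouched.
    return uri
```

The point is only that the caller sees a ValueError about the argument rather than an encoding error from deep inside the function.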

> All the tests for Absolutize from 4Suite are still passing.

I forgot to point you to my tests. They do not use unittest, so they
would need to be adapted, but it would be easy since the comparisons
are string-in to string-out (or exception), and I've labeled them
pretty clearly:

  http://cvs.4suite.org/viewcvs/4Suite/test/Lib/test_uri.py?view=markup

As you will see, they are fairly comprehensive.
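
Adapting string-in to string-out comparisons to unittest could look roughly like the sketch below. It is not taken from test_uri.py: the cases are a handful of the resolution examples from RFC 3986 section 5.4.1, and urllib.parse.urljoin from the Python 3 standard library is used as a stand-in for the function under test, so that the sketch runs on its own.

```python
import unittest
from urllib.parse import urljoin

# Base URI used by the resolution examples in RFC 3986 section 5.4.1.
BASE = 'http://a/b/c/d;p?q'

# (label, relative reference, expected absolute URI)
CASES = [
    ('simple name', 'g', 'http://a/b/c/g'),
    ('parent directory', '../g', 'http://a/b/g'),
    ('name with query', 'g?y', 'http://a/b/c/g?y'),
    ('fragment only', '#s', 'http://a/b/c/d;p?q#s'),
    ('current directory', '.', 'http://a/b/c/'),
    ('parent directory, no name', '..', 'http://a/b/'),
]


class ResolutionTests(unittest.TestCase):
    def test_rfc3986_examples(self):
        for label, ref, expected in CASES:
            # subTest reports each labeled case separately on failure.
            with self.subTest(label=label):
                # Stand-in call: substitute the function under test here,
                # minding its argument order.
                self.assertEqual(urljoin(BASE, ref), expected)
```

Saved as a module, this can be run with "python -m unittest"; cases that expect an exception would use assertRaises instead of assertEqual.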

> - a modified version of saxutils, expecting the Uri module above to be
>   in the _xmlplus directory (ie importable as xml.Uri). I've refactored
>   prepare_input_source to ease testing of the URI merging stuff.

You might want to grep for "emacspymodestink" in your code. :)

> - a unittest file, which includes some test cases for the URI merging
>   function. Please take a look at the existing test cases to check
>   everything looks fine to you. If you have other cases to add, please let
>   me know (or maybe I can add this file to the cvs first). Notice that
>   to run the tests, you should have a "quotes.xml" file in the same
>   directory as the test file (there is one in the test directory of
>   pyxml). As a bonus, I've converted the escape function test from
>   test_utils into a unittest in the same file.
> 
> Anyway, having SplitUriRef/UnsplitUriRef replacing 
> urlparse.urlsplit/urlunsplit and Absolutize or BaseJoin replacing
> urlparse.urljoin would definitely be the right thing.

On python-dev in Sep 2004, I was discussing with Martin v. Löwis what 
principles we think should be embraced by urlparse, urllib and urllib2. He 
feels that we should simultaneously shoot for both URI and IRI support 
according to the RFCs (3986 and 3987), with unicode arguments being assumed to 
be IRIs.
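
To make the distinction concrete, here is a minimal sketch of the core IRI-to-URI mapping in RFC 3987 section 3.1: non-ASCII characters are encoded as UTF-8 and each resulting byte is percent-encoded. The function name iri_to_uri is hypothetical, and the sketch deliberately ignores the option of converting an internationalized host name with IDNA; it is not a proposal for the stdlib API.

```python
from urllib.parse import quote

# Left untouched: the RFC 3986 reserved characters, plus '%' so that
# existing percent-escapes are not escaped a second time, plus '~'.
_SAFE = "/:?#[]@!$&'()*+,;=%~"


def iri_to_uri(iri):
    """Percent-encode the non-ASCII characters of iri as UTF-8 bytes.

    Hypothetical helper for illustration; no IDNA handling of the host.
    """
    # quote() encodes str input as UTF-8 by default before escaping.
    return quote(iri, safe=_SAFE)
```

An argument that is already a plain ASCII URI passes through unchanged, which is what makes it workable to treat every unicode argument as an IRI.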

I would hold off on any stdlib changes until the APIs can be discussed in 
more detail.

