[XML-SIG] prepare_input_source and relative path

Mike Brown mike at skew.org
Wed Feb 9 03:01:35 CET 2005


Sylvain Thénault wrote:
> I guess you're right. I wrote this patch because it was fixing my
> problem. Now if it doesn't take too much time to have every cases
> correctly fixed by implementing RFC 3986, I may take some time to do so
> or to help having it done. And if parts of the job is already done in
> 4suite, that's great. However what's in 4suite, what's not and need to
> be implemented is not yet clear to me.

The current version of Ft.Lib.Uri is here:
http://cvs.4suite.org/viewcvs/4Suite/Ft/Lib/Uri.py?view=markup [1]

If you see "rfc2396bis" in the doc strings, you may safely interpret
them to mean "RFC 3986".


The functions that you should look at are the following:

MakeUrllibSafe(uriRef)
======================
This exists in order to convert a proper URI reference into one that
can be handled by urllib.urlopen(). It does the following:
1. If the reference contains an Internationalized Domain Name,
   recodes it so that it is resolvable. (Py 2.3+ only)
2. Strips the fragment component, if any. 
3. Ensures that the reference is a byte string, not unicode.
4. On Windows, assumes that the first ':' appearing in the path
   component is part of a drivespec, and converts it to '|'.

If you port this function, the reference to PercentDecode() may be replaced 
with urllib.unquote(), but you must move the byte string check (#3, above) to 
occur before calling unquote. The references to the functions SplitUriRef and 
UnsplitUriRef can be replaced with urlsplit() and urlunsplit() from the 
urlparse module.
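
For illustration only, here is a rough sketch of steps 1, 2 and 4 above. It is not the Ft.Lib.Uri code. It assumes a modern Python 3, where the urlparse functions live in urllib.parse and strings are already text (so step 3 has no direct equivalent and is skipped). The name make_urllib_safe and the on_windows parameter are made up for this example.

```python
import sys
from urllib.parse import urlsplit, urlunsplit


def make_urllib_safe(uri_ref, on_windows=sys.platform == "win32"):
    """Hypothetical port of the listed steps; not the Ft.Lib.Uri code."""
    scheme, netloc, path, query, _fragment = urlsplit(uri_ref)

    # Step 1: IDNA-encode a non-ASCII host, leaving userinfo and port alone.
    userinfo, at, hostport = netloc.rpartition("@")
    host, colon, port = hostport.partition(":")
    if not host.isascii():
        host = host.encode("idna").decode("ascii")
    netloc = userinfo + at + host + colon + port

    # Step 4: on Windows, treat the first ':' in the path as a drivespec.
    if on_windows:
        path = path.replace(":", "|", 1)

    # Step 2: passing an empty fragment drops it from the result.
    return urlunsplit((scheme, netloc, path, query, ""))
```

Calling make_urllib_safe("http://example.org/a#frag", on_windows=False) should give back the same reference without the "#frag" part.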


Absolutize(uriRef, baseUri)
===========================
This does strict merging of a URI reference and a base URI. The base URI 
*must* be absolute (must have a scheme). If you port this function, the
UriException may be replaced with a ValueError, and SplitUriRef &
UnsplitUriRef may be replaced with their urlparse equivalents, as
mentioned above. The RemoveDotSegments function must also be ported and
should be made semi-private because it is not for general use. I've
implemented it using two segment stacks, as alluded to in the spec,
rather than the explicit string-walking algorithm, which would be too
inefficient.


BaseJoin(base, uriRef)
======================
This does lenient merging of a base URI and a URI reference (note that the
argument order is the reverse of Absolutize's). It allows the base
URI to be a relative reference. In such cases, we use a dummy scheme
(we don't say "assume 'file:'" because the spec says all schemes must be
resolved the same way), run it through Absolutize, and then remove the
scheme from the result. If you port this function, you will need to port the
IsAbsolute function, which just checks to see if the URI has a scheme.
I prefer to use a regex for this, as it is fast and accurate (':' can
appear in more than one place in a URI reference, so it is not safe to
assume that its presence means there is a scheme).
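
A regex-based scheme check could look something like the following. This is a sketch rather than the actual Ft.Lib.Uri code: the name is_absolute is made up, and the pattern is just the scheme production from RFC 3986 (a letter, then letters, digits, "+", "-" or "."), anchored at the start and followed by ":".

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), then ":"
_SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.\-]*:")


def is_absolute(uri_ref):
    """Return True if uri_ref starts with a scheme (hypothetical helper)."""
    return _SCHEME_RE.match(uri_ref) is not None
```

A reference such as "foo/bar:baz" or "//host:8080/x" contains a ':' but does not start with a scheme, so is_absolute() returns False for both.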


-Mike

  [1] ...well, not really. The current version is on my hard drive :)


