Changes to urlparse.py: urljoin

Ronald Hiller ron at graburn.com
Sun Feb 13 18:09:47 EST 2000


I've been having some problems with the urljoin function.  When I try
to join URLs whose '..' components would take the path above the
root, they aren't joined properly: the '..' is left in the result.

For example:

goof> python
Python 1.5.2 (#1, Oct 24 1999, 20:24:11)  [GCC 2.8.1] on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import urlparse
>>> urlparse.urljoin("http://www.xyz.com", "../x/y/z.gif")
'http://www.xyz.com/../x/y/z.gif'
>>>

# Now with the changes:
> python
Python 1.5.2 (#1, Oct 24 1999, 20:24:11)  [GCC 2.8.1] on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import urlparse
>>> urlparse.urljoin("http://www.xyz.com", "../x/y/z.gif")
'http://www.xyz.com/x/y/z.gif'
>>>
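
For anyone who wants to see the intended behaviour in isolation, here is a
minimal sketch that resolves a path one segment at a time and simply ignores
any '..' that would climb above the root.  The helper name resolve_segments
is hypothetical; it is not part of urlparse and is not the patch below, and
as a simplification it drops empty segments (so a trailing slash is lost).

```python
def resolve_segments(path):
    """Collapse '.' and '..' in a '/'-separated path, clamping at the root.

    Hypothetical helper for illustration only; not part of urlparse.
    """
    resolved = []
    for seg in path.split('/'):
        if seg == '..':
            # Go up one level, but never above the root.
            if resolved:
                resolved.pop()
        elif seg in ('', '.'):
            # Empty and '.' segments do not change the location.
            continue
        else:
            resolved.append(seg)
    return '/' + '/'.join(resolved)


print(resolve_segments('/../x/y/z.gif'))
```

With the path from the example above, '/../x/y/z.gif', the leading '..' has
nothing to remove and is discarded, so the rest of the path is kept as is.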

My patches for urlparse are included below...do they look reasonable?
What is the process for getting these into the "real" source tree?

Thanks,
Ron

*** orig/Lib/urlparse.py Thu Mar 18 10:10:44 1999
--- urlparse.py Sun Feb 13 16:51:36 2000
***************
*** 166,171 ****
--- 166,175 ----
                        i = i+1
                else:
                        break
+       while segments[0] == '':
+               del segments[0]
+       while segments[0] == '..':
+               del segments[0]
        if len(segments) == 2 and segments[1] == '..' and segments[0] == '':
                segments[-1] = ''
        elif len(segments) >= 2 and segments[-1] == '..':
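
One caveat about the two added loops: they index segments[0] without
checking that the list still has entries, so a list made up only of '' or
'..' entries would end in an IndexError.  A guarded variant might look like
the sketch below.  The function name strip_leading is hypothetical and the
code is written as a standalone helper, not as a drop-in change to
urlparse.py.

```python
def strip_leading(segments):
    """Drop leading '' entries, then leading '..' entries, from a segment list.

    Mirrors the order of the two loops in the patch, but stops when the
    list becomes empty.  Hypothetical helper; works on a copy of the input.
    """
    segments = list(segments)
    while segments and segments[0] == '':
        del segments[0]
    while segments and segments[0] == '..':
        del segments[0]
    return segments


print(strip_leading(['', '..', 'x', 'y', 'z.gif']))
print(strip_leading(['', '']))
```

The second call is the case that would fail without the guard: both entries
are removed and the first loop has nothing left to index.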





