URL 'special character' replacements

Tim N. van der Leeuw tim.leeuwvander at nl.unisys.com
Mon Jan 9 08:27:06 EST 2006


My outline for a solution would be:

- Use StringIO or cStringIO to read the original URL one character at a
time, and to build the result URL one character at a time

- When you read a '%', read the next 2 characters (they should be
hexadecimal digits!) and create a new string from them
- The numbers like '20' etc. are hexadecimal values, meaning integers
with base 16.
  Get the actual int-value like this:
  code_int = int(code_str, 16)
- Convert to character as: code_chr = chr(code_int)
- Write this character to the output cStringIO buffer
- When the whole URL is done, do getvalue() to get the string of the
new URL and close the cStringIO buffer.
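
The steps above could be sketched roughly as follows. This is only an
illustration, written for Python 3, so it uses io.StringIO where the
outline says cStringIO. The function name decode_percent_escapes and the
choice to copy a malformed '%' through unchanged are my own assumptions,
not part of the outline.

```python
from io import StringIO

HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_percent_escapes(url):
    """Replace each %XX escape in url with the character it encodes.

    Hypothetical helper: the name and the handling of malformed
    escapes are assumptions made for this sketch.
    """
    src = StringIO(url)   # input buffer, read one character at a time
    out = StringIO()      # output buffer, built one character at a time
    while True:
        ch = src.read(1)
        if ch == "":      # end of the URL
            break
        if ch != "%":
            out.write(ch)
            continue
        pos = src.tell()            # remember where the two digits start
        code_str = src.read(2)
        if len(code_str) == 2 and all(c in HEX_DIGITS for c in code_str):
            code_int = int(code_str, 16)   # '20' -> 32
            out.write(chr(code_int))       # 32 -> ' '
        else:
            # Not a valid escape: keep the '%' literally and re-read
            # whatever followed it as ordinary input.
            src.seek(pos)
            out.write("%")
    result = out.getvalue()
    src.close()
    out.close()
    return result


print(decode_percent_escapes("my%20file%2Bname.txt"))
```

Note that chr(code_int) maps each escape to a single character, so a
non-ASCII character that was encoded as several UTF-8 bytes (for
example '%C3%A9') will not come out right with this approach.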

Is that sufficiently comprehensible? Or still too convoluted for you?

(PS: I researched doing it the manual way, 'the hard way'. However,
there are plenty of libraries in Python for all sorts of internet
stuff. Perhaps urllib or urllib2 already has the functionality that you
need -- didn't look it up)
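
For what it is worth, the standard library does have this: in Python 2
the function is urllib.unquote, and in Python 3 it lives at
urllib.parse.unquote. A minimal sketch, written for Python 3, with a
made-up example URL:

```python
from urllib.parse import unquote

# The URL is only an example value, not one from this thread.
encoded = "http://example.com/my%20file%2Bname.txt"
decoded = unquote(encoded)
print(decoded)  # http://example.com/my file+name.txt
```

Unlike the manual approach, the Python 3 version decodes the escaped
bytes as UTF-8 by default, so multi-byte characters are handled too.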

cheers,

--Tim



