Rookie Speaks
Peter Otten
__peter__ at web.de
Thu Jan 8 04:58:48 EST 2004
William S. Perrin wrote:
I think your function has a sane design :-) XML is slow by design, but in
your case it doesn't really matter, because it is probably I/O-bound, as
already pointed out by Samuel Walters.
Below is a slightly different approach that uses a class:
import urllib
from xml.dom import minidom

class Weather(object):
    def __init__(self, url=None, xml=None):
        """ Will accept either a URL or an XML string,
            preferably as a keyword argument """
        if url:
            if xml:
                # not sure what would be the right exception here
                # (ValueError?), so keep it generic for now
                raise Exception("Must provide either url or xml, not both")
            sock = urllib.urlopen(url)
            try:
                xml = sock.read()
            finally:
                sock.close()
        elif xml is None:
            raise Exception("Must provide either url or xml")
        self._dom = minidom.parseString(xml)

    def getAttrFromDom(self, weatherAttribute):
        a = self._dom.getElementsByTagName(weatherAttribute)
        return a[0].firstChild.data

    def asRow(self):
        # this will defeat lazy attribute lookup
        return "%13s\t%s\t%s\t%s\t%s\t%s\t%s" % (self.name,
            self.fahrenheit, self.wind, self.barometric_pressure,
            self.dewpoint, self.relative_humidity, self.conditions)

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        try:
            value = self.getAttrFromDom(name)
        except IndexError:
            raise AttributeError(
                "'%.50s' object has no attribute '%.400s'" %
                (self.__class__.__name__, name))
        # now set the attribute so it need not be looked up
        # in the dom next time
        setattr(self, name, value)
        return value
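On the exception question raised in the comments above: ValueError is one plausible choice when two arguments are mutually exclusive (TypeError is another you will see). A minimal sketch in Python 3 syntax; pick_source is a hypothetical helper name used only for illustration and is not part of the class above:

```python
def pick_source(url=None, xml=None):
    """Return ("url", url) or ("xml", xml); exactly one must be given.

    Hypothetical helper: it only demonstrates the argument check,
    it does not fetch or parse anything.
    """
    if url is not None and xml is not None:
        raise ValueError("Must provide either url or xml, not both")
    if url is None and xml is None:
        raise ValueError("Must provide either url or xml")
    if url is not None:
        return ("url", url)
    return ("xml", xml)


kind, payload = pick_source(xml="<weather/>")
```

Comparing against None rather than testing truthiness means an empty string counts as "given", which may or may not be what you want.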
This has a slight advantage if you are interested only in a subset of the
attributes, say the temperature:
for url in listOfUrls:
    print Weather(url).fahrenheit
Here getAttrFromDom() - the equivalent of your getattrs() - is only called
once per URL. You can still print a tab-delimited row,
print Weather(url).asRow()
but that will of course defeat this optimization scheme, since asRow()
looks up every attribute it formats.
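To make the difference visible, here is a rough sketch in Python 3 syntax that counts DOM lookups. LazyWeather, SAMPLE_XML and the lookups counter are assumptions made up for this illustration; the sample document only borrows three of the tag names used above and skips the URL handling entirely.

```python
from xml.dom import minidom

# Hypothetical sample document, not real weather data.
SAMPLE_XML = (
    "<weather>"
    "<name>Chicago</name>"
    "<fahrenheit>28</fahrenheit>"
    "<wind>NW 10</wind>"
    "</weather>"
)


class LazyWeather(object):
    """Cut-down stand-in for the Weather class that counts DOM lookups."""

    def __init__(self, xml):
        self._dom = minidom.parseString(xml)
        self.lookups = 0

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            # Avoid recursing if _dom itself is missing.
            raise AttributeError(name)
        self.lookups += 1
        nodes = self._dom.getElementsByTagName(name)
        if not nodes:
            raise AttributeError(name)
        value = nodes[0].firstChild.data
        # Cache on the instance so later reads bypass __getattr__.
        setattr(self, name, value)
        return value


w = LazyWeather(SAMPLE_XML)
temperature = w.fahrenheit      # first read goes to the DOM
temperature_again = w.fahrenheit  # second read comes from the instance
lookups_for_one_field = w.lookups

# Building a row touches every field, so the remaining ones are looked up too.
row = "\t".join([w.name, w.fahrenheit, w.wind])
lookups_after_row = w.lookups
```

Reading a single field costs one lookup no matter how often it is read; building the row pulls in the other fields as well.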
Peter