[Edu-sig] CP4E (continued)
Kirby Urner
urnerk at qwest.net
Sat Apr 9 20:49:06 CEST 2005
Projected in front of class (teacher explaining her process for grabbing
cities and states, making a simple plaintext file for later reuse):
>>> import urllib2
>>> fo = urllib2.urlopen(
        "http://www.w3.org/2000/10/swap/test/dbork/data/USRegionState.daml")
>>> fo # <-- a file-like object
<addinfourl at 21988896 whose fp = <socket._fileobject object at
0x00CA1490>>
>>> allcapitals = []  # collects the raw '<capital ...' lines
>>> for i in fo: # grab the strings we'll need to parse
        if '<capital' in i:
            allcapitals.append(i)
>>> fo.close()
>>> def getcities(): # snip off the fat
        cities = []
        for city in allcapitals:
            st = city.find("#")    # name starts just after the '#'
            fn = city.find('"/>')  # and ends at the closing quote
            cities.append(city[st+1:fn])
        return cities
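>>> cities = getcities()
>>> def getcitystate(): # separate city and state
        citystate = []
        global cities  # the list assigned at the prompt above
        for e in cities:
            city = e[:-2]   # everything but the last two letters
            state = e[-2:]  # the two-letter state code
            citystate.append((city, state))
        return citystate
>>> cs = getcitystate()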
>>> cs
[('montgomery', 'al'), ('juneau', 'ak'), ('phoenix', 'az'), ('littlerock',
'ar'), ('sacramento', 'ca'), ('denver', 'co'), ('hartford', 'ct'),
('washington', 'dc'), ('dover', 'de'), ('tallahassee', 'fl'), ('atlanta',
'ga'), ('honolulu', 'hi'), ('boise', 'id'), ('springfield', 'il'),
('indianapolis', 'in'), ('desmoines', 'ia'), ('topeka', 'ks'), ('frankfort',
'ky'), ('batonrouge', 'la'), ('augusta', 'me'), ('annapolis', 'md'),
('boston', 'ma'), ('lansing', 'mi'), ('stpaul', 'mn'), ('jackson', 'ms'),
('jeffersoncity', 'mo'), ('helena', 'mt'), ('lincoln', 'ne'), ('carsoncity',
'nv'), ('concord', 'nh'), ('trenton', 'nj'), ('santafe', 'nm'), ('albany',
'ny'), ('raleighdurham', 'nc'), ('bismarck', 'nd'), ('columbus', 'oh'),
('oklahomacity', 'ok'), ('salem', 'or'), ('harrisburg', 'pa'),
('providence', 'ri'), ('columbia', 'sc'), ('pierre', 'sd'), ('nashville',
'tn'), ('austin', 'tx'), ('saltlakecity', 'ut'), ('montpelier', 'vt'),
('richmond', 'va'), ('olympia', 'wa'), ('charleston', 'wv'), ('madison',
'wi'), ('cheyenne', 'wy')]
>>>
etc. (You still have to hand-space 'jefferson city' etc. once you get your
plaintext written out; hey, we humans have a role to play, why not?)
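The write-out step itself isn't shown in the session above, so here is a minimal sketch of one way it might go. The file name, the one-pair-per-line "city,state" layout, and the helper names save_citystate and load_citystate are all assumptions for illustration, not what the teacher necessarily used. A three-item sample stands in for the full cs list.

```python
import os
import tempfile

def save_citystate(pairs, path):
    """Write (city, state) tuples to a text file, one 'city,state' per line."""
    with open(path, "w") as out:
        for city, state in pairs:
            out.write("%s,%s\n" % (city, state))

def load_citystate(path):
    """Read the file back into a list of (city, state) tuples."""
    pairs = []
    with open(path) as src:
        for line in src:
            line = line.strip()
            if line:  # skip blank lines
                city, state = line.split(",")
                pairs.append((city, state))
    return pairs

# A short sample standing in for the full cs list.
sample = [('montgomery', 'al'), ('juneau', 'ak'), ('jeffersoncity', 'mo')]

# Hypothetical file name, placed in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "capitals.txt")
save_citystate(sample, path)
roundtrip = load_citystate(path)
```

Because the comma is the only separator, hand-editing 'jeffersoncity' to 'jefferson city' in the file afterwards should still load cleanly.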
Actually, this whole exercise might be fun for the kids to mess with -- I'll
forward it to the police in Hillsboro as a lab activity (Red Hat 9 lab, West
Precinct).
We could show 'em XML parsing and regular expressions (alternative, more
sophisticated ways to suck strings) in later lessons.
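As a rough preview of the regular-expression route, here is a sketch that pulls city and state out in one pass. The sample lines are an assumption: the find() calls above only tell us that each line contains '<capital', then a '#', then the name, then '"/>', so the attribute name shown here is a guess and the real file may differ. The names pattern and parse_capitals are made up for this example.

```python
import re

# Assumed line shape (the attribute name is a guess, see above).
sample_lines = [
    '<capital rdf:resource="#montgomeryal"/>',
    '<capital rdf:resource="#juneauak"/>',
    '<state rdf:resource="#alabama"/>',  # not a capital line, should be skipped
]

# After '<capital', skip ahead to the '#', then capture the city letters
# and, separately, the final two letters (the state code) before '"/>'.
pattern = re.compile(r'<capital[^>]*#([a-z]+)([a-z]{2})"/>')

def parse_capitals(lines):
    """Return (city, state) tuples for every line the pattern matches."""
    pairs = []
    for line in lines:
        m = pattern.search(line)
        if m:
            pairs.append((m.group(1), m.group(2)))
    return pairs

result = parse_capitals(sample_lines)
```

This folds the filtering, the snipping, and the city/state split into a single pattern, at the cost of being harder to read than the find() and slice version.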
Kirby