urllib question. (assembling an HTTP POST request)
Omri Schwarz
ocscwar at h-after-ocsc.mit.edu
Tue Mar 26 15:08:50 EST 2002
Hi, all,
I'm trying to parse an HTML form in Python,
assemble a POST request from it, and send it off.
Here is what my script currently does:
# a regex to gather the inputs:
findinput = re.compile('\<input type="(.*?)" *name="(.*?)" *value="(.*?)" *\>|\<input *type="(.*?)" *name="(.*?)" *(.*?)\>')

# then later on:
c = urllib.urlopen(b).read()
subject = findsubject.search(c)
if subject:
    print "Is this spam?\n"
    print subject.groups()[0]
    if raw_input()[0] == 'y':
        # here's the part that may be going wrong:
        for cb in findinput.findall(c):
            print cb
            if cb[3] == 'checkbox':
                if cb[5] == 'checked':
                    poststring[cb[4]] = 'on'
            if cb[0] == 'hidden':
                poststring[cb[1]] = cb[2]
        d = urllib.urlencode(poststring)
        print d
        res = urllib.urlopen('http://www.spamcop.net/sc', d).read()
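To make the tuple layout concrete, here is a small sketch of what findall() hands back for each alternative of the regex and which entries the loop then keeps. It is written with Python 3 syntax so it can be run on its own. The sample markup is invented for illustration and is not copied from the real form, so it only shows how the pattern behaves, not what the server actually receives.

```python
import re

# Same pattern as in the script, written as a raw string and without the
# redundant backslashes before < and > (they match the same characters).
findinput = re.compile(
    r'<input type="(.*?)" *name="(.*?)" *value="(.*?)" *>'
    r'|<input *type="(.*?)" *name="(.*?)" *(.*?)>'
)

# Invented sample markup (an assumption), one tag per line.
sample = "\n".join([
    '<input type="hidden" name="id" value="42">',
    '<input type="checkbox" name="spam1" checked>',
    '<input type="checkbox" name="spam2" value="yes" checked>',
    '<input name="token" type="hidden" value="abc">',
])

matches = findinput.findall(sample)

# Same selection logic as the loop in the script.
poststring = {}
for cb in matches:
    print(cb)  # groups 0-2 come from the first alternative, 3-5 from the second
    if cb[3] == 'checkbox' and cb[5] == 'checked':
        poststring[cb[4]] = 'on'
    if cb[0] == 'hidden':
        poststring[cb[1]] = cb[2]

print(poststring)
```

With this sample, only "id" and "spam1" end up in the dictionary. The checkbox that carries a value attribute puts the whole attribute tail into cb[5], so the comparison with 'checked' fails, and the hidden input whose name comes before its type is not matched at all.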
The form has hidden inputs, checkboxes,
and a submit button.
When I submit it from a browser (Netscape or Lynx),
everything is fine. When I use this
script, I get server-side errors that
tell me nothing useful.
So, has anyone already written a form parser
for this purpose? Can anyone think of other problems
that might be happening?
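For comparison, the kind of form parser I have in mind could look roughly like the sketch below. It uses the standard library's HTML parser instead of a regex, so attribute order and quoting stop mattering. It is a minimal sketch under several assumptions: it uses the Python 3 module names (html.parser, urllib.parse) rather than the urllib calls in my script, the names InputCollector and form_to_post_body are made up for this example, and it only looks at input tags (no select or textarea).

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class InputCollector(HTMLParser):
    """Collect the <input> fields a browser would normally submit."""

    def __init__(self):
        super().__init__()
        self.fields = []    # (name, value) pairs that are always sent
        self.submits = []   # submit buttons; only the pressed one is sent

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        a = dict(attrs)  # attribute names arrive lowercased
        name = a.get("name")
        if not name:
            return  # controls without a name are not submitted
        kind = (a.get("type") or "text").lower()
        value = a.get("value")
        if kind in ("checkbox", "radio"):
            # Only checked boxes are sent; "on" is the default value.
            if "checked" in a:
                self.fields.append((name, "on" if value is None else value))
        elif kind == "submit":
            self.submits.append((name, value or ""))
        elif kind in ("hidden", "text"):
            self.fields.append((name, value or ""))


def form_to_post_body(html, submit_name=None):
    """Return a urlencoded body; submit_name picks the 'pressed' button."""
    parser = InputCollector()
    parser.feed(html)
    parser.close()
    fields = list(parser.fields)
    for name, value in parser.submits:
        if name == submit_name:
            fields.append((name, value))
    return urlencode(fields)
```

Calling form_to_post_body(page, submit_name="action") would also include a submit button named "action", which a browser sends when that button is pressed and which my script currently leaves out. The button name here is hypothetical.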
Thanks, in advance.
--
Omri Schwarz --- ocscwar at mit.edu ('h' before war)
Timeless wisdom of biomedical engineering: "Noise is principally
due to the presence of the patient." -- R.F. Farr