[wwwsearch-general] ClientForm request re ParseErrors

bruce bedouglas at earthlink.net
Sun Jul 9 21:34:21 EDT 2006


update.....

out of curiosity, i fetched the latest mechanize from svn.. i get the same
error with the parse...

i've also tried to do:
 br.select_form(nr = 1)
 br.select_form(name="foo")
 br.select_form(name=foo)
 br.select_form(name="foo")

etc.... same err occurs...

-bruce



hi john...

not sure exactly who i should talk to about this.. but here goes...

i have the following piece of code... i'm trying to do a select form, and my
test throws an error...

i have the actual form "main" in the html, so it should find it... as far as
i can tell, i've followed the docs.. but i could be wrong. any thoughts?

the code, output, and partial html is below...

thoughts/comments/ideas/etc...

thanks

-bruce

test code
------------------------
  #get the semester page
  #get the 2nd semester/frame src url page
  br.open(url)
  response = br.response()  # this is a copy of response
  s = response.read()
  print response.read()
  print s
  #we now have the semester page...
  d = libxml2dom.parseString(s, html=1)
  ff = d.xpath(fnamepath)
  fname = ff[0].nodeValue
  print "fname = ",fname
  br.select_form(name="main")  # <<< error happens here
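
A side note on the two read() calls in the test code: assuming the response behaves like an ordinary file-like object, the second response.read() returns an empty string, because the first call already consumed the stream. The sketch below uses io.BytesIO as a stand-in for the response (it is not mechanize itself, and the sample bytes are made up) to show the effect and how rewinding with seek(0) brings the data back.

```python
import io

# Stand-in for a response object; the payload is invented for illustration.
stream = io.BytesIO(b"<html>...</html>")

first = stream.read()    # consumes the whole body
second = stream.read()   # nothing left to read, so this is empty

stream.seek(0)           # rewind to the start
third = stream.read()    # the full body is available again

print(repr(first))
print(repr(second))
print(repr(third))
```

This is unrelated to the parse error, but it explains why the first print in the test code shows nothing.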


output
------------------------
fname =  main
Traceback (most recent call last):
  File "./stest.py", line 156, in ?
    br.select_form(name="main")
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 352, in select_form
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 296, in forms
  File "build/bdist.linux-i686/egg/mechanize/_html.py", line 510, in forms
  File "build/bdist.linux-i686/egg/mechanize/_html.py", line 226, in forms
  File "build/bdist.linux-i686/egg/ClientForm.py", line 922, in ParseResponse
  File "build/bdist.linux-i686/egg/ClientForm.py", line 952, in ParseFile
  File "/usr/lib/python2.4/sgmllib.py", line 95, in feed
    self.goahead(0)
  File "/usr/lib/python2.4/sgmllib.py", line 165, in goahead
    k = self.parse_declaration(i)
  File "/usr/lib/python2.4/markupbase.py", line 89, in parse_declaration
    decltype, j = self._scan_name(j, i)
  File "/usr/lib/python2.4/markupbase.py", line 378, in _scan_name
    self.error("expected name token at %r"
  File "/usr/lib/python2.4/sgmllib.py", line 102, in error
    raise SGMLParseError(message)
sgmllib.SGMLParseError: expected name token at '<! Others/0/WIN; Too'
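
The traceback shows the parser giving up on a "<!" that is followed by a space instead of a declaration name, so the failure comes from the page markup and not from the select_form() arguments. One possible workaround, sketched here under assumptions, is to strip such malformed declarations from the HTML before it is parsed. The helper name strip_bogus_declarations and the sample markup are hypothetical; the real page content after 'Too' is not known, so the tail of the bogus declaration below is invented.

```python
import re

# Matches "<!" followed by whitespace, then everything up to the next ">".
# Comments ("<!--"), doctypes ("<!DOCTYPE") and marked sections ("<![")
# have no whitespace right after "<!", so they are left untouched.
_BOGUS_DECL = re.compile(r"<!\s[^>]*>")


def strip_bogus_declarations(html):
    """Return html with malformed '<! ...>' declarations removed (hypothetical helper)."""
    return _BOGUS_DECL.sub("", html)


# Made-up sample; only the "<! Others/0/WIN;" prefix comes from the traceback.
sample = (
    "<!DOCTYPE html>\n"
    "<!-- a real comment -->\n"
    "<! Others/0/WIN; hypothetical tail >\n"
    "<FORM NAME='main' METHOD=POST></FORM>\n"
)

cleaned = strip_bogus_declarations(sample)
print(cleaned)
```

If the installed mechanize provides response.set_data() and br.set_response(), the cleaned text could presumably be handed back to the browser before calling select_form(); that step is an assumption and is not shown here.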


partial html
-----------------------------------
</table>
<br />
<FORM NAME='main' METHOD=POST
Action="/servlets/iclientservlet/a2k_prd/?ICType=Panel&Menu=SA_LEARNER_SERVICES&Market=GBL&PanelGroupName=CLASS_SEARCH"  autocomplete=off>
<INPUT TYPE=hidden NAME=ICType VALUE=Panel>
<INPUT TYPE=hidden NAME=ICElementNum VALUE="0">
<INPUT TYPE=hidden NAME=ICStateNum VALUE="1">







-----Original Message-----
From: wwwsearch-general-bounces at lists.sourceforge.net
[mailto:wwwsearch-general-bounces at lists.sourceforge.net]On Behalf Of
John J Lee
Sent: Sunday, July 09, 2006 9:51 AM
To: wwwsearch-general at lists.sf.net
Subject: Re: [wwwsearch-general] ClientForm request re ParseErrors


On Sun, 9 Jul 2006, Titus Brown wrote:
[...]
> Define "better patch"...?  The code I sent out before lets ClientForm
> parse otherwise unparseable HTML, and it works fine.  I suppose it's
> less elegant than having two separate while loops; is that what you
> mean?

No, I just hate going one char at a time in Python.  Surely this should be
fixed somewhere else?  (I'm not sure where; I haven't looked recently)

If you've determined that fixing it elsewhere pulls in too much code or
requires a fix to stdlib code (if so, why?), maybe I should do as you
suggest anyway, but I don't like it.


> -> > The problem I have is that there's literally no way to pass
> -> > configuration parameters like 'ignore_errors' down from the
> -> > mechanize.Factory.forms() call.
> ->
> -> You can reimplement FormsFactory.  It's a trivial (if slightly verbose)
> -> class, right?
>
> I could do that, yes.  But I'd also need to redefine Factory.forms(),
> too, which calls FormsFactory.

Why?  You can supply your own FormsFactory, as DefaultFactory does.

[...]
> -> > Separately, it'd be nice if ignore_errors wasn't hardcoded as False in
> -> > ParseFile ;).
> ->
> -> I'm not sure what you want here.  Could you send a patch?
>
> Line 914 of ClientForm.py should be changed to 'ignore_errors,'

Oh.  Sure, if I apply a patch to enable ignore_errors, I'll of course do
that too.


John


