MSIE6 Python Question

Ralph A. Gable r.gable at mchsi.com
Thu Jun 10 17:41:44 EDT 2004


All you guys have given me some super info. I have this problem under control
(with your help). I really appreciate it. Keep up the good work.
Ralph A. Gable

"Andy Baker" <andy at andybak.net> wrote in message news:<mailman.219.1085403362.6949.python-list at python.org>...
> The site might be checking your user-agent string. Urllib must allow you to
> choose which browser it identifies itself as. Simply match the user-agent of
> a known version of IE and see if that works.
> 
> > -----Original Message-----
> > From: python-list-bounces+andy=andybak.net at python.org 
> > [mailto:python-list-bounces+andy=andybak.net at python.org] On 
> > Behalf Of Ralph A. Gable
> > Sent: 24 May 2004 12:25
> > To: python-list at python.org
> > Subject: Re: MSIE6 Python Question
> > 
> > "Kevin T. Ryan" <kevryan0701 at yahoo.com> wrote in message 
> > news:<40b1697d$0$3131$61fed72c at news.rcn.com>...
> > > Ralph A. Gable wrote:
> > > 
> > > > > I'm a newbie at this but I need to control MSIE6 using Python. I
> > > > > have read the O'Reilly win32 python books and got some hints. But I
> > > > > need to Navigate to a site (which I know how to do) and then I need
> > > > > to get at the source code for that site inside Python (as when one
> > > > > used the View|Source drop down window). Can anyone point me to some
> > > > > URLs that would help out? Or just tell me how to do it? I would be
> > > > > very grateful.
> > > 
> > > > I'm not sure why you need to go through IE, but maybe this will point
> > > > you in the right direction:
> > > 
> > > >>> import urllib
> > > >>> f = urllib.urlopen('http://www.python.org')
> > > >>> f.readline()
> > > > '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
> > > >>> f.readline()
> > > > '                      "http://www.w3.org/TR/html4/loose.dtd" >\n'
> > > >>>
> > > 
> > > You could do:
> > > for line in f:
> > >         process(line)
> > > 
> > > > just like you can with a file.  Check the urllib, urllib2, and other
> > > > related modules (maybe httplib).  Hope that helps.
> > 
> > 
> > Sorry. I forgot to mention that I have tried that. The data I 
> > want is being stripped out when I access the URL via urllib. 
> > I CAN see the data when I go into IE and do view source but 
> > when I use urllib the site intentionally blanks out the 
> > information I want. For that reason, I would like to get it 
> > using IE6 if I can. If there are other ways to fake out the 
> > site, I would be interested in that also. I thought that 
> > perhaps the site was detecting the fact that I was not 
> > querying it using a browser. I tried putting that into 
> > the HTTP messages but may not have done it right. At any rate 
> > couldn't get that to work. It may be that the site is using 
> > cookies to be sure someone is not getting the data. I haven't 
> > pursued that. Again that is another reason I wanted to use 
> > IE6 (since I know it works). The data is on a site where I 
> > subscribe to a service. But the particular information is 
> > available to anyone who types in the URL (as long as 
> > they are using a browser).
> > --
> > http://mail.python.org/mailman/listinfo/python-list
> >
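
The two ideas raised in the thread above (sending a browser-like user-agent,
and keeping cookies in case the site relies on them) can be sketched roughly
as follows. This is only an illustration, not what anyone in the thread
actually ran: it uses the Python 3 modules urllib.request and http.cookiejar
rather than the urllib/urllib2 of the original posts, the user-agent string
is one typical IE6-on-Windows-XP value chosen as an assumption, and the
helper names make_browser_like_opener and fetch_source are hypothetical.

```python
import http.cookiejar
import urllib.request

# Assumption: a typical user-agent string sent by IE6 on Windows XP.
# Any string copied from a real browser session could be used instead.
IE6_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def make_browser_like_opener(user_agent=IE6_USER_AGENT):
    """Return (opener, jar): an opener that sends `user_agent` and stores cookies.

    Hypothetical helper for illustration only.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Replace the default "Python-urllib/x.y" header with the browser-like one.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def fetch_source(url, opener):
    """Fetch `url` through `opener` and return the raw response body as bytes."""
    with opener.open(url) as response:
        return response.read()


# Example use (needs network access, so it is left commented out):
# opener, jar = make_browser_like_opener()
# html = fetch_source("http://www.python.org", opener)
# print(html[:200])
```

Reusing the same opener for several requests lets any cookies set by the
first response be sent back on later ones. Whether this is enough depends on
how the particular site decides what to blank out, which the thread does not
say.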


