MSIE6 Python Question
Ralph A. Gable
r.gable at mchsi.com
Thu Jun 10 17:41:44 EDT 2004
All you guys have given me some super info. I have this problem under control
(with your help). I really appreciate it. Keep up the good work.
Ralph A. Gable
"Andy Baker" <andy at andybak.net> wrote in message news:<mailman.219.1085403362.6949.python-list at python.org>...
> The site might be checking your user-agent string. urllib should let you
> choose which browser it identifies itself as. Simply match the user-agent
> of a known version of IE and see if that works.
>
> > -----Original Message-----
> > From: python-list-bounces+andy=andybak.net at python.org
> > [mailto:python-list-bounces+andy=andybak.net at python.org] On
> > Behalf Of Ralph A. Gable
> > Sent: 24 May 2004 12:25
> > To: python-list at python.org
> > Subject: Re: MSIE6 Python Question
> >
> > "Kevin T. Ryan" <kevryan0701 at yahoo.com> wrote in message
> > news:<40b1697d$0$3131$61fed72c at news.rcn.com>...
> > > Ralph A. Gable wrote:
> > >
> > > > > I'm a newbie at this but I need to control MSIE6 using Python. I
> > > > > have read the O'Reilly win32 python books and got some hints. But I
> > > > > need to Navigate to a site (which I know how to do) and then I need
> > > > > to get at the source code for that site inside Python (as when one
> > > > > used the View|Source drop down window). Can anyone point me to some
> > > > > URLs that would help out? Or just tell me how to do it? I would be
> > > > > very grateful.
> > >
> > > I'm not sure why you need to go through IE, but maybe this will get
> > > you into the right direction:
> > >
> > > > >>> import urllib
> > > > >>> f = urllib.urlopen('http://www.python.org')
> > > > >>> f.readline()
> > > > '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n'
> > > > >>> f.readline()
> > > > ' "http://www.w3.org/TR/html4/loose.dtd" >\n'
> > > > >>>
> > >
> > > > You could do:
> > > >     for line in f:
> > > >         process(line)
> > > >
> > > > just like you can with a file. Check the urllib, urllib2, and other
> > > > related modules (maybe httplib). Hope that helps.
> >
> >
> > Sorry. I forgot to mention that I have tried that. The data I
> > want is being stripped out when I access the URL via urllib.
> > I CAN see the data when I go into IE and do view source but
> > when I use urllib the site intentionally blanks out the
> > information I want. For that reason, I would like to get it
> > using IE6 if I can. If there are other ways to fake out the
> > site, I would be interested in that also. I thought that
> > perhaps the site was detecting the fact that I was not
> > querying it using a browser. I tried putting that into
> > the HTTP messages but may not have done it right. At any rate
> > I couldn't get that to work. It may be that the site is using
> > cookies to be sure someone is not getting the data. I haven't
> > pursued that. Again that is another reason I wanted to use
> > IE6 (since I know it works). The data is on a site whose
> > service I subscribe to. But the particular information is
> > available to anyone who types in the URL (as long as
> > they are using a browser).
> > --
> > http://mail.python.org/mailman/listinfo/python-list
> >
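
The two ideas raised in the thread (send a browser-like user-agent string, and keep any cookies the site sets) can be sketched roughly as below. This is only an illustration under stated assumptions, not what anyone in the thread actually ran: it uses the modern Python 3 modules urllib.request and http.cookiejar rather than the Python 2 urllib shown in the quoted session, the names IE6_USER_AGENT, build_opener_with_agent and fetch_source are made up for this example, and the user-agent value is just one IE6-style string. Whether a given site accepts it is not guaranteed.

```python
import http.cookiejar
import urllib.request

# Assumption: an example IE6-style User-Agent string. Substitute whatever
# string the browser you want to imitate actually sends.
IE6_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_opener_with_agent(user_agent=IE6_USER_AGENT):
    """Return (opener, cookie_jar); the opener sends user_agent and keeps cookies."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # Replace the default "Python-urllib/x.y" header with the chosen string.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, cookie_jar


def fetch_source(url, opener):
    """Fetch url through opener and return the page source as text."""
    with opener.open(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A call would look like opener, jar = build_opener_with_agent() followed by html = fetch_source('http://www.python.org', opener). Reusing the same opener for later requests sends back any cookies the site set on the first one, which covers the cookie theory mentioned above.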