Connecting Google News

Peter Otten __peter__ at web.de
Sun Jul 16 06:14:15 EDT 2017


Javier Bezos wrote:

> Google News used to fail with the high-level functions provided by httplib
> and the like. However, I found this piece of code somewhere:
> 
>      def gopen():
>        http = httplib.HTTPSConnection('news.google.com')
>        http.request("GET","/news?ned=es_MX" ,

When you change that to

         http.request("GET","/news/headlines?ned=es_mx&hl=es" ,

you get a non-empty response. Most of the actual content seems to be buried
in JavaScript, though.
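
To get a rough idea of how much of the page is script rather than markup
text, something like the following could be used. It is only a sketch for
Python 3 (on Python 2 the module is called HTMLParser); the names
ScriptShare and script_share are made up for illustration, and the measure
itself is a crude assumption of mine, not anything Google documents.

```python
from html.parser import HTMLParser


class ScriptShare(HTMLParser):
    """Count characters inside <script> elements versus everywhere else."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.script_chars = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Ignore surrounding whitespace so indentation does not skew the count.
        n = len(data.strip())
        if self.in_script:
            self.script_chars += n
        else:
            self.text_chars += n


def script_share(html):
    """Return the fraction of non-markup characters that sit in <script>."""
    parser = ScriptShare()
    parser.feed(html)
    parser.close()
    total = parser.script_chars + parser.text_chars
    return parser.script_chars / total if total else 0.0
```

Feed it the decoded response body; a value close to 1.0 suggests that
nearly everything is delivered as script data, which plain HTML parsing
will not help with.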

>                      headers =
>                     {"User-Agent": "Mozilla/5.0 (X11; U; Linux i686; es-MX) "
>                                    "AppleWebKit/532.8 (KHTML, like Gecko) "
>                                    "Chrome/4.0.277.0 Safari/532.8",
>                     "Host":'news.google.com',
>                     "Accept": "*/*"})
>        return http.getresponse()
> 
> A few days ago Google News was revamped, and the code no longer works
> (2.6/Win7, 2.7/OSX and, with minimal changes, 3.6/Win7) because the page
> content comes back empty. The code itself doesn't raise any errors. What is
> the proper way to do it now? I must stick to the standard libraries.
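
Sticking to the standard library, the function with the changed path might
look like the sketch below. It is written for Python 3, where httplib is
called http.client. The helper names build_headers and looks_empty, the
constants, and the 10 second timeout are my own additions for illustration;
the path is simply the one that returned content when I tried it and may
well stop working again.

```python
import http.client  # this module is named httplib on Python 2

HOST = "news.google.com"
# Path that gave a non-empty response at the time of writing (assumption:
# Google may change it again at any time).
PATH = "/news/headlines?ned=es_mx&hl=es"

# Same User-Agent as in the original snippet, split for readability.
USER_AGENT = (
    "Mozilla/5.0 (X11; U; Linux i686; es-MX) "
    "AppleWebKit/532.8 (KHTML, like Gecko) "
    "Chrome/4.0.277.0 Safari/532.8"
)


def build_headers():
    """Return the request headers used in the original snippet."""
    return {"User-Agent": USER_AGENT, "Host": HOST, "Accept": "*/*"}


def gopen(path=PATH, timeout=10):
    """Send the GET request and return the HTTPResponse object."""
    conn = http.client.HTTPSConnection(HOST, timeout=timeout)
    conn.request("GET", path, headers=build_headers())
    return conn.getresponse()


def looks_empty(body):
    """True if the response body contains nothing but whitespace."""
    return not body.strip()
```

Calling gopen().read() returns the body as bytes, and looks_empty() on that
is a quick way to notice when the page comes back blank again.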




