[Tutor] Webscraping Yahoo API

Alan Gauld alan.gauld at yahoo.co.uk
Mon Feb 20 13:43:20 EST 2017


Please post in plain text. Formatting is very important in
Python, and RTF or HTML tends to get scrambled in transit,
which makes your code and error messages hard to read.
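
To illustrate why lost formatting matters, here is a minimal sketch
(the function names are made up for this example, they are not from
the posted code). The two functions differ only in the indentation
of the return line, yet they behave differently, so once the
whitespace is scrambled a reader cannot tell which one was meant.

```python
def total_inside_loop(values):
    total = 0
    for v in values:
        total += v
        return total  # indented under the loop: returns after the first item


def total_after_loop(values):
    total = 0
    for v in values:
        total += v
    return total  # aligned with the loop: returns after all items


print(total_inside_loop([1, 2, 3]))  # prints 1
print(total_after_loop([1, 2, 3]))   # prints 6
```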

Thanks

Alan G.

On 20/02/17 14:32, Joe via Tutor wrote:
> Hi,
> I am new to programming and am following a tutorial using Python 3.6. I keep getting the error shown below. Can you please point me in the right direction as to what I am doing wrong? This is the code I am running:
> import datetime as dt
> import matplotlib.pyplot as plt
> from matplotlib import style
> #from matplotlib.finance import candlestick_ohlc
> import matplotlib.dates as mdates
> import pandas as pd
> import pandas_datareader.data as web
> import bs4 as bs
> import pickle
> import requests
> import os
> import csv
> import numpy as np
> import time
>
> style.use('ggplot')
>
> def save_tsx_tickers():
>     resp = requests.get('http://web.tmxmoney.com/indices.php?section=tsx&index=%5ETSX')
>     soup = bs.BeautifulSoup(resp.text, "lxml")
>     table = soup.find('table', {'class': 'indices-table'})
>     tickers = []
>     for row in table.findAll('tr')[1:]:
>         ticker = row.findAll('td')[1].text
>         tickers.append(ticker.replace(".","-") + ".TO")
>
>     with open("tsxtickers.pickle", "wb") as f:
>         pickle.dump(tickers, f)
>
>     #print(tickers)
>     return tickers
>
> def get_data_from_yahoo(reload_tsx = False):
>     if reload_tsx:
>         tickers = save_tsx_tickers()
>     else:
>         with open("tsxtickers.pickle", "rb") as f:
>             tickers = pickle.load(f)
>
>     if not os.path.exists('stock_dfs'):
>         os.makedirs('stock_dfs')
>
>     start = dt.datetime(2000, 1, 1)
>     end = dt.datetime(2016, 12, 31)
>
>     for i in tickers:
>         if not os.path.exists('stock_dfs/{}.csv'.format(i)):
>             time.sleep(2)
>             df = web.DataReader(i, 'yahoo', start, end)
>             df.to_csv('stock_dfs/{}.csv'.format(i))
>         else:
>             print('Already have {}'.format(i))
> 
> 
> However, I keep getting this error. Please help in identifying the problem. Thank you.
> 
> 
> Traceback (most recent call last):
>   File "<pyshell#3>", line 1, in <module>
>     get_data_from_yahoo()
>   File "C:\Users\Joe\Desktop\joe\Tutorial Python.py", line 50, in get_data_from_yahoo
>     df = web.DataReader(i, 'yahoo', start, end)
>   File "C:\Users\Joe\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas_datareader\data.py", line 116, in DataReader
>     session=session).read()
>   File "C:\Users\Joe\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas_datareader\yahoo\daily.py", line 76, in read
>     df = super(YahooDailyReader, self).read()
>   File "C:\Users\Joe\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas_datareader\base.py", line 155, in read
>     df = self._read_one_data(self.url, params=self._get_params(self.symbols))
>   File "C:\Users\Joe\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas_datareader\base.py", line 74, in _read_one_data
>     out = self._read_url_as_StringIO(url, params=params)
>   File "C:\Users\Joe\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas_datareader\base.py", line 85, in _read_url_as_StringIO
>     response = self._get_response(url, params=params)
>   File "C:\Users\Joe\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas_datareader\base.py", line 120, in _get_response
>     raise RemoteDataError('Unable to read URL: {0}'.format(url))
> pandas_datareader._utils.RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv?s=OTEX.TO&a=0&b=1&c=2000&d=11&e=31&f=2016&g=d&ignore=.csv
> 
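
The traceback above shows the exception escaping from the download
loop, so the first ticker whose URL cannot be read stops the whole
run. One possible way to narrow the problem down, offered only as a
sketch and not as a diagnosis of the underlying cause, is to catch
the error per ticker and record which symbols failed. The names
download_all, fetch, skip_errors and fake_fetch are hypothetical,
and "AAA.TO" and "BBB.TO" are placeholder symbols. In the posted
code, fetch would presumably wrap the web.DataReader call and
skip_errors would be the RemoteDataError named in the traceback.

```python
def download_all(tickers, fetch, skip_errors=(Exception,)):
    """Call fetch(ticker) for each ticker; keep results and failures apart."""
    results = {}
    failed = {}
    for ticker in tickers:
        try:
            results[ticker] = fetch(ticker)
        except skip_errors as exc:
            # Record the failure and carry on with the next ticker.
            failed[ticker] = str(exc)
    return results, failed


def fake_fetch(ticker):
    # Stand-in for the real network call (an assumption for this demo):
    # it fails for one symbol to mimic the error in the traceback.
    if ticker == "OTEX.TO":
        raise IOError("Unable to read URL for {}".format(ticker))
    return "data for {}".format(ticker)


results, failed = download_all(
    ["AAA.TO", "OTEX.TO", "BBB.TO"], fake_fetch, skip_errors=(IOError,)
)
print(sorted(results))
print(failed)
```

Listing the failed symbols after the run should show whether the
problem affects every ticker or only a few of them.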



