python financial data cleaning

Sebastian M Cheung minscheung at googlemail.com
Mon Jun 15 17:01:31 EDT 2015


On Monday, June 15, 2015 at 11:19:48 AM UTC+1, Mark Lawrence wrote:
> On 15/06/2015 11:12, Sebastian M Cheung via Python-list wrote:
> > How do I clean financial data? Say I have a list of 1000 financial time-series records in myList, with Open, High, Low and Close fields. When Close price data is missing, what is the best practice for cleaning the data in Python?
> >
> 
> http://pandas.pydata.org/
> 
> -- 
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
> 
> Mark Lawrence


Hi Mark,

Below I read DirtyData (financial data) in from Excel and then count the missing (NaN) values in each column, which includes the missing Close price data:

import pandas as pd

xls = pd.ExcelFile('DirtyData.xlsm')
df = xls.parse('Dirty Data', index_col=None, na_values=['NA'])
# isnull() marks missing cells as True; sum() counts them per column
print(df.isnull().sum())

So if I wanted to fill in missing Open price data, I could copy the previous row's Close price, but how would I implement that? Thanks
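
For illustration, here is a minimal sketch of one way this might be done with pandas. It assumes the columns are named 'Open' and 'Close' (the real headers in DirtyData.xlsm are not shown in this thread) and uses a small made-up DataFrame in place of the spreadsheet. Forward-filling Close is only one possible convention, not necessarily best practice for every data set:

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the 'Dirty Data' sheet;
# the column names 'Open' and 'Close' are assumptions.
df = pd.DataFrame({
    'Open':  [10.0, np.nan, 10.8, np.nan],
    'Close': [10.5, 10.7, np.nan, 11.0],
})

# Fill each missing Close with the last known Close (forward fill).
# This runs first so the previous row's Close is available below.
df['Close'] = df['Close'].ffill()

# Fill each missing Open with the previous row's Close.
# shift(1) moves the Close column down by one row.
df['Open'] = df['Open'].fillna(df['Close'].shift(1))

print(df)
```

A missing Open in the very first row would remain NaN, since there is no previous row to copy from, so it may be worth re-running df.isnull().sum() afterwards to check what is left.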



