Pandas cat.categories.isin list, is this a bug?

zljubisic at gmail.com zljubisic at gmail.com
Mon May 14 07:18:31 EDT 2018


On Monday, 14 May 2018 13:05:24 UTC+2, zlju... at gmail.com  wrote:
> Hi,
> 
> I have a dataframe with a CRM_assetID column of category dtype:
> 
> df.info()
> 
> <class 'pandas.core.frame.DataFrame'>
> RangeIndex: 1435952 entries, 0 to 1435951
> Data columns (total 75 columns):
> startTime                            1435952 non-null object
> CRM_assetID                          1435952 non-null category
> 
> Searching the dataframe for each of three categories:
> 
> df[df.CRM_assetID == 'V1254748'].shape
> (35, 75)
> df[df.CRM_assetID == 'V805722'].shape
> (45, 75)
> df[df.CRM_assetID == 'V1105400'].shape
> (34, 75)
> 
> 
> len(df.CRM_assetID.cat.categories.isin(['V1254748', 'V805722', 'V1105400']))
> 
> Why is this len not equal to 114 (35 + 45 + 34)?
> 
> Regards.

I forgot to copy the result of:

len(df.CRM_assetID.cat.categories.isin(['V1254748', 'V805722', 'V1105400'])) 

which is 55418.
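
A likely explanation, sketched on a tiny made-up dataframe (the values and the `wanted` list below are hypothetical, not the real data): `cat.categories` is the Index of distinct category labels, so `isin` on it returns one boolean per category, not one per row. Its `len` is therefore the number of categories (presumably 55418 in the real data), regardless of which values are passed in. Counting matching rows would instead mean calling `isin` on the column itself and summing the resulting mask.

```python
import pandas as pd

# Hypothetical miniature stand-in for the real dataframe.
df = pd.DataFrame({
    "CRM_assetID": pd.Categorical(["A", "A", "B", "C", "C", "C", "D"]),
})
wanted = ["A", "C"]

# isin on the categories Index: one boolean per distinct category.
category_mask = df.CRM_assetID.cat.categories.isin(wanted)
n_categories = len(category_mask)                   # number of categories
n_matching_categories = int(category_mask.sum())    # categories found in `wanted`

# isin on the column itself: one boolean per row.
row_mask = df.CRM_assetID.isin(wanted)
n_rows = int(row_mask.sum())                        # rows whose value is in `wanted`

print(n_categories, n_matching_categories, n_rows)
```

Under that reading, `df.CRM_assetID.isin(['V1254748', 'V805722', 'V1105400']).sum()` should give the expected 114 on the real dataframe, assuming the three per-value counts above are correct.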


