[Chennaipy] Help in Pandas Script.

Krishna Sangeeth KS kskrishnasangeeth at gmail.com
Mon Oct 8 08:18:18 EDT 2018


Hey Lokesh,

*I just want to find the no. of duplicates in each file.*

See
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.duplicated.html
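
A minimal sketch of how `duplicated` could be used for the per-file count. It assumes the files are CSVs that `pd.read_csv` can load and that a "duplicate" means a row repeating an earlier row across all columns. The helper names and the sample data are made up for illustration.

```python
from pathlib import Path

import pandas as pd


def count_duplicates(df: pd.DataFrame) -> int:
    # duplicated() flags every row that repeats an earlier row
    # (keep="first" is the default), so the sum counts the extra copies.
    return int(df.duplicated().sum())


def count_duplicates_per_file(folder: str) -> dict:
    # Hypothetical helper: assumes every file in the folder is a CSV.
    return {
        path.name: count_duplicates(pd.read_csv(path))
        for path in sorted(Path(folder).glob("*.csv"))
    }


# Made-up sample standing in for the contents of one file.
sample = pd.DataFrame(
    {"id": [1, 2, 2, 3, 3, 3], "name": ["a", "b", "b", "c", "c", "c"]}
)
print(count_duplicates(sample))
```

If only some columns identify a record, `df.duplicated(subset=[...])` restricts the comparison to those columns.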

*If I have already checked the folder with 10 files, it shows some
duplicate data. If I then add some more files to that same folder, we need
to match the records.*

Okay, this is not super clear, but I assume you want to check whether the
newly added records are already present in your existing dataframe. I would
recommend using `join` or `isin` if you do not have any compute
constraints.
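
Roughly, the two options could look like the sketch below. The column names and data are invented, and the second variant uses `merge` (the general pandas join) rather than `DataFrame.join`, since it can match on whole rows and flag where each row came from.

```python
import pandas as pd

# Made-up data: records already loaded, and records from the newly added files.
existing = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
new = pd.DataFrame({"id": [3, 4], "name": ["c", "d"]})

# Option 1: isin on a single key column (assumes "id" identifies a record).
already_seen = new["id"].isin(existing["id"])

# Option 2: whole-row match with a left merge; indicator=True adds a
# "_merge" column that says whether each new row was also found in existing.
merged = new.merge(
    existing.drop_duplicates(),
    how="left",
    on=list(new.columns),
    indicator=True,
)
matched = merged[merged["_merge"] == "both"].drop(columns="_merge")

print(already_seen.tolist())
print(matched)
```

`isin` is the simpler choice when one column is enough to identify a record; the merge is closer to what you want when several columns together make up the key.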

Best,
Krishna Sangeeth K.S

----------------------------------------------------------------------

If you can force your heart and nerve and sinew
To serve your turn long after they are gone,
And so hold on when there is nothing in you
Except the Will which says to them: "Hold on"