[scikit-learn] Any recommend way to encode IP address?

Chris Aridas chris at aridas.eu
Fri Aug 16 03:54:27 EDT 2019


Hey,

Apart from encoding the raw address, you could use feature engineering:
derive attributes from each IP with a geolocation service, something like
https://ipgeolocation.io/documentation/ip-geolocation-api.html
Two IPs might share a country but differ in city, so you can mix and match
whichever fields you want as features.
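
A rough sketch of the same idea without calling any external service,
assuming IPv4 addresses only: derive coarse features such as the /16 and
/24 network prefixes directly from the address with the standard-library
ipaddress module. The function name ip_features and the feature names are
made up for illustration; geolocation fields (country, city) could be
added as extra columns in the same way.

```python
import ipaddress


def ip_features(ip_str):
    """Derive a few coarse features from an IPv4 address string.

    Hypothetical helper for illustration; IPv4 only.
    """
    ip = ipaddress.ip_address(ip_str)
    if ip.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")

    # The four bytes of the address as integers in 0..255.
    octets = list(ip.packed)

    # Network prefixes: addresses in the same subnet share these values,
    # so they have far fewer distinct categories than full addresses.
    net16 = ipaddress.ip_network(f"{ip}/16", strict=False)
    net24 = ipaddress.ip_network(f"{ip}/24", strict=False)

    return {
        "octet_1": octets[0],
        "octet_2": octets[1],
        "octet_3": octets[2],
        "octet_4": octets[3],
        "prefix_16": str(net16),
        "prefix_24": str(net24),
        "is_private": ip.is_private,
    }


if __name__ == "__main__":
    for addr in ["192.168.1.77", "192.168.1.200", "8.8.8.8"]:
        print(addr, ip_features(addr))
```

The prefix columns are categorical, so they still need encoding, but
one-hot encoding a /16 or /24 prefix usually produces far fewer columns
than one-hot encoding every distinct address.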

Best,

On Fri, Aug 16, 2019 at 10:46 AM lampahome <pahome.chen at mirlab.org> wrote:

> I collect data containing many access log entries from different IPs.
>
> But I don't know the best way to encode them so that the training data
> stays small and different IPs are kept independent.
>
> 1. One-hot encoding: with too many IPs, the training data will occupy a
> huge amount of disk space.
> 2. Categorical encoding: each IP is encoded as an integer in 0~N, but
> that can't show the relation between different IPs.
>
> Does anyone have advice?