[scikit-learn] transform categorical data to numerical representation

Georg Heiler georg.kf.heiler at gmail.com
Sat Aug 5 05:10:57 EDT 2017


Hi,

The LabelEncoder is only meant for a single column, i.e. the target variable.
Is the DictVectorizer or a manual chaining of multiple LabelEncoders (one
per categorical column) the recommended way to get values which can be fed
into a subsequent classifier?

Is there some approach I have overlooked that works better and, ideally,
can also handle unseen values, e.g. by imputing the most frequent value?
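
To make the question concrete, here is a minimal sketch of the manual
chaining described above, assuming the data is a list of rows with one
string per categorical column. The class name MultiColumnLabelEncoder and
its attributes are hypothetical, not part of scikit-learn; only
LabelEncoder itself comes from the library. Unseen values are replaced by
the most frequent training value of their column before encoding.

```python
from collections import Counter

import numpy as np
from sklearn.preprocessing import LabelEncoder


class MultiColumnLabelEncoder:
    """Hypothetical helper: one LabelEncoder per categorical column."""

    def fit(self, rows):
        # Split the row-oriented input into one tuple per column.
        columns = list(zip(*rows))
        self.encoders_ = [LabelEncoder().fit(list(col)) for col in columns]
        # Most frequent training value per column, used for unseen values.
        self.fallbacks_ = [Counter(col).most_common(1)[0][0] for col in columns]
        return self

    def transform(self, rows):
        columns = list(zip(*rows))
        encoded = []
        for col, enc, fallback in zip(columns, self.encoders_, self.fallbacks_):
            known = set(enc.classes_.tolist())
            # Replace values not seen during fit with the column's mode.
            cleaned = [v if v in known else fallback for v in col]
            encoded.append(enc.transform(cleaned))
        # One integer-coded column per original categorical column.
        return np.column_stack(encoded)


train = [["red", "small"], ["blue", "large"], ["red", "large"]]
encoder = MultiColumnLabelEncoder().fit(train)
X_train = encoder.transform(train)
# "green" was never seen, so it is encoded like "red", the most frequent color.
X_new = encoder.transform([["green", "small"]])
```

The resulting integer codes imply an ordering between categories, so this
is probably only suitable as-is for tree-based classifiers; other models
would likely need a one-hot step afterwards.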

regards,
Georg