[Tutor] weird lambda expression -- can someone help me understand how this works

Michael Crawford dalupus at gmail.com
Sat Dec 14 03:14:12 CET 2013


I found this piece of code on GitHub:

https://gist.github.com/kljensen/5452382

import pandas
from sklearn.feature_extraction import DictVectorizer

def one_hot_dataframe(data, cols, replace=False):
    """ Takes a dataframe and a list of columns that need to be encoded.
        Returns a 3-tuple comprising the data, the vectorized data,
        and the fitted vectorizer.
    """
    vec = DictVectorizer()
    mkdict = lambda row: dict((col, row[col]) for col in cols)  #<<<<<<<<<<<<<<<<<<
    vecData = pandas.DataFrame(vec.fit_transform(data[cols].apply(mkdict, axis=1)).toarray())
    vecData.columns = vec.get_feature_names()
    vecData.index = data.index
    if replace is True:
        data = data.drop(cols, axis=1)
        data = data.join(vecData)
    return (data, vecData, vec)

I don't understand how that lambda expression works.
For starters, where did row come from?
How did it know it was working on data?


Any help with understanding this would be appreciated.

I tried the code out and it works exactly as it is supposed to. I just don't understand how.
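
To narrow the question down, here is a minimal plain-Python sketch of what I think may be going on. It does not use pandas at all: apply_rows is a hypothetical stand-in I made up for data[cols].apply(mkdict, axis=1), and the sample rows are invented. My assumption is that row is just the lambda's parameter name, and that whatever calls the lambda supplies the value, one row at a time.

```python
# Plain-Python sketch; apply_rows is a hypothetical stand-in for
# DataFrame.apply(..., axis=1), not the real pandas implementation.
cols = ["color", "size"]

# Same shape as the lambda in the gist: 'row' is only a parameter name.
# Nothing is bound to it until somebody calls mkdict(some_row).
mkdict = lambda row: dict((col, row[col]) for col in cols)


def apply_rows(rows, func):
    """Call func once per row and collect the results."""
    return [func(row) for row in rows]


# Invented sample data: each dict plays the role of one dataframe row.
rows = [
    {"color": "red", "size": "S", "price": 3},
    {"color": "blue", "size": "M", "price": 5},
]

# The caller (apply_rows) hands each row to mkdict, which keeps only
# the keys listed in cols.
result = apply_rows(rows, mkdict)
print(result)
```

If that guess is right, then in the real function it would be apply that passes each row of data[cols] into the lambda, which would explain why the lambda never mentions data itself.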

Thanks,
Mike