[Numpy-discussion] Sparse feature vector generation
Samuel John
scipy at samueljohn.de
Tue Jan 10 10:24:41 EST 2012
I would just use a lookup dict:
import numpy
names = [ "uc_berkeley", "stanford", "uiuc", "google", "intel", "texas_instruments", "bool"]
# map each name to its column index (name -> index, not index -> name)
lookup = dict( zip( names, range(len(names)) ) )
Now, given you have n entries:
S = numpy.zeros( (n, len(names)) ,dtype=numpy.int32)
for k in ["uc_berkeley", "google", "bool"]:
    S[0,lookup[k]] += 1
for k in ["stanford", "intel","bool"]:
    S[1,lookup[k]] += 1
... and so forth for the remaining rows. Here lookup[k] returns the column index to use for the name k.
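The per-row loops above can be folded into one function. This is only a minimal sketch: the name build_feature_matrix and the rows list are made up for illustration, and it assumes each entry is given as a list of the names that should be set, with "bool" left out when the label is 0.

```python
import numpy

def build_feature_matrix(rows, names):
    """Return a (len(rows), len(names)) int32 matrix of feature counts."""
    # name -> column index
    lookup = {name: i for i, name in enumerate(names)}
    S = numpy.zeros((len(rows), len(names)), dtype=numpy.int32)
    for r, row in enumerate(rows):
        for k in row:
            S[r, lookup[k]] += 1
    return S

names = ["uc_berkeley", "stanford", "uiuc", "google", "intel",
         "texas_instruments", "bool"]
# hypothetical input: one list of active names per entry
rows = [
    ["uc_berkeley", "google", "bool"],
    ["stanford", "intel", "bool"],
    ["uiuc", "texas_instruments"],  # label 0, so "bool" is not listed
]
S = build_feature_matrix(rows, names)
```

A name that is missing from names raises a KeyError here, which is probably what you want while the name list is still being assembled.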
Hope this helps. I am not aware of an automatic way to do this, but I may be wrong.
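If the matrix should really be sparse, as the subject suggests, the same lookup can produce (row, column, value) triplets instead of a dense array. Again just a sketch under assumptions: sparse_triplets and rows are hypothetical names, and the input layout is the same one-list-of-names-per-entry guess as before.

```python
def sparse_triplets(rows, names):
    """Return parallel lists (row_idx, col_idx, data) of the nonzero entries."""
    # name -> column index
    lookup = {name: i for i, name in enumerate(names)}
    row_idx, col_idx, data = [], [], []
    for r, row in enumerate(rows):
        for k in row:
            row_idx.append(r)
            col_idx.append(lookup[k])
            data.append(1)
    return row_idx, col_idx, data

names = ["uc_berkeley", "stanford", "uiuc", "google", "intel",
         "texas_instruments", "bool"]
# hypothetical input: one list of active names per entry
rows = [
    ["uc_berkeley", "google", "bool"],
    ["stanford", "intel", "bool"],
    ["uiuc", "texas_instruments"],
]
row_idx, col_idx, data = sparse_triplets(rows, names)
```

If scipy is installed, these lists should be usable as scipy.sparse.coo_matrix((data, (row_idx, col_idx)), shape=(len(rows), len(names))); as far as I know, repeated (row, column) pairs get summed when that is converted to CSR or to a dense array, which matches the += 1 above.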
cheers,
Samuel
On 04.01.2012, at 07:25, Dhruvkaran Mehta wrote:
> Hi numpy users,
>
> Is there a convenient way in numpy to go from "string" features like:
>
> "uc_berkeley", "google", 1
> "stanford", "intel", 1
> .
> .
> .
> "uiuc", "texas_instruments", 0
>
> to a numpy matrix like:
>
> "uc_berkeley", "stanford", ..., "uiuc", "google", "intel", "texas_instruments", "bool"
> 1 0 ... 0 1 0 0 1
> 0 1 ... 0 0 1 0 1
> :
> 0 0 ... 1 0 0 1 0
>
> I really appreciate you taking the time to help!
> Thanks!
> --Dhruv
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion