[Tutor] Change datatype for specific columns in a 2D array & computing the mean

Peter Otten __peter__ at web.de
Sun Jan 24 05:22:19 EST 2016


Ek Esawi wrote:

> Hi All---
> 
> 
> 
> Sorry for posting again, but I have a problem that I have tried to solve
> in several different ways without success. I approached the problem from
> one angle and asked about it here; I got some good input using pandas and
> structured arrays, but I am new to Python and not familiar enough with
> either to use them at this moment. I decided to go in a different
> direction. I am hoping for a simpler solution using NumPy.
> 
> 
> I have a csv file with 4 columns and 2000 rows. There are 10 variables in
> column 1 and 4 variables in each of columns 2 and 3. I read the csv file
> and converted it to arrays. The problem I ran into and could not resolve
> is two-fold: (1) change the datatype for columns 1 and 4 to float, and
> (2) then use NumPy, or a simpler method, to calculate the mean of the data
> points in column 4 for each variable in column 1 and column 2. Below
> is my code and a sample of the data file.
> 
> 
> 
> Here is part of my code:
> 
> 
> 
> import numpy as np
> import csv
>
> TMatrix=[]
> np.set_printoptions(precision=2)
>
> " Converting csv to lists "
>
> with open('c:/Users/My Documents/AAA/temp1.csv') as temp:
>     reader = csv.reader(temp, delimiter=',', quoting=csv.QUOTE_NONE)
>     for row in reader:
>         TMatrix.append(row)
>
> " converting lists to arrays "
> TMatrix=np.array(TMatrix)
> TMatrix=np.array(4,TMatrix[1:,::],dtype='float,int,int,float')        # this statement is not working
> 
> 
> 
> +++++++++++++++ This is a sample of my file +++++++++++++
>
> ['19' 'A4' 'B2' '2']
>  ['19' 'A5' 'B1' '12']
>  ['18' 'A5' 'B2' '121']]

How do you want to convert the second and third columns to int? Are A4 and B2
hex numbers? If so, try

$ cat esawi.csv 
19,A4,B2,2
19,A5,B1,12
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29) 
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> def fromhex(s):
...     return int(s, 16)
... 
>>> numpy.genfromtxt("esawi.csv", delimiter=",", converters={1:fromhex, 2:fromhex})
array([(19.0, 164, 178, 2.0), (19.0, 165, 177, 12.0)], 
      dtype=[('f0', '<f8'), ('f1', '<i8'), ('f2', '<i8'), ('f3', '<f8')])
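
For the second part of the question (the mean of column 4 for each
combination of columns 1 and 2), here is a minimal sketch that uses only the
standard library instead of numpy. The function name group_means and its
parameters are made up for illustration, the rows are the three sample rows
from the question held in memory, and the code assumes that column 4 always
parses as a float. The grouping keys are kept as strings, so no hex
conversion is needed for this step.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# In-memory stand-in for the csv file; rows copied from the sample in the question.
SAMPLE = "19,A4,B2,2\n19,A5,B1,12\n18,A5,B2,121\n"


def group_means(lines, key_columns=(0, 1), value_column=3):
    """Return {key tuple: mean of value_column} for the csv rows in lines.

    key_columns and value_column are zero-based column indices
    (hypothetical parameters, chosen for this sketch).
    """
    groups = defaultdict(list)
    for row in csv.reader(lines):
        # The key is the tuple of raw strings found in the key columns.
        key = tuple(row[i] for i in key_columns)
        groups[key].append(float(row[value_column]))
    return {key: mean(values) for key, values in groups.items()}


# Mean of column 4 per (column 1, column 2) pair.
means = group_means(io.StringIO(SAMPLE))
for key, value in sorted(means.items()):
    print(key, value)
```

Passing key_columns=(0,) groups on column 1 alone; with the sample rows the
two rows starting with 19 then fall into one group. To read the real file,
pass an open file object instead of the io.StringIO stand-in.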



