[newbie] confusion concerning fetching an element in a 2d-array

Peter Otten __peter__ at web.de
Tue Mar 25 12:12:12 EDT 2014


Jean Dubois wrote:

> On Tuesday, 25 March 2014 12:01:37 UTC+1, Steven D'Aprano wrote:
>> On Tue, 25 Mar 2014 03:26:26 -0700, Jean Dubois wrote:
>>
>> > I'm confused by the behaviour of the following python-script I wrote:
>> > 
>> > #!/usr/bin/env python
>> > #I first made a data file 'test.dat' with the following content
>> > #1.0 2 3
>> > #4 5 6.0
>> > #7 8 9
>> > import numpy as np
>> > lines=[line.strip() for line in open('test.dat')]
>> > #convert lines-list to numpy-array
>> > array_lines=np.array(lines)
>> > #fetch element at 2nd row, 2nd column:
>> > print array_lines[1, 1]
>> > 
>> > 
>> > When running the script I always get the following error: IndexError:
>> > invalid index
>> > 
>> > Can anyone here explain me what I am doing wrong and how to fix it?
>>
>> Yes. Inspect the array by printing it, and you'll see that it is a one-
>> dimensional array, not two, and the entries are strings:
>>
>>
>> py> import numpy as np
>> py> # simulate a text file
>> ... data = """1.0 2 3
>> ... 4 5 6.0
>> ... 7 8 9"""
>> py> lines=[line.strip() for line in data.split('\n')]
>> py> # convert lines-list to numpy-array
>> ... array_lines = np.array(lines)
>> py> print array_lines
>> ['1.0 2 3' '4 5 6.0' '7 8 9']
>>
>>
>> The interactive interpreter is your friend! You never need to guess what
>> the problem is, Python has powerful introspection abilities, one of the
>> most powerful is also one of the simplest: print. Another powerful tool
>> in the interactive interpreter is help().
>>
>> So, what to do about it? Firstly, convert your string read from a file
>> into numbers, then build your array. Here's one way:
>>
>> py> values = [float(s) for s in data.split()]
>> py> print values
>> [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
>> py> array_lines = np.array(values)
>> py> array_lines = array_lines.reshape(3, 3)
>> py> print array_lines
>> [[ 1.  2.  3.]
>>  [ 4.  5.  6.]
>>  [ 7.  8.  9.]]
>>
> Dear Steve,
> Thanks for answering my question but unfortunately now I'm totally
> confused.
> Above I see parts from different programs which I can't
> assemble together to one working program (I really tried hard).
> Can I tell from your comment I shouldn't use numpy?
> I also don't see how to get the value of an element specified by (row,
> column) from a numpy_array like "array_lines" in my original code.
> 
> All I need is a little python-example reading a file with e.g. three lines
> with three numbers per line and putting those numbers as floats in a
> 3x3-numpy_array, then selecting an element from that numpy_array using
> its row and column-number.

I'll try, too, but be warned that I'm using the same methodology as Steven.
Try to replicate every step in the following exploration.

First let's make sure we start with the same data:

$ cat test.dat
1.0 2 3
4 5 6.0
7 8 9

Then fire up the interactive interpreter:

$ python
Python 2.7.5+ (default, Feb 27 2014, 19:37:08) 
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> lines = [line.strip() for line in open("test.dat")]
>>> lines
['1.0 2 3', '4 5 6.0', '7 8 9']

As you can see, lines is a list of three strings.
Let's break these strings into parts:

>>> cells = [line.split() for line in lines]
>>> cells
[['1.0', '2', '3'], ['4', '5', '6.0'], ['7', '8', '9']]

We now have a list of lists of strings and you can address individual items 
with

>>> cells[1][2]
'6.0'

What happens when we pass this list of lists of strings to the numpy.array()
constructor?

>>> a = numpy.array(cells)
>>> a
array([['1.0', '2', '3'],
       ['4', '5', '6.0'],
       ['7', '8', '9']], 
      dtype='|S3')
>>> a[1,2]
'6.0'

It sort of works, but the array entries are strings rather than floating 
point numbers. Let's fix that:

>>> a = numpy.array(cells, dtype=float)
>>> a
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])
>>> a[1,2]
6.0
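
To see why the original script raised an IndexError, it may help to compare
the shapes of the two arrays side by side. The following is only a sketch: it
embeds the three lines of test.dat as a literal instead of reading the file,
the names as_strings and as_floats are mine, and the print calls are written
so that they also run under Python 3.

```python
import numpy

# stand-in for the stripped lines read from test.dat
lines = ["1.0 2 3", "4 5 6.0", "7 8 9"]

# one string per row: numpy builds a 1-D array of three strings
as_strings = numpy.array(lines)
print(as_strings.shape)

# split each row into cells and convert: a 3x3 array of floats
as_floats = numpy.array([line.split() for line in lines], dtype=float)
print(as_floats.shape)
print(as_floats[1, 2])

# two indices on a 1-D array are one too many
try:
    as_strings[1, 1]
except IndexError as err:
    print("IndexError: %s" % err)
```

The first array has only one axis, so a lookup with (row, column) cannot
work; the second has two axes, and the same kind of lookup succeeds.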

OK, now we can put the previous steps into a script:

$ cat tmp.py
import numpy
cells = [line.split() for line in open("test.dat")]
a = numpy.array(cells, dtype=float)
print a[1, 2]

Run it:
$ python tmp.py
6.0
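
If you want to reuse these steps, they could be wrapped in a small function.
Again just a sketch under my own assumptions: read_matrix is a name I made
up, the with statement is added so the file is closed explicitly, and blank
lines are skipped. The demo writes its own copy of the data to a temporary
file so it does not depend on test.dat being present, and it is written to
run under Python 3 as well.

```python
import os
import tempfile
import numpy

def read_matrix(path):
    """Read whitespace-separated numbers into a 2-D float array."""
    with open(path) as f:
        # skip blank lines so they don't turn into empty rows
        cells = [line.split() for line in f if line.strip()]
    return numpy.array(cells, dtype=float)

# write the sample data to a temporary file
fd, path = tempfile.mkstemp(suffix=".dat")
with os.fdopen(fd, "w") as f:
    f.write("1.0 2 3\n4 5 6.0\n7 8 9\n")

a = read_matrix(path)
os.remove(path)

# indices count from zero: row 1, column 2 is the 2nd row, 3rd column
print(a[1, 2])
```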

Seems to work. But reading a 2D array from a file really looks like a common  
task -- there should be a library function for that:

$ python
Python 2.7.5+ (default, Feb 27 2014, 19:37:08) 
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.loadtxt("test.dat")
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.]])
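
The result of numpy.loadtxt() is an ordinary float array, so picking an
element by row and column works as before. A short sketch, assuming Python 3
and using io.StringIO as a stand-in for test.dat (loadtxt also accepts
file-like objects):

```python
import io
import numpy

# file-like stand-in for test.dat
data = io.StringIO(u"1.0 2 3\n4 5 6.0\n7 8 9\n")

a = numpy.loadtxt(data)
print(a.shape)

# indices count from zero
print(a[1, 2])  # 2nd row, 3rd column
print(a[1, 1])  # 2nd row, 2nd column, the element the original script asked for
```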





