[Numpy-discussion] Numpy 2D array from a list error

Dave Wood davejwood at gmail.com
Wed Sep 23 12:56:20 EDT 2009


Apologies for the multiple posts, people. My posting to the forum was
pending for a long time, so I deleted it and tried emailing directly. I
didn't think they'd all be sent out.
Gökhan, thanks for the reply, I hope you get this one.

"Here I use loadtxt to read ~89 MB txt file. Can you use loadtxt and share
your results?

I[14]: data = np.loadtxt('09_03_18_07_55_33.sau', dtype='float',
skiprows=83).T

I[15]: len data
-----> len(data)
O[15]: 66

I[16]: len data[0]
-----> len(data[0])
O[16]: 117040

I[17]: whos
Variable   Type        Data/Info
--------------------------------
data       ndarray     66x117040: 7724640 elems, type `float64`, 61797120
bytes (58 Mb)



[gsever at ccn various]$ python sysinfo.py
================================================================================
Platform     :
Linux-2.6.29.6-217.2.3.fc11.i686.PAE-i686-with-fedora-11-Leonidas
Python       : ('CPython', 'tags/r26', '66714')
IPython      : 0.10
NumPy        : 1.4.0.dev
Matplotlib   : 1.0.svn
================================================================================


-- 
Gökhan"




I tried using loadtxt and got the same error as before (with a little more
information).

"

Traceback (most recent call last):
  File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 140,
in <module>
    main()
  File "/home/dwood/workspace/GeneralScripts/src/test_clab2R.py", line 45,
in main
    data = loadtxt("inputfile.txt",dtype='string')
  File
"/apps/python/2.5.4/rhel4/lib/python2.5/site-packages/numpy/lib/io.py", line
505, in loadtxt
    X = np.array(X, dtype)
ValueError: setting an array element with a sequence
"

@Christopher Barker
Thanks for the information. To fix my problem, I tried taking out the row
names (leaving only numerical information), and converting the 2D list to
floats. I still had the same problem.
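
One common cause of "setting an array element with a sequence" (not
confirmed as the cause here) is a file whose rows do not all have the same
number of fields, so the nested list cannot form a rectangular 2D array.
A minimal sketch for checking that; find_ragged_rows and the sample text
are made up for illustration and are not part of numpy:

```python
import io
from collections import Counter


def find_ragged_rows(lines, delimiter=None):
    """Return (expected_count, [(line_number, field_count), ...]).

    Hypothetical helper: the "expected" count is simply the most common
    number of fields among the non-blank lines.
    """
    counts = []
    for lineno, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped:  # ignore blank lines
            continue
        counts.append((lineno, len(stripped.split(delimiter))))
    if not counts:
        return 0, []
    expected = Counter(n for _, n in counts).most_common(1)[0][0]
    return expected, [(ln, n) for ln, n in counts if n != expected]


# Made-up sample: the second row is one field short.
sample = io.StringIO("1.0 2.0 3.0\n4.0 5.0\n6.0 7.0 8.0\n")
expected, bad = find_ragged_rows(sample)
print(expected, bad)
```

With a real file, the open file object can be passed in directly, e.g.
find_ragged_rows(open("inputfile.txt")). Any line numbers it reports are
the rows worth inspecting first.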


On 9/23/09, Christopher Barker <Chris.Barker at noaa.gov> wrote:
>
> Dave Wood wrote:
> > Well, I suppose they are all considered to be strings here. I haven't
> > tried to convert the numbers to floats yet.
>
> This could be an issue. For strings, numpy creates an array of strings,
> all of the same length, so each element is as big as the largest one:
>
> In [13]: l
> Out[13]: ['5', '34', 'this is a much longer string']
>
> In [14]: np.array(l)
> Out[14]:
> array(['5', '34', 'this is a much longer string'],
>       dtype='|S28')
>
>
> Note that each element is 28 bytes (that's what the S28 means).
>
> This means that your array would be much larger than the text file if
> you have even one long string in it. Also, as mentioned in this thread,
> in order to figure out how big to make each string element, the array()
> constructor has to scan through your entire list first, and I don't know
> how much intermediate memory it may use in that process.
>
> This really isn't how numpy is meant to be used -- why would you want a
> big ol' array of mixed numbers and strings, all stored as strings?
>
> structured arrays were meant for this, and np.loadtxt() is the easiest
> way to get one.
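
A small sketch of both points, for reference. The first part reproduces
the fixed-width behaviour using byte strings (which is what the S28 above
refers to); the second reads a structured array with np.loadtxt(). The
sample text, the field names ("name", "x", "y") and the "U8" width are
assumptions for illustration, not taken from the actual input file:

```python
import io
import numpy as np

# Fixed-width strings: every element is as wide as the longest one.
l = [b'5', b'34', b'this is a much longer string']
a = np.array(l)
print(a.dtype, a.itemsize, a.nbytes)

# Structured array: one string column for the row name, floats for the rest.
text = "rowA 1.0 2.5\nrowB 3.0 4.5\n"  # made-up sample data
dt = np.dtype([("name", "U8"), ("x", float), ("y", float)])
table = np.loadtxt(io.StringIO(text), dtype=dt)
print(table["name"], table["x"])
```

Each field keeps its own type, so the numeric columns are stored as
floats rather than as padded strings.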
>
> > I just tried preallocating the array and updating it one line at a time,
> > and that works fine.
>
> what dtype do you end up with?
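
A minimal sketch of the preallocate-and-fill approach mentioned above,
assuming purely numeric, whitespace-separated rows (the sample rows are
made up). Here the dtype is chosen up front instead of being inferred
from the data, and a row with the wrong number of fields would fail at
the assignment for that row:

```python
import numpy as np

lines = ["1.0 2.0 3.0", "4.0 5.0 6.0"]  # made-up sample rows
ncols = len(lines[0].split())

# Allocate the full array once, then fill it one row at a time.
out = np.empty((len(lines), ncols), dtype=float)
for i, line in enumerate(lines):
    out[i] = [float(field) for field in line.split()]

print(out.dtype, out.shape)
```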
>
> > This doesn't seem like the expected behaviour though and the error
> > message seems wrong.
>
> yes, not a good error message at all -- it's hard to make sure good
> errors get triggered every time!
>
>
> HTH,
>
> -Chris
>
>
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>