[Numpy-discussion] How to start at line # x when using numpy.memmap

Warren Weckesser warren.weckesser at enthought.com
Fri Aug 19 11:23:34 EDT 2011


On Fri, Aug 19, 2011 at 10:09 AM, Jeremy Conlin <jlconlin at gmail.com> wrote:

> On Fri, Aug 19, 2011 at 8:01 AM, Brent Pedersen <bpederse at gmail.com>
> wrote:
> > On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin <jlconlin at gmail.com>
> wrote:
> >> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen <pav at iki.fi> wrote:
> >>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote:
> >>>> I would like to use numpy's memmap on some data files I have. The
> first
> >>>> 12 or so lines of the files contain text (header information) and the
> >>>> remainder has the numerical data. Is there a way I can tell memmap to
> >>>> skip a specified number of lines instead of a number of bytes?
> >>>
> >>> First use standard Python I/O functions to determine the number of
> >>> bytes to skip at the beginning and the number of data items. Then pass
> >>> in `offset` and `shape` parameters to numpy.memmap.
> >>
> >> Thanks for that suggestion. However, I'm unfamiliar with the I/O
> >> functions you are referring to. Can you point me to the
> >> documentation?
> >>
> >> Thanks again,
> >> Jeremy
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >
> > this might get you started:
> >
> >
> > import numpy as np
> >
> > # make some fake data with 12 header lines.
> > with open('test.mm', 'wb') as fhw:
> >    # binary mode, so the text header and the raw array bytes share one file
> >    fhw.write(b"\n".join(b'header' for i in range(12)) + b"\n")
> >    np.arange(100, dtype=np.uint).tofile(fhw)
> >
> > # use normal python io to determine the offset after 12 lines.
> > with open('test.mm', 'rb') as fhr:
> >    for i in range(12): fhr.readline()
> >    offset = fhr.tell()
> >
> > # use the offset in your call to np.memmap.
> > a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset)
>
> Thanks, that looks good. I tried it, but it doesn't return the correct
> data, and I don't understand what is going on. A short script and
> sample data are attached if anyone has a chance to look at them.
>


Your data file is all text.  memmap is generally for binary data; it won't
work with this file.
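
For a text file like this, one possible approach is to let NumPy parse it
and skip the header by line count. The sketch below is only an illustration:
the file names, the two-column layout, and the 12-line header are
assumptions, since the attached sample file is not reproduced here. It uses
np.loadtxt with skiprows, then (optionally) saves the parsed array in binary
.npy form so that later runs can memory-map it.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the real data file: 12 text header lines
# followed by whitespace-separated numeric columns.
workdir = tempfile.mkdtemp()
text_path = os.path.join(workdir, "sample_data.txt")
with open(text_path, "w") as fh:
    fh.write("\n".join("header" for _ in range(12)) + "\n")
    for i in range(5):
        fh.write("%d %f\n" % (i, i * 0.5))

# Skip the header by number of lines rather than number of bytes.
data = np.loadtxt(text_path, skiprows=12)

# Optional: store the parsed values as binary once, then memory-map
# that binary file on later runs instead of re-parsing the text.
npy_path = os.path.join(workdir, "sample_data.npy")
np.save(npy_path, data)
mm = np.load(npy_path, mmap_mode="r")
```

The conversion step only pays off if the file is read many times or is too
large to hold in memory comfortably; for a one-off read, loadtxt alone is
enough.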

Warren



>
> Thanks,
> Jeremy