[Numpy-discussion] ANN: Numpy 1.6.0 beta 2

Tue Apr 5 13:56:17 EDT 2011

On Tue, Apr 5, 2011 at 11:45 AM, <josef.pktd at gmail.com> wrote:

> On Tue, Apr 5, 2011 at 1:20 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Tue, Apr 5, 2011 at 10:46 AM, Christopher Barker
> > <Chris.Barker at noaa.gov> wrote:
> >>
> >> On 4/4/11 10:35 PM, Charles R Harris wrote:
> >> >     IIUC, "Ub" is undefined -- "U" means universal newlines, which
> >> >     makes no sense when used with "b" for binary. I looked at the
> >> >     code a ways back, and I can't remember the resolution order,
> >> >     but there isn't any checking for incompatible flags.
> >> >
> >> >     I'd expect that genfromtxt, being txt, and line oriented, should
> >> >     use 'rU'. But if it wants the raw line endings (why would it?)
> >> >     then 'rb' should be fine.
> >> >
> >> >
> >> > "U" has been kept around for backwards compatibility, the python
> >> > documentation recommends that it not be used for new code.
> >>
> >> That is for  3.*  -- the 2.7.* docs say:
> >>
> >> """
> >> In addition to the standard fopen() values mode may be 'U' or 'rU'.
> >> Python is usually built with universal newline support; supplying 'U'
> >> opens the file as a text file, but lines may be terminated by any of the
> >> following: the Unix end-of-line convention '\n', the Macintosh
> >> convention '\r', or the Windows convention '\r\n'. All of these external
> >> representations are seen as '\n' by the Python program. If Python is
> >> built without universal newline support a mode with 'U' is the same as
> >> normal text mode. Note that file objects so opened also have an
> >> attribute called newlines which has a value of None (if no newlines have
> >> yet been seen), '\n', '\r', '\r\n', or a tuple containing all the
> >> newline types seen.
> >>
> >> Python enforces that the mode, after stripping 'U', begins with 'r', 'w'
> >> or 'a'.
> >> """
> >>
> >> which does, in fact, indicate that 'Ub' is NOT allowed. We should be
> >> using 'Ur', I think. Maybe the "Python enforces" is what we saw the
> >> error from -- it didn't use to enforce anything.
> >>
> >
> > 'rbU' works and I put that in as a quick fix.
> >>
> >> On 4/5/11 7:12 AM, Charles R Harris wrote:
> >>
> >> > The 'Ub' mode doesn't work for '\r' on python 3. This may be a bug in
> >> > python, as it works just fine on python 2.7.
> >>
> >> "Ub" never made any sense anywhere -- "U" means universal newline text
> >> file. "b" means binary -- combining them makes no sense. On older
> >> pythons, the behaviour of 'Ub' was undefined -- now, it looks like it is
> >> supposed to raise an error.
> >>
> >> does 'Ur' work with \r line endings on Python 3?
> >
> > Yes.
> >
> >>
> >> According to my read of the docs, 'U' does nothing -- "universal"
> >> newline support is supposed to be the default:
> >>
> >> """
> >> On input, if newline is None, universal newlines mode is enabled. Lines
> >> in the input can end in '\n', '\r', or '\r\n', and these are translated
> >> into '\n' before being returned to the caller.
> >> """
> >>
> >> > It may indeed be desirable
> >> > to read the files as text, but that would require more work on both
> >> > loadtxt and genfromtxt.
> >>
> >> Why can't we just open the file with mode 'Ur'? text is text, messing
> >> with line endings shouldn't hurt anything, and it might help.
> >>
> >
> > Well, text in the files then gets the numpy 'U' type instead of 'S', and
> > there are places where byte streams are assumed for stripping and such.
> > Which is to say that changing to text mode requires some work. Another
> > possibility is to use a generator:
> >
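> > from numpy.compat import asbytes
> >
> > def usetext(fname):
> >     # Text mode normalizes line endings; re-encode each line as bytes.
> >     with open(fname, 'rt') as f:
> >         for l in f:
> >             yield asbytes(l)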
> >
> > I think genfromtxt could use a refactoring and cleanup, but probably not
> > for 1.6.
>
> I think it should also be possible to read "rb" and strip any \r, \r\n
> in _iotools.py; that's where the bytes are used, judging from my reading
> and the initial error message.
>
>
That doesn't work for \r: in 'rb' mode you get the whole file at once
instead of line by line.
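
To illustrate the point about \r, here is a minimal sketch for Python 3.
The helper name iter_lines_as_bytes and the choice of latin1 as the
encoding are assumptions made for this example; this is not the code in
genfromtxt or _iotools.py. It writes a small file with \r line endings,
then compares iterating over it in binary mode with iterating in text
mode and re-encoding each line, along the lines of the generator idea
quoted above.

```python
import os
import tempfile


def iter_lines_as_bytes(fname, encoding="latin1"):
    """Yield each line of a text file as bytes, with newlines normalized.

    Hypothetical helper for illustration only. latin1 is an assumption,
    chosen because it maps every byte value to a character and back.
    """
    # Text mode with the default newline=None enables universal newlines,
    # so CR, CRLF and LF line endings all arrive as a single LF.
    with open(fname, "r", encoding=encoding) as f:
        for line in f:
            yield line.encode(encoding)


# Write a small file that uses old Macintosh CR line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1 2\r3 4\r5 6\r")

# Binary mode splits only on LF, so the whole file is one "line".
with open(path, "rb") as f:
    binary_lines = list(f)

# Text mode splits on CR as well; each line is re-encoded to bytes.
text_lines = list(iter_lines_as_bytes(path))

os.remove(path)

print(binary_lines)
print(text_lines)
```

Under these assumptions binary_lines holds a single element containing
the entire file, while text_lines holds one bytes object per row, each
ending in \n.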

Chuck