NEWBIE: Tokenize command output

Tim Chase python.list at tim.thechases.com
Fri May 12 10:28:28 EDT 2006


> I reeducated my fingers after having troubles with huge files !-)

I'll keep it in mind...the prospect of future trouble with 
large files is a good kick-in-the-pants to remember.

>>Otherwise, just to be informed, what advantage does rstrip() have over
>>[:-1] (if the two cases are considered uneventfully the same)?
> 
> 1/ if your line doesn't end with a newline, line[:-1] will still remove
> the last character.

Good catch.  Most *nix editors are smart about having a 
trailing NL character at the end of the file, but some 
Windows text-editors aren't so kind.
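
A minimal sketch of that first point, using made-up sample strings 
(the variable names are only for illustration):

```python
# Hypothetical sample data: the final line of a file may or may not
# end with a newline, depending on the editor that wrote it.
with_newline = "last line\n"
without_newline = "last line"

# Slicing blindly drops the final character, newline or not.
sliced_ok = with_newline[:-1]         # newline removed, as intended
sliced_bad = without_newline[:-1]     # the final "e" is lost

# rstrip("\n") removes trailing newlines only, and is a no-op
# when there is nothing to remove.
stripped_ok = with_newline.rstrip("\n")
stripped_also_ok = without_newline.rstrip("\n")
```

Both rstrip() results come out the same, while the slice quietly 
eats a real character in the second case.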

> 2/ IIRC, if you don't use universal newline and the file uses the
> DOS/Windows newline convention, line[:-1] will not remove the CR - only
> the LF (please someone correct me if I'm wrong here).

To get this behavior, I think you have to open the file in 
binary mode.  To me, opening as binary is a signal that I 
should be using read() rather than readlines() (or 
xreadlines, or the iterator, or whatever).  If you've opened 
in binary mode, you might have to use rstrip("\r\n") to get 
both possible line-ending characters.
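
As a rough illustration (the sample string is invented, and whether 
the CR actually survives depends on the platform and on how the file 
was opened), this is what the two approaches do with a line that 
still carries a DOS/Windows line ending:

```python
# Hypothetical line whose CR LF pair was not translated on reading.
dos_line = "some output\r\n"
unix_line = "some output\n"

# The slice removes only the LF and leaves the CR behind.
sliced = dos_line[:-1]

# rstrip() treats its argument as a set of characters, so "\r\n"
# strips any trailing run of CR and/or LF characters.
stripped_dos = dos_line.rstrip("\r\n")
stripped_unix = unix_line.rstrip("\r\n")
```

Because the argument is a set of characters rather than a literal 
suffix, the same call handles both conventions.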

> I know this may not be a real issue in the actual case, but using
> rstrip() is still a safer way to go IMHO - think about using this same
> code to iterate over a list of strings without newlines...

Makes sense.  Using rstrip("\r\n") has all the benefits of 
[:-1], and it more gracefully handles cases where a newline 
is missing or is made up of two (or more) characters.  Got 
it!  Thanks for the explanation.
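
For what it's worth, here is a small sketch of how the same code 
could serve both a file and a plain list of strings.  The helper 
name strip_newlines is my own invention, not anything from the 
standard library:

```python
def strip_newlines(lines):
    """Yield each line without its trailing CR/LF characters.

    'lines' can be any iterable of strings: an open file, the
    result of readlines(), or a list with no newlines at all.
    """
    for line in lines:
        yield line.rstrip("\r\n")


# Hypothetical mixed input: Unix ending, DOS ending, no ending,
# and a line with trailing spaces that should be preserved.
sample = ["alpha\n", "beta\r\n", "gamma", "delta  \n"]
cleaned = list(strip_newlines(sample))
```

Note that a bare rstrip() with no argument would also remove the 
trailing spaces from the last item, which may or may not be what 
you want.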

-tkc





