Record separator

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Aug 27 13:24:34 EDT 2011


greymaus wrote:

> On 2011-08-26, D'Arcy J.M. Cain <darcy at druid.net> wrote:
>> On 26 Aug 2011 18:39:07 GMT
>> greymaus <greymausg at mail.com> wrote:
>>> 
>>> Is there an equivalent for the AWK RS in Python?
>>> 
>>> 
>>> as in RS='\n\n'
>>> will separate a file at two blank line intervals
>>
>> open("file.txt").read().split("\n\n")
>>
> 
> 
> Ta!.. bit awkard. :))))))

Er, is that meant to be a pun? "Awk[w]ard", as in awk-ward?

In any case, no, the Python line might be a handful of characters longer
than the AWK equivalent, but it isn't awkward. It is logical and easy to
understand. It's embarrassingly easy to describe what it does:

(open("file.txt")    # opens the file
 .read()             # reads the contents of the file
 .split("\n\n"))     # splits the text on double-newlines

The only tricky part is knowing that \n means newline, but anyone familiar
with C, Perl, AWK etc. should know that.
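
If you want something closer to a reusable tool, here is a minimal sketch rather
than a definitive recipe. The names read_records, iter_paragraphs and sample are
made up for illustration; nothing here comes from the original one-liner beyond
str.split. The first function does the same read-and-split, but drops empty
records; the second assumes records are separated by one or more blank lines and
yields them one at a time, so the whole file need not sit in memory.

```python
import io


def read_records(stream, separator="\n\n"):
    """Read all of a text stream and split it into records on `separator`."""
    text = stream.read()
    # Discard empty strings left behind by leading or repeated separators.
    return [record for record in text.split(separator) if record]


def iter_paragraphs(lines):
    """Yield blank-line-separated records lazily from an iterable of lines."""
    current = []
    for line in lines:
        if line.strip():
            # Non-blank line: it belongs to the record being built.
            current.append(line.rstrip("\n"))
        elif current:
            # Blank line ends the current record, if there is one.
            yield "\n".join(current)
            current = []
    if current:
        # Input ended without a trailing blank line.
        yield "\n".join(current)


# Hypothetical sample data standing in for the contents of file.txt.
sample = "alpha\nbeta\n\ngamma\n\ndelta\nepsilon\n"

records = read_records(io.StringIO(sample))
paragraphs = list(iter_paragraphs(io.StringIO(sample)))
```

With a real file you would probably write "with open("file.txt") as f:" and pass
f to either function, so the file is closed promptly. Note that the two differ
slightly: the plain split keeps the final newline on the last record, while the
line-by-line version strips it.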

The Python code might be "long" (but only by the standards of AWK, which can
be painfully concise), but it is simple, obvious and readable. A few extra
characters is the price you pay for making your language readable. At the
cost of a few extra key presses, you get something that you will be able to
understand in 10 years' time.

AWK is a specialist text processing language. Python is a general scripting
and programming language. They have different values: AWK values short,
concise code; Python is willing to spend a few more characters of source
code to gain readability.


-- 
Steven



