[Tutor] Suggestions as to how to read a file in paragraphs

Alan Gauld alan.gauld at blueyonder.co.uk
Wed Sep 1 23:59:00 CEST 2004


> Depending on the biological sequence files, the paragraph separator could
> be a "//" or newlines or any string really. I know in perl there is an
> input record separator which by default is the newline and one can specify
> this to a specific delimiter. Is there one in python?

OK, awk can do that trick too. Unfortunately I don't know of any
way to do it in Python. I think you need to do one of three things:
check each line in a for/readline() loop, use a regex to extract the
paras with findall() (which reads the whole file into memory, so it
brings more memory issues with it), or apply split() repeatedly to
the string returned by file.read().
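
As a rough sketch of the line-by-line approach, assuming the separator
sits on a line of its own (as "//" does in many sequence flat files),
something like the following holds only one record in memory at a time.
The name read_records and the sample data are made up for illustration.

```python
import io


def read_records(handle, separator="//"):
    """Yield one record at a time from an open text file.

    A record ends at a line whose stripped content equals `separator`.
    Passing separator="" treats blank lines as the paragraph break.
    """
    lines = []
    for line in handle:
        if line.strip() == separator:
            if lines:  # skip empty records, e.g. repeated separators
                yield "".join(lines)
            lines = []
        else:
            lines.append(line)
    if lines:  # the last record may lack a trailing separator
        yield "".join(lines)


# Illustrative data only; a real file would be opened with open(path).
sample = io.StringIO("ID one\nACGT\n//\nID two\nTTGA\n//\n")
records = list(read_records(sample))
```

Because read_records is a generator, a for loop over it can process
each record and discard it before the next one is read.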

No simple answers for big files, I'm afraid. I checked the
fileinput module and it has no way of defining a record separator
either, and it is slow anyway if speed is already an issue.
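
If the delimiter can be any string rather than a whole line, one
possibility is to read the file in fixed-size chunks and split a running
buffer, so that memory use is bounded by the chunk size plus the longest
record. This is only a sketch: split_stream and chunk_size are invented
names, and it makes no claim about speed on big files.

```python
import io


def split_stream(handle, delimiter, chunk_size=64 * 1024):
    """Yield the pieces of a text stream separated by `delimiter`.

    The stream is read `chunk_size` characters at a time. The last,
    possibly incomplete, piece is kept in the buffer until more data
    arrives, so a delimiter that straddles two chunks is still found.
    """
    buffer = ""
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:  # end of file
            break
        buffer += chunk
        parts = buffer.split(delimiter)
        buffer = parts.pop()  # may be incomplete, keep for next round
        for part in parts:
            yield part
    if buffer:  # whatever follows the final delimiter
        yield buffer


# Tiny chunk size to show that a split delimiter is handled.
pieces = list(split_stream(io.StringIO("aa//bb//cc"), "//", chunk_size=3))
```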

Alan G.


