Text Parsing - character at a time...

John Lenton jlenton at gmail.com
Sat Jul 10 11:45:02 EDT 2004


On 9 Jul 2004 04:46:29 -0700, Fuzzyman <michael at foord.net> wrote:
> I want to parse some text and generate an output that is similar but
> not identical to the input.
> 
> The string I produce will be of similar length to the input string -
> but a bit longer.
> 
> I'm parsing character by character and adding the characters of the
> input string to the output until I come to ones I want to modify. This
> means creating a new string for every character (since strings are
> immutable) which seems very inefficient - particularly when I know
> roughly what the output length will be. In a language like c I think I
> could reserve a chunk of memory and keep a track of how much I'd
> filled... just putting characters into it.(If I filled it I could
> reserve a smaller chunk more - not difficult to keep a track of).
> What's an efficient equivalent in python ? I could use a list,
> appending characters onto the end of it.. converting to a string at
> the end using ''.join(thelist).

I'm not terribly clear on what you're trying to do, but I'm pretty
sure you can do it with regular expressions a lot more easily than the
way you're describing. You might not even need that: you might get
away with the 'replace' method on strings. Which you use depends on
the complexity of what you want to do, and on which ends up being
faster on your machine; as soon as it's more complicated than one or
two 'replace's, regular expressions usually win.
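
As a rough sketch of those options, assume (purely for illustration,
since the original post doesn't say which characters get modified) that
the job is escaping '&', '<' and '>' for HTML. The function names and
the escape table are made up for this example; the list-and-join
version from the question is included for comparison.

```python
import re

# Hypothetical task: HTML-escape three characters. The real
# transformation was never specified; this is only an illustration.
_ESCAPES = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
_PATTERN = re.compile(r"[&<>]")


def escape_with_replace(text):
    # Chained str.replace calls. '&' has to go first, otherwise the
    # '&' introduced by '&lt;' and '&gt;' would be escaped again.
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


def escape_with_re(text):
    # A single pass over the text; the callback looks up the
    # replacement for each matched character.
    return _PATTERN.sub(lambda m: _ESCAPES[m.group(0)], text)


def escape_with_list(text):
    # The character-at-a-time approach from the question: collect the
    # pieces in a list and join them once at the end.
    pieces = []
    for ch in text:
        pieces.append(_ESCAPES.get(ch, ch))
    return "".join(pieces)
```

All three should give the same result for the same input; which one is
fastest is something to measure on your own data, for example with the
'timeit' module.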

If you describe (a subset of) the problem in a bit more detail,
you'll probably get more useful suggestions (as in, code to do it, or
even docs to read to do it).

-- 
John Lenton (jlenton at gmail.com) -- Random fortune:
bash: fortune: command not found


