File processing - is Python suitable?

Peter Otten __peter__ at web.de
Tue Jun 19 09:54:46 EDT 2007


ferrad wrote:

> I have not used Python before, but believe it may be what I need.
> 
> I have large text files containing text, numbers, and junk.  I want to
> delete large chunks, process other bits, etc., much like I'd do in an
> editor, but want to do it automatically.  I have a set of generic
> rules that my fingers follow to process these files, which all follow
> a similar template.
> 
> Question: can I translate these types of rules into programmatical
> constructs that Python can use to process these files?  Can Python do
> the trick?

Yes, and if you are a non-programmer, the entry barrier for Python is as low
as it can get. However, what a programming language treats as a rule is
much stricter than what a human being might expect. For example, appending
an 's' to the first word in a sentence is "easy" in Python, while changing
the subject's grammatical number to plural is "hard". Both are doable, but
the less technical your rules are the harder they become to translate.
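
To make the "easy" half concrete, here is a minimal sketch. The function
name pluralize_first_word is made up for illustration, and "word" is
assumed to mean everything up to the first space.

```python
def pluralize_first_word(sentence):
    """Append 's' to the first space-delimited word of a sentence."""
    # partition() splits at the first space only; 'rest' keeps the remainder.
    first, sep, rest = sentence.partition(" ")
    return first + "s" + sep + rest


print(pluralize_first_word("Dog bites man"))  # Dogs bites man
```

Note that the result is mechanically correct but grammatically wrong: the
verb still says "bites". Fixing that would require the program to know what
a subject and a verb are, which is the "hard" kind of rule.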

You often have to compromise either by proofreading the results of any
automated processing, or by having your program ask a human operator in the
cases it can't decide upon.
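
One possible shape for the "ask a human" approach is sketched below. The
rules (all-digit lines are data, blank lines are junk) and the function name
classify_line are assumptions made for illustration, not something taken
from your files.

```python
def classify_line(line, ask=input):
    """Return 'keep' or 'drop'; ask the operator when no rule applies."""
    stripped = line.strip()
    if not stripped:
        return "drop"   # hypothetical rule: blank lines are junk
    if stripped.isdigit():
        return "keep"   # hypothetical rule: pure numbers are data
    # No rule matched, so defer to a human instead of guessing.
    answer = ask(f"Keep this line? [y/n] {stripped!r} ")
    return "keep" if answer.strip().lower().startswith("y") else "drop"
```

The ask parameter defaults to the built-in input(), so the script prompts
at the terminal, but a different function can be passed in when you want to
try the rules without typing answers by hand.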

I recommend that you play around a bit in the interactive interpreter to get
a feel for the kind of operations that are easily available on strings.
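
A few string operations worth trying are shown here as a small script;
typing the same expressions at the interpreter prompt works just as well.
The sample line is invented.

```python
line = "  Total:  42 items  \n"

print(line.strip())                            # Total:  42 items
print(line.split())                            # ['Total:', '42', 'items']
print(line.strip().startswith("Total"))        # True
print(line.replace("items", "units").strip())  # Total:  42 units
print(int(line.split()[1]) + 1)                # 43
print("junk" in line)                          # False
```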

Then write the processing rules into a script, and always start your
conversion from the original data (of which you have a backup in some
locker), not some intermediate output. That way you can try processing
without losing information in the data or about the process -- until you
find the results acceptable. Make backups of your script, too, before you
try something new.
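
A script following that workflow might look roughly like this. The rule in
keep_line() (drop blank lines and lines starting with '#') is only a
placeholder for whatever your real rules turn out to be, and both function
names are made up.

```python
import os


def keep_line(line):
    """Placeholder rule: drop blank lines and lines starting with '#'."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("#")


def convert(src_path, dst_path):
    """Read the original file and write the processed result elsewhere."""
    # Never write over the input, so every run starts from the original.
    if os.path.abspath(src_path) == os.path.abspath(dst_path):
        raise ValueError("refusing to overwrite the original data")
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if keep_line(line):
                dst.write(line)
```

A call such as convert("original.txt", "converted.txt"), with file names of
your choosing, can be repeated as often as you like while you refine the
rules, because the original file is only ever read.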

Peter


