Asking for advice: Using Python for data validation

Alex Martelli aleax at aleax.it
Tue Sep 11 08:57:36 EDT 2001


<juan.alcolea at bt.es> wrote in message
news:mailman.1000207993.9968.python-list at python.org...
    ...
"""
 I'm thinking about using python to code a set of scripts that perform some
data validation (format, completeness) of the files *before* they are sent
to us, so any error is detected as close to the source as possible (and as
far from us as possible ;-) in order to minimize this bad-data time waste.
"""
OK, good general problem statement.


"""
- Do you think that Python is a good choice for this task? Please note that
the scripts must run on very different platforms (NT, *nix, maybe Mac...).
I'm fairly new to Python, and although I'm impressed with it, I'm not sure
about it being really and easily portable unless you're a C & OS guru...
"""
It's quite portable among Win32, Unix-like systems, and maybe Mac, at
least if you just watch out for a few gotchas (time.strptime comes to
mind: it's unfortunately NOT around on Win32 Python!!!).  But any
non-portable aspects would easily emerge when you run halfway-decent
tests, anyway, and they're easy to fix.
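
As an illustration of how such a gotcha might be worked around, here is a
minimal sketch that uses time.strptime when the platform provides it and
falls back to manual parsing otherwise.  The function name parse_ymd and
the 'YYYY-MM-DD' input format are assumptions made for the example, not
anything from the original question.

```python
import time
from datetime import date


def parse_ymd(text):
    """Return (year, month, day) parsed from 'YYYY-MM-DD'.

    Raises ValueError if the text is not a valid date.
    parse_ymd is a hypothetical helper, named only for this sketch.
    """
    strptime = getattr(time, "strptime", None)
    if strptime is not None:
        # Preferred path: let the standard library parse and validate.
        parsed = strptime(text, "%Y-%m-%d")
        return parsed[0], parsed[1], parsed[2]

    # Fallback path for a platform without time.strptime:
    # split by hand, then let date() reject impossible dates.
    parts = text.split("-")
    if len(parts) != 3:
        raise ValueError("expected YYYY-MM-DD, got %r" % (text,))
    year, month, day = [int(part) for part in parts]
    date(year, month, day)  # raises ValueError for e.g. month 13
    return year, month, day
```

Running the same small set of good and bad inputs through such a helper on
every target platform is the kind of halfway-decent test that would show
any difference quickly.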


"""
- Is there any module or library specially designed for this kind of task?
(parsing text data files with fixed or variable length fields, validating
date formats, etc...)
"""
Not a single module or library, as far as I know.  Built-in objects
(such as strings and file objects) and modules (particularly regular
expressions) take you most of the way, and it's not hard to find 3rd
party modules for the rest of the task -- for date/time parsing, in
particular, I recommend eGenix's "mxDateTime" module.
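
To make the "strings plus regular expressions" approach concrete, here is
a rough sketch of validating one fixed-width record using only the standard
library.  The record layout (customer_id, name, order_date), the field
widths, and the function names are all hypothetical, invented for this
example; a real layout would come from your own file specification.

```python
import re
from datetime import date

# Hypothetical fixed-width layout: (field name, start, end, pattern).
FIELDS = [
    ("customer_id", 0, 8, re.compile(r"[0-9]{8}")),
    ("name", 8, 28, re.compile(r".*\S.*")),        # must not be blank
    ("order_date", 28, 36, re.compile(r"[0-9]{8}")),  # YYYYMMDD
]
RECORD_LENGTH = 36


def is_real_date(text):
    """Return True if an 8-digit YYYYMMDD string names a real calendar date."""
    try:
        date(int(text[0:4]), int(text[4:6]), int(text[6:8]))
    except ValueError:
        return False
    return True


def validate_line(line):
    """Return a list of error messages for one record (empty if it is valid)."""
    record = line.rstrip("\r\n")  # tolerate Unix, Windows, and Mac line ends
    if len(record) != RECORD_LENGTH:
        return ["expected %d characters, got %d" % (RECORD_LENGTH, len(record))]
    errors = []
    for name, start, end, pattern in FIELDS:
        value = record[start:end]  # plain string slicing extracts the field
        if not pattern.fullmatch(value):
            errors.append("bad %s: %r" % (name, value))
        elif name == "order_date" and not is_real_date(value):
            errors.append("impossible date in %s: %r" % (name, value))
    return errors


def validate_lines(lines):
    """Return (line number, message) pairs for every problem found."""
    problems = []
    for number, line in enumerate(lines, 1):
        for message in validate_line(line):
            problems.append((number, message))
    return problems
```

A file object can be passed straight to validate_lines, since iterating
over it yields lines.  Variable-length (delimited) fields would follow the
same pattern, with line.split(delimiter) taking the place of the slicing.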


Alex





