Reading plain text file database tables

Stephan Houben stephan at pcrm.win.tue.nl
Thu Sep 9 04:30:11 EDT 1999


On Thu, 09 Sep 1999 14:15:38 +0800, Li Dongfeng 
<mavip5 at inet.polyu.edu.hk> wrote:

>
>Do we have a module to read plain text file
>database tables?
>
>All data management software, e.g. Excel,
>dBase, SAS, etc., supports importing/exporting a table
>from/to a plain text file; fields can be separated
>by their column position, by spaces, by tabs,
>by commas, etc.
>
>How can we read this kind of file into a matrix-like
>structure (list of lists)? I have written one that reads
>files with fixed-width fields. For delimited files,
>simply using string.split works most of the time, but
>fails on lines like
>
>  "Peter Thomson"  25  36
>
>or even 
>
>  "Peter\" Thomson" 25 36
>
>I think this is a common task, so maybe someone has
>already given a very good solution.

Use regular expressions.
A regular expression that matches a string enclosed in
double quotes, where a " inside the string is escaped
with a backslash, is:

import re
r = re.compile(r'"([^"\\]|(\\.))*"')

Extend this with alternatives for unquoted fields and
for your delimiter to get what you want.
If you're very worried about speed, create
a lexer using (f)lex, and then call the generated
C code from Python.
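
One way the regular expression above might be extended is sketched
below. It is only an illustration, not a definitive solution: it
assumes fields are separated by whitespace, and the names TOKEN,
ESCAPE and split_fields are made up for this example. A second
alternative is added to the pattern for unquoted fields, and the
backslash escapes are removed from quoted fields afterwards.

```python
import re

# One token is either a double-quoted string (backslash escapes allowed)
# or a run of characters that are neither whitespace nor a double quote.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|([^\s"]+)')

# A backslash followed by any single character inside a quoted field.
ESCAPE = re.compile(r'\\(.)')


def split_fields(line):
    """Split one whitespace-delimited line into a list of field strings."""
    fields = []
    for match in TOKEN.finditer(line):
        quoted, bare = match.groups()
        if quoted is not None:
            # Drop the backslash from each escape, so \" becomes ".
            fields.append(ESCAPE.sub(r'\1', quoted))
        else:
            fields.append(bare)
    return fields


print(split_fields('"Peter Thomson"  25  36'))
# ['Peter Thomson', '25', '36']
print(split_fields(r'"Peter\" Thomson" 25 36'))
# ['Peter" Thomson', '25', '36']
```

Calling split_fields on every line of a file gives the list of
lists asked for. All fields come back as strings, so numeric
columns still need converting. The sketch does not report
malformed input such as an unterminated quote, and a comma- or
tab-delimited file would need a different second alternative.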

Greetings,

Stephan



