Reading plain text file database tables

Gordon McMillan gmcm at hypernet.com
Thu Sep 9 08:26:32 EDT 1999


I think you'll find a regex that does this on Hans Nowak's snippets 
page, wherever that is at the moment (he just posted a notice that 
it had moved in the last few days).

Just found it at http://www.hvision.nl/~ivnowa/snippets/source/67.py
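
On the [^"\\] question quoted below: the character class only has to
handle the ordinary characters one at a time. The two-character escape
is covered by the other alternative, \\. (a backslash followed by any
one character), so \" is consumed as a unit and never closes the
string. Here is a minimal sketch of extending Stephan's pattern to
split whitespace-delimited lines. The names QUOTED, FIELD and
split_fields are made up for illustration, and this is not necessarily
what the snippet at the URL above does.

```python
import re

# Stephan's pattern (with a non-capturing group): an opening quote, then
# any run of either one character that is neither a quote nor a backslash,
# or a backslash followed by any one character, then a closing quote.
QUOTED = r'"(?:[^"\\]|\\.)*"'

# Assumed extension: a field is a quoted string or a run of non-whitespace.
FIELD = re.compile(QUOTED + r'|\S+')

def split_fields(line):
    """Split one whitespace-delimited line, honoring quoted fields."""
    fields = []
    for token in FIELD.findall(line):
        if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
            # Strip the outer quotes and drop the escaping backslashes.
            token = re.sub(r'\\(.)', r'\1', token[1:-1])
        fields.append(token)
    return fields

print(split_fields(r'"Peter Thomson"  25  36'))   # ['Peter Thomson', '25', '36']
print(split_fields(r'"Peter\" Thomson" 25 36'))   # ['Peter" Thomson', '25', '36']
```

Every field comes back as a string, so numeric columns still need
converting. For other delimiters (tabs, commas) the \S+ part would
have to change.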

Li Dongfeng wrote:

> Thanks for the direction. I'm also
> trying to solve this with RE. But in order
> to extract "ab \" cd" like structure,
> your [^"\\] seems puzzling, because anything
> inside [] represents only one character, not
> the two-character sequence '\"'. Do you have
> a worked-out example?
> 
> Stephan Houben wrote:
> > 
> > On Thu, 09 Sep 1999 14:15:38 +0800, Li Dongfeng
> > <mavip5 at inet.polyu.edu.hk> wrote:
> > 
> > >
> > >Do we have a module to read plain text file
> > >database tables?
> > >
> > >All the data management software, e.g. Excel,
> > >dBase, SAS, etc., supports reading and writing a table
> > >from/to a plain text file, where fields can be separated
> > >by their column position, by spaces, by tabs,
> > >by commas, etc.
> > >
> > >How can we read this kind of file into a matrix like
> > >structure(list of lists)? I have written one reading
> > >files with fixed-width fields. For delimited files,
> > >simply using string.split works most of the time, but
> > >fails on lines like
> > >
> > >  "Peter Thomson"  25  36
> > >
> > >or even
> > >
> > >  "Peter\" Thomson" 25 36
> > >
> > >I think this is a common task, so maybe someone has
> > >already given a very good solution.
> > 
> > Use regular expressions.
> > A regular expression that matches strings within "..",
> > while escaping " with \, is:
> > 
> > r = re.compile(r'"([^"\\]|(\\.))*"')
> > 
> > Extend this to get what you want.
> > If you're very worried about speed, create
> > a lexer using (f)lex, and then call the generated
> > C code from python.
> > 
> > Greetings,
> > 
> > Stephan

- Gordon



