Extracting values from text file

Mirco Wahab wahab at chemie.uni-halle.de
Sat Jun 17 08:20:44 EDT 2006


Thus spoke Mirco Wahab (on 2006-06-16 21:21):

> I used your example just to try that in python
> (i have to improve my python skills), but waved
> the white flag after realizing that there's no
> easy string/var-into-string interpolation.

I gave it another try, using all the Python
resources available to me (and several cups of coffee)
;-)

This scans your text using the rules provided,
extracts the values and variable names,
and prints them at the end.

I had some issues with Python along the way:
- no comment # after line continuation \\
- regular expressions **** **** (as I said before)
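
Both points can be shown in a few lines. This is only a sketch, assuming
a Python 3 interpreter; the names bad_source, total, rule, scanner, wrong
and right are made up for the illustration. It compiles a line that has a
comment after the continuation backslash, and it compares a plain string
replacement in re.sub with a function replacement.

```python
import re

# 1) A comment cannot follow a line-continuation backslash:
#    compiling such a line raises a SyntaxError.
bad_source = "total = 1 + \\  # a comment here\n    2\n"
try:
    compile(bad_source, "<example>", "exec")
    comment_allowed = True
except SyntaxError:
    comment_allowed = False

# Inside brackets no backslash is needed, so comments are fine.
total = (1 +   # a comment here
         2)

# 2) re.sub processes backslash escapes in a replacement *string*,
#    so r'\b' ends up as a backspace character, not a word boundary.
rule = '(ducks) Ducks'
wrong = re.sub(r'\((.+)\)', r'\b(x)\b', rule)

# A replacement *function* avoids this: its return value is
# inserted unchanged, backslashes included.
scanner = r'\b(\S+?)\b'
right = re.sub(r'\((.+)\)', lambda m: scanner, rule)

print(comment_allowed, total)
print(repr(wrong))
print(right)
```

The doubled backslashes in varscanner in the script below are the
string-replacement workaround for the second point.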

==>

DATA = '''
An example text file:
-----------
Some text that can span some lines.
  Apples 34
  56 Ducks
Some more text.
  0.5 g butter
-----------------'''       # data must show up before usage

filter = [                 # define filter table
   'Apples (apples)',
   '(ducks) Ducks',
   '(butter) g butter',
]
varname = {}                            # variable names to be found in filter
varscanner = r'\\b(\S+?)\\b'            # expression used to extract values;
                                        # doubled \\ because re.sub would
                                        # turn a plain \b into a backspace
example = DATA                          # use the example text defined above

import re
for rule in filter: # iterate over filter rules, rules will be in 'rule'
       k = re.search(r'\((.+)\)', rule) # pull out variable names ->k
       if k:                            # pull their values from text
          varname[k.group(1)] = \
               re.search( re.sub(r'\((.+)\)', varscanner, rule), \
                          example ).group(1)  # use regex in modified 'rule'

for key, val in varname.items(): print key, "\t= ", val # print what's found

<==

I think the source is quite comprehensible
in Python, as it is in Perl, if it weren't
for the 'regex issues' ;-)

Maybe some folks could have a look at it
and convert it to contemporary Python.
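
One way this might look in more recent Python, as a sketch only: it
assumes Python 3.7 or later (f-strings, insertion-ordered dicts), and the
names RULES, PLACEHOLDER, VARSCANNER and extract are made up here, not
taken from the script above. It uses a function as the replacement in
sub(), so the scanner pattern needs no doubled backslashes, and it skips
rules that do not match instead of raising an exception.

```python
import re

DATA = '''
An example text file:
-----------
Some text that can span some lines.
  Apples 34
  56 Ducks
Some more text.
  0.5 g butter
-----------------'''

RULES = [
    'Apples (apples)',
    '(ducks) Ducks',
    '(butter) g butter',
]

PLACEHOLDER = re.compile(r'\((.+)\)')   # the "(name)" part of a rule
VARSCANNER = r'\b(\S+?)\b'              # what a value looks like


def extract(text, rules):
    """Return {name: value} for every rule that matches the text."""
    found = {}
    for rule in rules:
        name_match = PLACEHOLDER.search(rule)
        if name_match is None:          # rule without "(name)": skip it
            continue
        # A function as replacement keeps the backslashes in VARSCANNER intact.
        pattern = PLACEHOLDER.sub(lambda m: VARSCANNER, rule)
        value_match = re.search(pattern, text)
        if value_match:                 # rule not found in text: skip it
            found[name_match.group(1)] = value_match.group(1)
    return found


for key, val in extract(DATA, RULES).items():
    print(f"{key}\t= {val}")
```

Since dicts keep insertion order in these Python versions, the lines come
out in the order of the rules (apples, ducks, butter).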

Below is the Perl program that was modified
to correspond roughly 1:1 to the Python
source above.

Both will print (the order of the lines may vary):
   butter 	=  0.5
   apples 	=  34
   ducks 	=  56

Regards & thanks in advance

Mirco

==>

#!/usr/bin/perl
use strict;
use warnings;

my @filter = (                 # define filter table
     'Apples (apples)',
     '(ducks) Ducks',
     '(butter) g butter',
);

my ($v, %varname) = ( '', () );       # variable names to be found in filter
my $varscanner = qr{\b(\S+?)\b};      # expression used to extract values
my $example = do { local$/; <DATA> }; # read the appended example text,
                                      # change <DATA> to <> for std input

for (@filter) { # iterate over filter rules, rule line will be implicit ($_)
    $v = $1 if s/\((.+)\)/$varscanner/;    # pull out variable names ->$1
    $varname{$v} = $1 if $example =~ /$_/; # pull their values from text
}                                          # by using modified regex rule $_

print map { "$_\t= $varname{$_}\n"; } keys %varname; # print what's found

__DATA__
An example text file:
-----------
Some text that can span some lines.
  Apples 34
  56 Ducks
Some more text.
  0.5 g butter
-----------------

<==



More information about the Python-list mailing list