Extracting values from text file
Mirco Wahab
wahab at chemie.uni-halle.de
Sat Jun 17 08:20:44 EDT 2006
Thus spoke Mirco Wahab (on 2006-06-16 21:21):
> I used your example just to try that in python
> (i have to improve my python skills), but waved
> the white flag after realizing that there's no
> easy string/var-into-string interpolation.
I gave it another try, using all the Python
resources available to me (and several cups of coffee)
;-)
This scans your text using the rules provided,
extracts the variable names and their values,
and prints them at the end.
I ran into some issues with Python along the way:
- no comment (#) allowed after a line continuation (\)
- regular expressions **** **** (as I said before)
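On the first point, a small sketch of one workaround: inside parentheses Python continues lines implicitly, so no backslash is needed and every line can carry its own comment. The sample string and the names `text`, `match` and `value` are made up for illustration.

```python
import re

text = "Apples 34"  # hypothetical one-line sample

# Inside the parentheses of a call, lines continue implicitly,
# so each argument can sit on its own line with a trailing comment.
match = re.search(
    r'Apples \b(\S+?)\b',  # pattern with one capture group
    text,                  # string to scan
)
value = match.group(1)
print(value)
```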
==>
DATA = '''
An example text file:
-----------
Some text that can span some lines.
Apples 34
56 Ducks
Some more text.
0.5 g butter
-----------------''' # data must show up before usage
filter = [ # define filter table
    'Apples (apples)',
    '(ducks) Ducks',
    '(butter) g butter',
]
varname = {} # variable names to be found in filter
varscanner = r'\\b(\S+?)\\b' # expression used to extract values
example = DATA # use the example text defined above
import re
for rule in filter: # iterate over filter rules, rules will be in 'rule'
    k = re.search(r'\((.+)\)', rule) # pull out variable names ->k
    if k.group(1): # pull their values from text
        varname[k.group(1)] = \
            re.search( re.sub(r'\((.+)\)', varscanner, rule), \
                       example ).group(1) # use regex in modified 'rule'

for key, val in varname.items(): print key, "\t= ", val # print what's found
<==
I think the source is quite comprehensible
in Python, as it is in Perl - if it weren't for the
'regex issues' ;-)
Maybe some folks could have a look at it
and convert it to contemporary Python.
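As a starting point, here is one possible Python 3 rendering, offered as a sketch rather than the definitive answer. The names `rules`, `NAME`, `VALUE` and `extract` are my own choices and not part of the program above. A callable is passed to `sub()` because its return value is inserted as-is, whereas a replacement string has its backslash escapes interpreted first, which is presumably why `varscanner` above needs doubled backslashes.

```python
import re

DATA = '''
An example text file:
-----------
Some text that can span some lines.
Apples 34
56 Ducks
Some more text.
0.5 g butter
-----------------'''

rules = [  # same filter table as above (renamed to avoid shadowing filter())
    'Apples (apples)',
    '(ducks) Ducks',
    '(butter) g butter',
]

NAME = re.compile(r'\((.+)\)')  # finds the "(name)" placeholder in a rule
VALUE = r'\b(\S+?)\b'           # regex fragment that captures the value


def extract(rules, text):
    """Return {name: value} for every rule whose pattern matches the text."""
    found = {}
    for rule in rules:
        name = NAME.search(rule)
        if name is None:  # rule has no "(name)" placeholder, skip it
            continue
        # Swap the placeholder for the capturing fragment; the lambda's
        # return value is inserted without any escape processing.
        pattern = NAME.sub(lambda m: VALUE, rule)
        hit = re.search(pattern, text)
        if hit:  # keep only rules that actually match
            found[name.group(1)] = hit.group(1)
    return found


values = extract(rules, DATA)
for key, val in values.items():
    print(f"{key}\t= {val}")
```

As in the original, the rest of each rule is used as a regex unchanged, so a rule containing characters such as `.` or `+` would be interpreted as a pattern rather than as literal text.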
Below is the Perl program, modified
to correspond roughly 1:1 to the Python
source above.
Both will print (line order may vary):
butter = 0.5
apples = 34
ducks = 56
Regards & thanks in advance
Mirco
==>
#!/usr/bin/perl
use strict;
use warnings;
my @filter = ( # define filter table
    'Apples (apples)',
    '(ducks) Ducks',
    '(butter) g butter',
);
my ($v, %varname) = ( '', () ); # variable names to be found in filter
my $varscanner = qr{\b(\S+?)\b}; # expression used to extract values
my $example = do { local$/; <DATA> }; # read the appended example text,
# change <DATA> to <> for std input
for (@filter) { # iterate over filter rules, rule line will be implicit ($_)
    $v = $1 if s/\((.+)\)/$varscanner/; # pull out variable names ->$1
    $varname{$v} = $1 if $example =~ /$_/; # pull their values from text
} # by using modified regex rule $_
print map { "$_\t= $varname{$_}\n"; } keys %varname; # print what's found
__DATA__
An example text file:
-----------
Some text that can span some lines.
Apples 34
56 Ducks
Some more text.
0.5 g butter
-----------------
<==