Regular Expression Help

John Machin sjmachin at lexicon.net
Tue Feb 26 15:29:59 EST 2008


On Feb 27, 6:28 am, Lytho... at gmail.com wrote:
> Hi All,
>
> I have a python utility which helps to generate an excel file for
> language translation. For any new language, we will generate the excel
> file which will have the English text and column for interested
> translation language. The translator  will provide the language string
> and again I will have python utility to read the excel file target
> language string and update/generate the resource file & database
> records. Our application is VC++ application, we use MS Access db.
>
> We have string table like this.
>
> "STRINGTABLE
> BEGIN
>     IDS_CONTEXT_API_ "API Totalizer Control Dialog"
>     IDS_CONTEXT         "Gas Analyzer"
> END
>
> STRINGTABLE
> BEGIN
>     ID_APITOTALIZER_CONTROL
>                             "Start, stop, and reset API volume flow
> \nTotalizer Control"
> END
> "
> this repeats.....
>
> I read the file line by line and pick the contents inside the
> STRINGTABLE.
>
> I want to use the regular expression while should give me all the
> entries with in
> STRINGTABLE
> BEGIN
> <<Get what ever put in this>>
> END
>
> I tried little bit, but no luck. Note that it is multi-line string
> entries which we cannot make as single line
>

Looks to me like you have a very simple grammar:
entry ::= id quoted_string

id is matched by r'[A-Z]+[A-Z_]+'
quoted_string is matched by r'"[^"]*"'

So a pattern which will pick out one entry would be something like
    r'([A-Z]+[A-Z_]+)\s+("[^"]*")'
Note that using \s+ (whitespace) allows for having \n etc. between id
and quoted_string.
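
As a quick check of that pattern, here is a minimal sketch. The names ENTRY_RE and sample are made up for illustration, and the sample text is a shortened version of one entry from the question, with the id and its string on different lines.

```python
import re

# Pattern from the text: an id, whitespace (which may include newlines),
# then a double-quoted string.
ENTRY_RE = re.compile(r'([A-Z]+[A-Z_]+)\s+("[^"]*")')

# Hypothetical sample: the id and the quoted string sit on separate lines.
sample = 'ID_APITOTALIZER_CONTROL\n        "Start, stop, and reset"'

match = ENTRY_RE.search(sample)
print(match.group(1))  # the id
print(match.group(2))  # the quoted string, quotes included
```

Because \s+ matches newlines as well as spaces, no re.MULTILINE or re.DOTALL flag is needed here.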

You need to build a string containing all the lines between BEGIN and
END, and then use re.findall.
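
One way this could look is sketched below. It assumes BEGIN and END each sit alone on a line, as in the sample from the question. The function name stringtable_entries and the variable rc_text are invented for the example, not part of any library.

```python
import re

ENTRY_RE = re.compile(r'([A-Z]+[A-Z_]+)\s+("[^"]*")')

def stringtable_entries(lines):
    """Return (id, quoted_string) pairs found between BEGIN and END lines."""
    entries = []
    block = None  # None while we are outside a BEGIN/END block
    for line in lines:
        word = line.strip()
        if word == "BEGIN":
            block = []
        elif word == "END" and block is not None:
            # Join the collected lines into one string, then scan it.
            entries.extend(ENTRY_RE.findall("".join(block)))
            block = None
        elif block is not None:
            block.append(line)
    return entries

# Sample input modelled on the question; \\n stands for the literal
# backslash-n that appears inside the resource string.
rc_text = '''\
STRINGTABLE
BEGIN
    IDS_CONTEXT_API_ "API Totalizer Control Dialog"
    IDS_CONTEXT         "Gas Analyzer"
END

STRINGTABLE
BEGIN
    ID_APITOTALIZER_CONTROL
                            "Start, stop, and reset API volume flow\\nTotalizer Control"
END
'''

for ident, text in stringtable_entries(rc_text.splitlines(True)):
    print(ident, text)
```

With two groups in the pattern, re.findall returns a list of (id, quoted_string) tuples. The id pattern as written accepts only upper-case letters and underscores, so ids containing digits would need the character class widened.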

If you still can't get it to work, ask again -- but do show the code
from your best attempt, and reduce ambiguity by showing your test
input as a Python expression e.g.
test1_in = """\
    ID_F "fough"
    ID_B_
        "barre"
    ID__Z
        "zotte start
                      zotte end"
"""


