Invalid format character in string

Robin Munn rmunn at pobox.com
Fri Apr 4 19:04:41 EST 2003


I rearranged your text to put the response AFTER what you're responding
to, which is the natural reading order. Please don't top-post.

Nikolai Kirsebom <nikolai.NOJUNK at micon.no> wrote:
> On Fri, 4 Apr 2003 15:59:48 -0600, Skip Montanaro <skip at pobox.com>
> wrote:
>>
>>    Nikolai> ... since the text is actually a HTML file, it may contain a
>>    Nikolai> lot of characters with % (typically cell width definitions of
>>    Nikolai> tables).  If these are contained in the file, the string format
>>    Nikolai> statement complains about illegal format character.  
>>
>>Just double any literal % characters, e.g.:
>>
>>    <table width="%(width)s%%">
>>
>>Skip
> 
> The problems arise where the HTML editor used (to produce the file in
> the first place) has put various expressions of %; and %>.  These are
> HTML codings the creator of the HTML file does not really know about.
> As an example, I've just told them (the users creating the HTML file)
> that they should write the text "%(name)s" where they want the name
> generated and "%(age)s" where they want the age generated.  Then my
> script is run for a set of person objects producing a set of HTML
> files with the correct names and corresponding ages. 

If the original HTML file is not under your control, then you probably
want to run it through a search-and-replace that doubles any % characters
meant to be taken literally, before you apply the % operator. For example:

    import re

    html_str = '... %(width)s wide, 50% off this week only! ...'
    # Double every % that is not followed by "(", leaving %(name)s intact.
    quoted_str = re.sub(r'%(?!\()', '%%', html_str)
    print(quoted_str)

This would produce:

... %(width)s wide, 50%% off this week only! ...

Notice the use of a negative look-ahead assertion (the '(?!...)' syntax)
to only match % characters that aren't followed by an open parenthesis.
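
To show the whole round trip, here is a minimal sketch that escapes a
template and then fills it in with the % operator. The function name
escape_literal_percents, the template text, and the person dictionary are
all made up for illustration; they are not from your script.

```python
import re

def escape_literal_percents(template):
    """Double every % that is not followed by an open parenthesis."""
    return re.sub(r'%(?!\()', '%%', template)

# Hypothetical template and data, for illustration only.
template = '<td width="50%">%(name)s is %(age)s years old</td>'
person = {'name': 'Alice', 'age': 30}

# Escape first, then format: the doubled %% collapses back to a single %.
result = escape_literal_percents(template) % person
print(result)
```

One caveat: this only protects % characters that are not followed by an
open parenthesis. If the HTML happens to contain a literal "%(" that is
not one of your placeholders, the format operator will still try to
interpret it.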

Again, please don't top-post; it makes multiple-person conversations
very hard to read. See http://www.caliburn.nl/topposting.html
for more reasons why bottom-posting is the accepted style.

Hope this helps!

-- 
Robin Munn <rmunn at pobox.com>
http://www.rmunn.com/
PGP key ID: 0x6AFB6838    50FF 2478 CFFB 081A 8338  54F7 845D ACFD 6AFB 6838



