[Tutor] handling string!!

Daniel Ehrenberg littledanehren at yahoo.com
Thu Oct 23 16:06:17 EDT 2003


> Not exactly.  If your string always has the form
> you described,
> then there is an easy possibility.
> 
> You could transform the string into a list with the
> `split()' method and
> use `"' as the delimiter.
> 
> In [7]: s = 'ABC   value="123" value="345"'
> 
> In [8]: s.split('"')
> Out[8]: ['ABC   value=', '123', ' value=', '345', '']
> 
> 
> Now the only problem is to grab every second element
> (indices 1, 3, ...) from the
> list.
> 
> In [9]: sp = s.split('"')
> 
> In [10]: for i in range(1, len(sp), 2):
>    ....:     print sp[i]
>    ....:
> 123
> 345
> 
> 
> That's a bit ugly, and in Python 2.3 the
> itertools module gives us a
> better alternative.
> 
> In [17]: from itertools import islice
> 
> In [18]: for i in islice(sp, 1, None, 2):
>    ....:     print i
>    ....:     
> 123
> 345
> 
> 
> 
>    Karl
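
As a further variation on the quoted approach, an extended slice can pick out the odd-indexed pieces directly, without a loop. This is only a minimal sketch: the function name quoted_values is made up for illustration, and it assumes, as above, that the double quotes in the string are always balanced.

```python
def quoted_values(s):
    """Return the pieces of s that sit between double quotes."""
    pieces = s.split('"')
    # Pieces at odd indices (1, 3, ...) are the ones inside the quotes.
    return pieces[1::2]

print(quoted_values('ABC   value="123" value="345"'))  # ['123', '345']
```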

I have a somewhat related question. I am trying to
write a program to parse the simple markup language
used at Wikipedia.org. For this specific question, the
markup is the same as in MoinMoin.

The first feature is bolding and italics. Since I'm
using XHTML, I'll use the <strong> and <em> tags
instead of <b> and <i>. Here are some examples of
correctly parsed text from this part of the markup
language:

'''bold''' -> <strong>bold</strong>
''italics'' -> <em>italics</em>
'''''bold and italics''''' -> <strong><em>bold and italics</em></strong>
'''''b & i'' b''' -> <strong><em>b & i</em> b</strong>
'''''b & i''' i'' -> <em><strong>b & i</strong> i</em>

and so on. If the tags couldn't be mixed as they are
in the last example, the code would be relatively
simple:

>>> from itertools import islice
>>> list2parse = initialstring.split("'''")
>>> state = True
>>> parsedlist = [list2parse[0]]  # text before the first ''' is never bold
>>> for i in islice(list2parse, 1, None):
...     if state:
...         parsedlist.append("<strong>" + i + "</strong>")
...     else:
...         parsedlist.append(i)
...     state = not state

would parse the bold parts of the text. The code for
italics and for the combination of bold and italics
would be similar, handling the markers with the most
apostrophes first and those with the fewest last
(i.e. first bold and italics, then bold, then
italics). However, I don't see how I could do the same
with the fourth and fifth examples. Could you help me
with that?
Daniel Ehrenberg
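
One possible way to handle the mixed cases, offered only as a sketch and not as how Wikipedia or MoinMoin actually do it: split the text on runs of apostrophes, keep a small stack of the tags that are currently open, and delay the choice of nesting order for an opening ''''' until the next marker shows which tag closes first. The function name convert_emphasis, the treatment of runs of four or of more than five apostrophes, and the automatic closing of tags left open at the end are all assumptions made for this illustration. It is written for a current Python 3 interpreter and does not escape characters such as &.

```python
import re

def convert_emphasis(text):
    """Convert ''italic'' / '''bold''' wiki markup in text to XHTML tags."""
    out = []        # output fragments, joined at the end
    stack = []      # currently open tags, outermost first
    pending = None  # index in `out` of a ''''' opener whose nesting is undecided

    # The capturing group makes re.split keep the apostrophe runs:
    # even indices are plain text, odd indices are runs of 2+ apostrophes.
    for index, piece in enumerate(re.split(r"('{2,})", text)):
        if index % 2 == 0:
            out.append(piece)
            continue
        n = len(piece)
        if n == 4:              # assumption: one literal apostrophe, then bold
            out.append("'")
            n = 3
        elif n > 5:             # assumption: extra apostrophes are literal text
            out.append("'" * (n - 5))
            n = 5
        if pending is not None:
            # The tag that closes first has to be the innermost one.
            stack = ["em", "strong"] if n == 3 else ["strong", "em"]
            out[pending] = "".join("<%s>" % tag for tag in stack)
            pending = None
        if n == 5:
            if not stack:
                pending = len(out)      # decide the order at the next marker
                out.append("")
            elif len(stack) == 2:
                out.append("</%s></%s>" % (stack[1], stack[0]))
                stack = []
            else:                       # one tag open: close it, open the other
                other = "em" if stack[0] == "strong" else "strong"
                out.append("</%s><%s>" % (stack[0], other))
                stack = [other]
            continue
        tag = "em" if n == 2 else "strong"
        if tag not in stack:
            stack.append(tag)
            out.append("<%s>" % tag)
        elif stack[-1] == tag:
            stack.pop()
            out.append("</%s>" % tag)
        else:
            # Closing the outer tag: close and reopen the inner one so the
            # result stays properly nested.
            inner = stack[-1]
            out.append("</%s></%s><%s>" % (inner, tag, inner))
            stack = [inner]
    if pending is not None:             # an unclosed ''''' at the end of input
        stack = ["strong", "em"]
        out[pending] = "<strong><em>"
    out.extend("</%s>" % tag for tag in reversed(stack))
    return "".join(out)

print(convert_emphasis("'''''b & i'' b'''"))
print(convert_emphasis("'''''b & i''' i''"))
```

The two calls at the end correspond to the fourth and fifth examples in the message. Delaying the decision for ''''' is what lets both of them come out with the nesting shown there; opening <strong><em> unconditionally would still give well-formed output, but the fifth example would then need an extra close-and-reopen of <em>.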
