Question on Python Split

MRAB python at mrabarnett.plus.com
Sun Oct 7 16:01:07 EDT 2012


On 2012-10-07 20:30, subhabangalore at gmail.com wrote:
> Dear Group,
>
> Suppose I have a string as,
>
> "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>
> I am terming it as,
>
> str1= "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
>
> I am working now with a split function,
>
> str_words=str1.split()
> so, I would get the result as,
> ['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
>
> But I am looking for,
>
> ['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone']
>
> This can be done if we assign the string as,
>
> str1= "Project Gutenberg, has 36000, free ebooks, for Kindle, Android iPad, iPhone,"
>
> and then assign the split statement as,
>
> str1_word=str1.split(",")
>
> would produce,
>
> ['Project Gutenberg', ' has 36000', ' free ebooks', ' for Kindle', ' Android iPad', ' iPhone', '']
>
It can also be done like this:

 >>> str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
 >>> # Splitting into words:
 >>> s = str1.split()
 >>> s
['Project', 'Gutenberg', 'has', '36000', 'free', 'ebooks', 'for', 'Kindle', 'Android', 'iPad', 'iPhone.']
 >>> # Using slicing with a stride of 2 gives:
 >>> s[0 : : 2]
['Project', 'has', 'free', 'for', 'Android', 'iPhone.']
 >>> # Similarly for the other words gives:
 >>> s[1 : : 2]
['Gutenberg', '36000', 'ebooks', 'Kindle', 'iPad']
 >>> # Combining them in pairs, and adding an extra empty string in case there's an odd number of words:
 >>> [(x + ' ' + y).rstrip() for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
['Project Gutenberg', 'has 36000', 'free ebooks', 'for Kindle', 'Android iPad', 'iPhone.']
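
The steps above can be wrapped in a small helper. This is only a sketch of the same slicing-and-zip idea; the function name pair_words is made up for illustration and is not part of any library.

```python
def pair_words(text):
    """Group the words of text two at a time, joined by a single space."""
    words = text.split()
    firsts = words[0::2]            # 1st, 3rd, 5th, ... word
    seconds = words[1::2] + ['']    # 2nd, 4th, ... word, padded for an odd count
    # zip() stops at the shorter list, so the padding is ignored
    # when the number of words is even.
    return [(x + ' ' + y).rstrip() for x, y in zip(firsts, seconds)]


str1 = "Project Gutenberg has 36000 free ebooks for Kindle Android iPad iPhone."
pairs = pair_words(str1)
```

Note that the final full stop stays attached to the last word. If it is not wanted, calling pair_words(str1.rstrip('.')) removes it before splitting.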

> My objective generally is achieved, but I want to convert each group here in tuple so that it can be embedded, like,
>
> [(Project Gutenberg), (has 36000), (free ebooks), (for Kindle), ( Android iPad), (iPhone), '']
>
> as I see if I assign it as
>
> for i in str1_word:
>         print i
>         ti=tuple(i)
>         print ti
>
> I am not getting the desired result.
>
> If I work again from tuple point, I get it as,
> >>> tup1=('Project Gutenberg')
> >>> tup2=('has 36000')
> >>> tup3=('free ebooks')
> >>> tup4=('for Kindle')
> >>> tup5=('Android iPad')
> >>> tup6=tup1+tup2+tup3+tup4+tup5
> >>> print tup6
> Project Gutenberghas 36000free ebooksfor KindleAndroid iPad
>
It's the comma that makes the tuple, not the parentheses, except for the 
empty tuple which is just empty parentheses, i.e. ().
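
A short illustration of that rule, and of why tuple(i) in the loop did not give the expected result (the variable names here are invented for the example):

```python
not_a_tuple = ('Project Gutenberg')    # parentheses only: still a plain string
one_tuple = ('Project Gutenberg',)     # the trailing comma makes a 1-tuple
also_tuple = 'has 36000',              # the parentheses are optional
empty = ()                             # the one exception: the empty tuple

# Adding strings concatenates text; adding tuples concatenates items.
joined_text = not_a_tuple + 'has 36000'
combined = one_tuple + also_tuple

# tuple() iterates over its argument, so a string is split into characters.
chars = tuple('has')
```

Here joined_text is the string 'Project Gutenberghas 36000', combined is the tuple ('Project Gutenberg', 'has 36000'), and chars is ('h', 'a', 's').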

> Then how may I achieve it? If any one of the learned members can kindly guide me.

 >>> [((x + ' ' + y).rstrip(), ) for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
[('Project Gutenberg',), ('has 36000',), ('free ebooks',), ('for Kindle',), ('Android iPad',), ('iPhone.',)]

Is this what you want?

If you want it to be a list of pairs of words, then:

 >>> [(x, y) for x, y in zip(s[0 : : 2], s[1 : : 2] + [''])]
[('Project', 'Gutenberg'), ('has', '36000'), ('free', 'ebooks'), ('for', 'Kindle'), ('Android', 'iPad'), ('iPhone.', '')]



