splitting a long string into a list

Cameron Walsh cameron.walsh at gmail.com
Tue Nov 28 02:53:11 EST 2006


ronrsr wrote:
> still having a heckuva time with this.
> 
> here's where it stand - the split function doesn't seem to work the way
> i expect it to.
> 
> 
> longkw1,type(longkw):   Agricultural subsidies; Foreign
> aid;Agriculture; Sustainable Agriculture - Support; Organic
> Agriculture; Pesticides, US, Childhood Development, Birth Defects;
> <type 'list'> 1
> 
> longkw.replace(',',';')
> 
> Agricultural subsidies; Foreign aid;Agriculture; Sustainable
> Agriculture - Support; Organic Agriculture; Pesticides, US, Childhood
> Development

Here you have discovered that string.replace() returns a new string and
does NOT modify the original string.  Try this for clarification:

>>> a="DAWWIJFWA,,,,,,dwadw;djwkajdw"
>>> a
'DAWWIJFWA,,,,,,dwadw;djwkajdw'
>>> a.replace(",",";")
'DAWWIJFWA;;;;;;dwadw;djwkajdw'
>>> a
'DAWWIJFWA,,,,,,dwadw;djwkajdw'
>>> b = a.replace(',',';')
>>> b
'DAWWIJFWA;;;;;;dwadw;djwkajdw'



> 
> 
>  kw = longkw.split("; ,")    #kw is now a list of len 1

Yes, because it is trying to split longkw wherever it finds the whole
string "; ," and NOT wherever it finds ";" or " " or ",".  This has been
stated before by NickV, Duncan Booth, Fredrik Lundh and Paul McGuire
amongst others. You will need to do either:

a.)

# First split on every semicolon
a = longkw.split(";")
b = []
# Then split those results on whitespace
# (the default action for string.split())
for item in a:
    # extend() keeps b a flat list of strings
    b.extend(item.split())
# Then split on commas
kw = []
for item in b:
    kw.extend(item.split(","))

or b.)

# First replace commas with spaces
longkw = longkw.replace(",", " ")
# Then replace semicolons with spaces
longkw = longkw.replace(";", " ")
# Then split on white space, (default args)
kw = longkw.split()


Note that we did:
longkw = longkw.replace(",", " ")
and not just:
longkw.replace(",", " ")


You will find that method a may give empty strings as some elements of
kw (for example, "Pesticides,".split(",") gives ['Pesticides', '']).
If so, use method b.
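
If it helps to see the difference side by side, here is a small sketch.
The sample string is a shortened version of your data, and the names
sample, whole and words are only for illustration:

```python
# Shortened sample of the keyword data quoted above (illustrative only).
sample = "Agricultural subsidies; Foreign aid;Agriculture; Pesticides, US"

# split() looks for the entire separator "; ," as one unit.
# That exact sequence never occurs here, so nothing is split.
whole = sample.split("; ,")
print(len(whole))

# Method b: turn every delimiter into a space, then split on whitespace.
words = sample.replace(",", " ").replace(";", " ").split()
print(words)
```

Note that method b splits on every space, so a phrase such as
"Foreign aid" comes back as two separate elements.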


Finally, if you have further problems, please please do the following:

1.)  Provide your input data clearly, exactly as you have it.
2.)  Show exactly what you want the output to be, including any special
cases.
3.)  If something doesn't work the way you expect it to, tell us how you
expect it to work so we know what you mean by "doesn't work how I expect
it to"
4.)  Read all the replies carefully and if you don't understand the
reply, ask for clarification.
5.)  Read the help functions carefully - what the input parameters have
to be and what the return value will be, and whether or not it changes
the parameters or original object.  Strings are NOT mutable in Python, so
any functions that operate on strings return the result as a new
string and leave the original string intact.
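
A quick way to convince yourself of the last point is the sketch below
(the names s and t are made up for the example):

```python
s = "keywords"

# Methods such as upper() hand back a new string; s itself is untouched.
t = s.upper()
print(s)
print(t)

# Assigning to a position in the string is refused with a TypeError,
# because the string object cannot be changed in place.
try:
    s[0] = "K"
except TypeError as err:
    print("TypeError: %s" % err)
```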

I really hope this helps,

Cameron.


