how to fast processing one million strings to remove quotes

Tim Daneliuk info at tundraware.com
Thu Aug 3 17:43:09 EDT 2017


On 08/02/2017 10:05 AM, Daiyue Weng wrote:
> Hi, I am trying to remove extra quotes from a large set of strings (a
> list of strings). Each original string looks like this:
> 
> """str_value1"",""str_value2"",""str_value3"",1,""str_value4"""
> 
> 
> I would like to remove the start and end quotes and the extra pairs of
> quotes around each string value, so the result will look like this:
> 
> "str_value1","str_value2","str_value3",1,"str_value4"

<SNIP>

This part can also be done fairly efficiently with sed:

time cat hugequote.txt | sed 's/"""/"/g;s/""/"/g' >/dev/null

real    0m2.660s
user    0m2.635s
sys     0m0.055s

hugequote.txt is a file with 1M copies of your test string above in it.
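
For anyone wanting to reproduce the timing, a comparable input file could be generated along these lines. This is only a sketch: the function name write_test_file and the one-copy-per-line layout are assumptions, since the post does not say how hugequote.txt was built.

```python
# The test string from the original question.
SAMPLE = '"""str_value1"",""str_value2"",""str_value3"",1,""str_value4"""'


def write_test_file(path, copies=1_000_000, line=SAMPLE):
    """Write `copies` identical lines to `path` (hypothetical helper).

    Assumption: one copy of the test string per line.
    """
    with open(path, "w") as fh:
        for _ in range(copies):
            fh.write(line + "\n")
```

Calling write_test_file("hugequote.txt") would then produce a file with one million lines.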

This was run on a quad-core i5 under FreeBSD 10.3-STABLE.
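
For comparison, the same two substitutions can be written in Python with str.replace, applied in the same order as the sed script (triple quotes first, then doubled quotes). This is a minimal sketch, not a benchmarked alternative; the name strip_extra_quotes is made up for illustration.

```python
def strip_extra_quotes(line):
    """Collapse runs of quotes the same way the sed script does."""
    # Order matters: handle the triple quotes at the ends of the line
    # before collapsing the doubled quotes around each value.
    return line.replace('"""', '"').replace('""', '"')


sample = '"""str_value1"",""str_value2"",""str_value3"",1,""str_value4"""'
cleaned = strip_extra_quotes(sample)
print(cleaned)
```

For a list of strings, a list comprehension such as [strip_extra_quotes(s) for s in strings] applies it to each element.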


