Is Python Suitable for Large Find & Replace Operations?

Paul Rubin
Mon Jun 13 14:28:42 EDT 2005


rbt <rbt at athop1.ath.vt.edu> writes:
> Now, management would like the IT guys to go thru the old data and
> replace as many SSNs with the new ID numbers as possible. You have a
> tab delimited txt file that maps the SSNs to the new ID numbers. There
> are 500,000 of these number pairs. What is the most efficient way  to
> approach this? I have done small-scale find and replace programs
> before, but the scale of this is larger than what I'm accustomed to.

Just use an ordinary Python dict for the map, on a system with enough
RAM (maybe 100 bytes or so for each pair, so 500,000 pairs would come
to about 50 MB). Then it's simply a matter of scanning through the
input files to find SSNs and looking them up in the dict.
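
A rough sketch of that approach follows. It is only an illustration, not
a tested production script: the function names are made up for this
example, and it assumes that SSNs appear in the data either as
123-45-6789 or as nine bare digits, and that the map file has the SSN in
the first column and the new ID in the second.

```python
import csv
import re

# Assumption: SSNs look like 123-45-6789 or 123456789. Adjust as needed.
SSN_RE = re.compile(r"\b(\d{3})-?(\d{2})-?(\d{4})\b")


def load_map(path):
    """Read a tab-delimited file of SSN<TAB>new_id pairs into a dict."""
    mapping = {}
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) >= 2:
                # Store keys as bare digits so both SSN spellings match.
                mapping[row[0].replace("-", "").strip()] = row[1].strip()
    return mapping


def replace_ssns(text, mapping):
    """Replace each SSN that is in the map; leave unknown ones alone."""
    def swap(match):
        key = "".join(match.groups())
        return mapping.get(key, match.group(0))
    return SSN_RE.sub(swap, text)


def process_file(src_path, dst_path, mapping):
    """Stream one file line by line so large inputs never sit in memory."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(replace_ssns(line, mapping))
```

The dict is built once and each lookup is a constant-time hash probe, so
the run time is dominated by reading and writing the data files rather
than by the 500,000 pairs.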


