Fuzzy matching of postal addresses

McBooCzech petr at tpc.cz
Sat Feb 19 21:16:59 EST 2005


Sorry for my "Ferbl" typo. For the local anti-smoking campaign I am
trying to link some addresses which contain only the following
"linkable" information (data fields):

RECORD_ID, Street + No., City, Post code

All data are now w/o Unicode characters. Do you think it is possible to
link them with Febrl w/o deep code modification?

I did try to link our data, but the result is just plenty of warning
messages and no links. What would you suggest? Please understand I do
not want to bother you with my questions. I am just asking for your
comments or pointers before I try to dig into the code. You
probably know some "tricks" in data organization or something like
that, which could be much easier than digging through the code.

I can send our CSVs to you (they are small, just about 3204 records in
the A data-set and about 1241 records in the B data-set) and a log as
well.

I have tried to organize our files as follows:
       FEBRL requirements : Our data
       ==============================
                 'rec_id': RECORD_ID
             'given_name': ""
                'surname': ""
             'street_num': ""
         'address_part_1': ""
         'address_part_2': Street + No.
                 'suburb': City
               'postcode': Post code
                  'state': ""
          'date_of_birth': ""
             'soc_sec_id': ""
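
For reference, the mapping above can be sketched as a small Python
script. This is only an illustration of the column remapping, not
anything taken from Febrl itself: the source column names (RECORD_ID,
STREET, CITY, POSTCODE), the function name remap_rows and the header
row in the output are assumptions, and whether Febrl accepts exactly
this layout is not confirmed here.

```python
import csv
import io

# Field order taken from the table above.
FEBRL_FIELDS = [
    "rec_id", "given_name", "surname", "street_num",
    "address_part_1", "address_part_2", "suburb", "postcode",
    "state", "date_of_birth", "soc_sec_id",
]

# Hypothetical source column names; adjust to the real CSV headers.
COLUMN_MAP = {
    "rec_id": "RECORD_ID",
    "address_part_2": "STREET",   # Street + No. kept together
    "suburb": "CITY",
    "postcode": "POSTCODE",
}


def remap_rows(source, target):
    """Copy rows from source to target, renaming our columns to the
    field names in the table and leaving unused fields empty.
    Returns the number of data rows written."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=FEBRL_FIELDS)
    writer.writeheader()
    count = 0
    for row in reader:
        out = {field: "" for field in FEBRL_FIELDS}
        for febrl_name, our_name in COLUMN_MAP.items():
            # Strip stray whitespace, a common cause of failed matches.
            out[febrl_name] = (row.get(our_name) or "").strip()
        writer.writerow(out)
        count += 1
    return count


if __name__ == "__main__":
    # Made-up sample record, for demonstration only.
    sample = io.StringIO(
        "RECORD_ID,STREET,CITY,POSTCODE\n"
        "1,Hlavni 12,Praha,11000\n"
    )
    result = io.StringIO()
    remap_rows(sample, result)
    print(result.getvalue())
```

The same function can be run over both the A and the B data-set so
that the two files end up with identical columns.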

Thanks for your answer and suggestions

Petr



