[Tutor] Mapping ID's for corresponding values in different Columns

Hugo Arts hugo.yoshi at gmail.com
Mon Jul 9 09:42:33 CEST 2012


On Sun, Jul 8, 2012 at 11:47 PM, Fred G <bayespokerguy at gmail.com> wrote:

> Hi--
>
> My current input looks like the following:
>
> FILE1.csv
> PERSON_ID    PERSON_NAME
> 1                     Jen
> 2                     Mike
> 3                     Jim
> 4
> 5                     Jane
> 6                     Joe
> 7                     Jake
>
> FILE2.csv
> PERSON_ID   PERSON_NAME
>                      Jim
>                      Mike
>                      Jane
>                      Todd
>                      Jen
>
> _________
> I want to fill into the PERSON_ID column of FILE2.csv the corresponding
> ID's associated with those names as identified in FILE1.csv.
>
> At first I imported the csv module and was using the csv.Reader, but then
> it seemed simple enough just to write something like:
> for line in file2:
>      print(line)
>
> giving me the following output:
> PERSON_ID, PERSON_NAME
> , Jim
> , Mike
> , Jane
> , Todd
> , Jen
>
> I think I understand the issue at a conceptual level, but not quite sure
> how to fully implement it:
> a) I want to build a dictionary to create keys, such that each number in
> file1 corresponds to a unique string in column B of file1.
> b) then write a for loop like the following:
> for "person_name" in file2:
>    if "person_name.file2" == "person_name.file1":
>        person_id.file2 == person_id.file1
> c) write into file2 the changes to person_id's...
>
> But it's pretty difficult for me to get past this stage. Am I on the right
> track? And more importantly, how could I learn how to actually implement
> this in smaller stages?
>
>
You're on the right track, and you're almost there! You've already broken
the problem down into steps. Now try to implement a function for each
step, and then glue those functions together into a final program.

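a) Though you don't *have* to use it, csv.reader is really quite simple,
and I'd recommend it. Try to write a function that takes a file name as
its argument and returns a dictionary of the form { name: id, name: id }
(i.e. the names are the keys). That is the reverse of the id-to-name
mapping you described, but you'll be looking up ids by name in the next
step, so the names have to be the keys.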
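
A minimal sketch of step (a), under a few assumptions that the thread
doesn't confirm: the file is comma-delimited, its first row is a header,
and the id comes before the name on each row. The function name
read_name_to_id is only illustrative.

```python
import csv


def read_name_to_id(filename):
    """Return a dict mapping each name in the file to its id."""
    name_to_id = {}
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # assumption: the first row is a header, skip it
        for row in reader:
            if len(row) < 2:
                continue  # blank or malformed line
            person_id = row[0].strip()
            name = row[1].strip()
            if name:  # skip rows that have an id but no name (like id 4)
                name_to_id[name] = person_id
    return name_to_id
```

Calling read_name_to_id("FILE1.csv") on a file like yours should give a
dictionary with names as keys and ids as string values. If two people
share a name, the later row overwrites the earlier one, so this only
works when names are unique.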

b) For this step, you first need a list of all the names in file 2. You
could use csv.reader again, or you could just parse the lines by hand.
Then, for each name, you use the dictionary to look up the corresponding
id. The end goal for this function is to return a list of lists that looks
much like the file you want to end up with:

[[id, name],
 [id, name],
 [id, name]]
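
Step (b) could be sketched roughly like this. It assumes the same
comma-delimited layout with a header row, and it makes one choice of my
own: a name that isn't in the dictionary (Todd, in your example) gets an
empty id instead of raising an error. The name fill_ids is hypothetical.

```python
import csv


def fill_ids(filename, name_to_id):
    """Return [[id, name], ...] for every name found in the file."""
    rows = []
    with open(filename, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # assumption: the first row is a header, skip it
        for row in reader:
            if len(row) < 2:
                continue  # blank or malformed line
            name = row[1].strip()
            if not name:
                continue
            # assumption: names missing from the dictionary get an empty id
            rows.append([name_to_id.get(name, ""), name])
    return rows
```

Using dict.get with a default keeps the loop going when a name is
unknown; a plain name_to_id[name] lookup would raise KeyError instead,
which may be what you want if a missing name should count as a mistake.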

c) This step should now be easy. Again, I'd recommend csv.writer, which
makes the process pretty simple. You just pass in the nested list from
step (b) and you're pretty much done.
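
A sketch of step (c), assuming you want the same two-column header as in
your input files. The name write_rows and its header parameter are
illustrative, not anything the csv module prescribes.

```python
import csv


def write_rows(filename, rows, header=("PERSON_ID", "PERSON_NAME")):
    """Write a header row followed by the [id, name] rows to a csv file."""
    # newline="" lets the csv module control line endings itself
    with open(filename, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)  # one output line per inner list
```

It's probably safest to write to a new file name rather than over
FILE2.csv, at least until you've checked that the output looks right.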

For tips on the csv module, the list of examples is pretty helpful:
http://docs.python.org/py3k/library/csv.html#examples
If you need help constructing the lists and dictionaries, my tips would
be: 1) think one row at a time, 2) the for loop is your best friend, and
3) nested lists usually mean nested loops.

HTH,
Hugo