[Tutor] Mapping ID's for corresponding values in different Columns

Fred G bayespokerguy at gmail.com
Mon Jul 9 17:23:44 CEST 2012


Thank you guys so much.  I'm quite close now, but I'm having a bit of
trouble on the final for-loop to create the new dictionary.  I have the
following 3 functions (note I'm re-typing it from a different computer so
while the indentation will be off here, it is correct in the actual code):

#read in file as dictionary.  Name is key, id is value.
def csv_to_dict(filename):
  record = {}
  line = filename.readline()
  for line in filename:
    key = line.split(",", 1)[-1]
    val = line.split(",", 1)[:-1]
  return record
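
A note on csv_to_dict as posted: key and val are computed on every pass but never stored in record, so the function returns an empty dictionary. The [:-1] slice also makes val a one-element list rather than a string, and key keeps its trailing newline. Below is a minimal sketch of one way to fill the dictionary. It assumes each line looks like "id,name" and that rows without a name should be skipped; the sample data and the parameter name fileobj are illustrative, not from the original code.

```python
import io


def csv_to_dict(fileobj):
    """Return {name: id} built from lines shaped like 'id,name'."""
    record = {}
    fileobj.readline()  # skip the header row
    for line in fileobj:
        if "," not in line:
            continue  # ignore blank or malformed lines
        person_id, name = line.rstrip("\n").split(",", 1)
        name = name.strip()
        if name:  # assumption: rows with no name (like id 4) are skipped
            record[name] = person_id.strip()
    return record


# Illustrative data standing in for FILE1.csv
sample = io.StringIO("PERSON_ID,PERSON_NAME\n1,Jen\n2,Mike\n3,Jim\n4,\n5,Jane\n")
name_to_id = csv_to_dict(sample)
print(name_to_id)
```

Stripping whitespace matters here because a name read as "Jen\n" will never compare equal to "Jen" in a later lookup.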

#read in file2 names
def nested_line(filename):
  line = filename.readline()
  new_list = []
  for line in filename:
    name = line.split(",", 1)[-1]
    new_list.append(name)
  return new_list
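
As posted, nested_line keeps the trailing newline (and any leading space) on each name, which can make later comparisons against the file 1 names fail silently. A small sketch of the same function with the names stripped, assuming file 2 has lines shaped like ",name"; the sample data is illustrative only.

```python
import io


def nested_line(fileobj):
    """Return the list of names found in the second column."""
    fileobj.readline()  # skip the header row
    new_list = []
    for line in fileobj:
        # Take everything after the first comma and drop whitespace/newline
        name = line.split(",", 1)[-1].strip()
        if name:
            new_list.append(name)
    return new_list


# Illustrative data standing in for FILE2.csv
sample = io.StringIO("PERSON_ID,PERSON_NAME\n,Jim\n,Mike\n, Jane\n")
names = nested_line(sample)
print(names)
```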

#this is the function that I'm having trouble with
#create new dict mapping file 1 ids to file2's names
def new_dict (csv_to_dict, nested_line):
  old_dict = csv_to_dict(file1)
  old_list = nested_line(file2)
  new_dict1 = {}
  for item in old_list:
    new_dict1[item] = item
  return (new_dict1)
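
In the loop as posted, new_dict1[item] = item stores each name as its own value, which would explain why the ids come out wrong. The parameters csv_to_dict and nested_line also shadow the functions of the same names inside new_dict. One possible shape for this step is sketched below: pass in the already-built dictionary and list, and look each name up directly. build_new_dict is a hypothetical name, and using an empty string for names missing from file 1 (such as Todd) is an assumption.

```python
def build_new_dict(old_dict, old_list):
    """Map each name from file 2 to its id from file 1."""
    new_dict1 = {}
    for name in old_list:
        # dict.get returns the default ("") when the name is not a key
        new_dict1[name] = old_dict.get(name, "")
    return new_dict1


# Illustrative inputs: the results of the two reader functions
old_dict = {"Jen": "1", "Mike": "2", "Jim": "3", "Jane": "5"}
old_list = ["Jim", "Mike", "Jane", "Todd", "Jen"]
result = build_new_dict(old_dict, old_list)
print(result)
```

No nested loop is needed: a dictionary lookup already finds the matching key.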

I have tried various permutations of the for loop in this final function,
but I haven't quite gotten it.  The current form is the closest I have
gotten to my desired output: it produces the new dictionary with the
names that I want, just with an invalid id associated with each name.  I
tried a bunch of different nested loops but kept getting it wrong, and
after so many attempts I got a little confused about what I was trying to
do.  So I backed up a little bit and have this.

Conceptually, I thought that this would give me my desired result:
new_dict
for name in old_list:
  for key, value in old_dict:
    if name == key:
      new_dict1[key] = value
return(new_dict1)

But it wasn't right.  I tried a different approach where I used the
dict.values() method in order to pull out the values from old_dict that
we want to include in new_dict1, but I got a bit lost there, too.
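
On the conceptual loop above: iterating over a dictionary directly yields only its keys, so "for key, value in old_dict" tries to unpack each name string into two variables and fails unless a name happens to be exactly two characters long. Iterating over old_dict.items() gives (key, value) pairs instead. The sketch below shows the nested loop with that change, followed by an equivalent version that uses a membership test; the sample data is illustrative only.

```python
old_dict = {"Jen": "1", "Mike": "2", "Jim": "3"}
old_list = ["Mike", "Todd", "Jim"]

# Nested-loop version: .items() yields (key, value) pairs
new_dict1 = {}
for name in old_list:
    for key, value in old_dict.items():
        if name == key:
            new_dict1[key] = value

# Equivalent without the inner loop: test membership, then look up
new_dict2 = {name: old_dict[name] for name in old_list if name in old_dict}

print(new_dict1)
print(new_dict2)
```

Both versions leave out names that have no id in file 1. dict.values() is not a good fit here because it discards the names needed for matching.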

I'm so close right now and I would be so thankful for any bit of
clarification which could get me to the finish line.
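
For reference, the three steps (a) to (c) from the quoted reply below can be sketched end to end with the csv module. This is only one possible arrangement, not a definitive implementation: read_name_to_id and fill_ids are hypothetical names, io.StringIO objects stand in for the real files, and leaving the id blank for names missing from file 1 is an assumption.

```python
import csv
import io

# Stand-ins for FILE1.csv and FILE2.csv (illustrative data)
file1 = io.StringIO(
    "PERSON_ID,PERSON_NAME\n1,Jen\n2,Mike\n3,Jim\n4,\n5,Jane\n6,Joe\n7,Jake\n"
)
file2 = io.StringIO("PERSON_ID,PERSON_NAME\n,Jim\n,Mike\n,Jane\n,Todd\n,Jen\n")


def read_name_to_id(fileobj):
    """Step (a): build {name: id} from file 1."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the header row
    return {
        row[1].strip(): row[0].strip()
        for row in reader
        if len(row) > 1 and row[1].strip()
    }


def fill_ids(fileobj, name_to_id):
    """Step (b): return [[id, name], ...] for the names in file 2."""
    reader = csv.reader(fileobj)
    next(reader)  # skip the header row
    rows = []
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        name = row[1].strip()
        rows.append([name_to_id.get(name, ""), name])
    return rows


rows = fill_ids(file2, read_name_to_id(file1))

# Step (c): write the nested list back out as CSV
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["PERSON_ID", "PERSON_NAME"])
writer.writerows(rows)
print(out.getvalue())
```

With real files, the StringIO objects would be replaced by files opened with open(path, newline=""), as the csv documentation recommends.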

On Mon, Jul 9, 2012 at 12:42 AM, Hugo Arts <hugo.yoshi at gmail.com> wrote:

> On Sun, Jul 8, 2012 at 11:47 PM, Fred G <bayespokerguy at gmail.com> wrote:
>
>> Hi--
>>
>> My current input looks like the following:
>>
>> FILE1.csv
>> PERSON_ID    PERSON_NAME
>> 1                     Jen
>> 2                     Mike
>> 3                     Jim
>> 4
>> 5                     Jane
>> 6                     Joe
>> 7                     Jake
>>
>> FILE2.csv
>> PERSON_ID   PERSON_NAME
>>                      Jim
>>                      Mike
>>                      Jane
>>                      Todd
>>                      Jen
>>
>> _________
>> I want to fill into the PERSON_ID column of FILE2.csv the corresponding
>> ID's associated with those names as identified in FILE1.csv.
>>
>> At first I imported the csv module and was using csv.reader, but then
>> it seemed simple enough just to write something like:
>> for line in file2:
>>      print(line)
>>
>> giving me the following output:
>> PERSON_ID, PERSON_NAME
>> , Jim
>> , Mike
>> , Jane
>> , Todd
>> , Jen
>>
>> I think I understand the issue at a conceptual level, but not quite sure
>> how to fully implement it:
>> a) I want to build a dictionary to create keys, such that each number in
>> file1 corresponds to a unique string in column B of file1.
>> b) then write a for loop like the following:
>> for "person_name" in file2:
>>    if "person_name.file2" == "person_name.file1":
>>        person_id.file2 = person_id.file1
>> c) write into file2 the changes to person_id's...
>>
>> But it's pretty difficult for me to get past this stage. Am I on the
>> right track? And more importantly, how could I learn how to actually
>> implement this in smaller stages?
>>
>>
>  You're on the right track, and you're almost there! You've already broken
> down the problem into steps. You should now try to implement a function for
> each step, and finally you should glue these functions together into a
> final program.
>
> a) Though you don't *have* to use it, csv.reader is really quite simple, and
> I'd recommend it. Try and write a function that takes a file name as
> argument and returns a dictionary of the form { name: id, name: id } (i.e.
> the names are the keys).
>
> b) For this step, you first need a list of all names in file 2. You could
> use csv.reader again or you could just parse it. Then, you use the
> dictionary to look up the corresponding id. The end goal for this function
> is to return a list of lists that looks much like the file you want to end
> up with:
>
> [[id, name],
>  [id, name],
>  [id, name]]
>
> c) This step should now be easy. Again, I'd recommend csv.writer; it makes
> the process pretty simple. You just pass in the nested list from step (b)
> and you're pretty much done.
>
> For tips on the csv module, the list of examples is pretty helpful:
> http://docs.python.org/py3k/library/csv.html#examples
> If you need help constructing the lists and dictionaries, my tips would be
> 1) think one row at a time, 2) the for loop is your best friend, and 3)
> nested lists usually mean nested loops.
>
> HTH,
> Hugo
>