CAN You help Re: Writing dictionary output to a file

Ruud de Jong ruud.de.jong at consunet.nl
Sat Mar 6 06:36:12 EST 2004


First of all, please keep the discussion on comp.lang.python.
Others may benefit from this as well.

dont bother wrote:

>Hey Can you help me with that?
>My Problem is exactly here:
>
>You did not get me:
>heres what I have to ask:
>
>"How to preserve this attribute -- line
>number/position of the word in the dictionary."
>
I still don't understand the actual need for this. If the purpose is to
provide a frequency count of words in a message compared with words in a
dictionary, why would you need the actual *position* of the word in the
dictionary? Is it really important whether the match was with the 6th or
the 27th entry? And how do you handle additions to the dictionary?
So my first advice would be to make sure that you really *need* the
position.

>For example: If I am comparing words in a message with
>the words in a dictionary, whenever there is a match,
>I want to record the corresponding location of the
>word in the dictionary.
>
I think you want to record the word, not its position. Positions will
change when new words are added to a dictionary, especially if the
dictionary is kept in alphabetically sorted order.
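
To illustrate how positions shift, here is a minimal sketch. The word
list is made up for the example, and I am assuming the dictionary is
simply a sorted list of words:

```python
import bisect

# Hypothetical word list, kept in alphabetical order.
words = ["apple", "cherry", "melon"]
pos_before = words.index("cherry")

# Adding a new entry shifts every word that sorts after it.
bisect.insort(words, "banana")
pos_after = words.index("cherry")

print(pos_before)
print(pos_after)
```

The same word now sits at a different position, so any position
recorded earlier no longer points at it.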

>heres the code I am using for matching the words in
>the dictionary with the words in the message:
>
>for i in msg:
>     if i in dct:
>         try:
>             vector[i] += 1
>
>         except:
>             vector[i] = 1
>
>
I assume that this is preceded by something like:
vector = {}

Here you are filling the vector map with words as keys
(not their positions), and their frequency counts as values.

Aside: using a bare except is frowned upon: it will catch
*every* error. What you probably want is something like:

try:
    vector[i] += 1
except KeyError:
    vector[i] = 1
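
As an alternative that needs no exception handling at all, the counting
can be sketched with dict.get(). The sample msg and dct below are
invented stand-ins for your data, not your actual input:

```python
# Hypothetical sample data standing in for msg and dct.
msg = ["spam", "eggs", "spam", "ham"]
dct = ["eggs", "spam"]

vector = {}
for word in msg:
    if word in dct:
        # get() returns 0 for a word not counted yet, so no KeyError can occur.
        vector[word] = vector.get(word, 0) + 1

print(vector)
```

Either way, the keys of vector are the words themselves.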

>
>How to I store the corresponding location of the word
>in the dictionary which is matching
>
Again, why would you need this?

>
>and then I will use this location as the key of the
>vector and the value is being computed by vector[i]
>
And here is where the misunderstanding is. vector is keyed by the word
itself, not by its position in the dictionary.

><I NEED TO CHANGE SOMETHING HERE>
>
>for v,i in enumerate(vector):
>
Confusing naming. enumerate(iterable) yields
(index, value) tuples.
When you enumerate a map, the index values are of course sequential
integers, starting with 0, but the values are essentially
arbitrary keys from the mapping.
So, here you have:
v == irrelevant integer index value,
and i == arbitrary key (=word) from vector
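
A small sketch of what that loop actually sees, using an invented
vector and clearer names. Note that the order of the keys depends on
the Python version: older versions give an arbitrary order, newer ones
give insertion order. In neither case is it the position in your
dictionary file.

```python
# Hypothetical counts, as the matching loop might have produced them.
vector = {"spam": 2, "eggs": 1}

for index, word in enumerate(vector):
    # index is only a running counter; word is a key of vector.
    print(index)
    print(word)

pairs = list(enumerate(vector))
```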

>    vector[i] /= a
>
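I assume a is a number, otherwise this would fail. Note that if a is
an integer, dividing one integer by another truncates the result, so
most of these quotients will simply be 0.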
You divide the count for the word i by a (total number
of words in msg?).
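
If relative frequencies are what you are after, a sketch along these
lines avoids truncated results. I am assuming here that a is the total
number of words in msg; the data is again made up:

```python
# Hypothetical message and the counts derived from it.
msg = ["spam", "eggs", "spam", "ham"]
vector = {"spam": 2, "eggs": 1}

# Assumption: a is the total number of words in msg.
# Converting to float forces true division instead of integer division.
a = float(len(msg))

for word in vector:
    vector[word] /= a

print(vector)
```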

>    #print v,i, vector[i] ; if u want to see the word
>too that was commmon
>
This would print
<irrelevant index number>, <word>, <wordcount divided by a>

>    print v,":",vector[i]
>
This would print
<irrelevant index number>: <wordcount divided by a>

That still leaves the question: why would you want to
relate the words in the message to their *position* in
a dictionary?
I still have not seen anything where you would need that.

Regards,

Ruud.



