What is a perl hash in python

Karyn Williams karyn at calarts.edu
Fri Jan 12 16:14:10 EST 2007


At 06:54 PM 1/12/07 GMT, you wrote:
>On Fri, 12 Jan 2007 09:15:44 -0800, Karyn Williams <karyn at calarts.edu>
>declaimed the following in comp.lang.python:
>
>> I am new to Python. I am trying to modify and understand a script someone
>> else wrote. I am trying to make sense of the following code snippet. I know
>
>	"someone else" didn't write Python either, looking at that mishmash
><G>

Thanks, Marc and Dennis. 

Actually, as I think about it, this operation should be doable in one
loop, not the ten or so that it currently takes.


Read in each "*.log" file (excluding certain named files, e.g. "1.log"),
total up x number of rows of the 2nd and 3rd columns, push
(filename, total col 2) and (filename, total col 3) to two lists,
reverse-sort each list, and generate one web page per list showing the
top ten.

That is what this script is supposed to be doing.
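
A rough sketch of that single-loop version, in current Python. It assumes whitespace-separated numeric columns and stores (total, path) pairs so that a plain sort orders by total. The names top_totals, log_dir, excluded, max_rows, and top_n are hypothetical and not taken from the original script.

```python
import glob
import os


def top_totals(log_dir, excluded=("1.log",), max_rows=10, top_n=10):
    """Total columns 2 and 3 of each *.log file in one pass over the files."""
    col2_totals = []
    col3_totals = []
    for path in glob.glob(os.path.join(log_dir, "*.log")):
        if os.path.basename(path) in excluded:
            continue  # skip the named files
        total2 = total3 = 0
        rows_used = 0
        with open(path) as handle:
            for line in handle:
                fields = line.split()
                if len(fields) < 3:
                    continue  # assumption: lines with too few columns are ignored
                total2 += int(fields[1])
                total3 += int(fields[2])
                rows_used += 1
                if rows_used == max_rows:
                    break  # only the first max_rows usable rows are totalled
        col2_totals.append((total2, path))
        col3_totals.append((total3, path))
    # Largest totals first, the equivalent of `sort -r`.
    col2_totals.sort(reverse=True)
    col3_totals.sort(reverse=True)
    return col2_totals[:top_n], col3_totals[:top_n]
```

Each returned list could then feed one of the two web pages.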

>> line 7 would be best coded with regex. I first would like to understand
>> what was coded originally. thelistOut looks like a hash to me (I'm more
>> familiar with perl). Perhaps someone could translate from perl to python
>> for me - not in code but just in concept.
>> 
>> 
>> Here is the code. This script is reading the list thelistOut and then
>> removing any items in RSMList and taking the remainder and putting them in
>> graphAddressOut with the formatting.
>> 
>> This is a SAMPLE of what is in the lists referenced below in the loop:
>> 
>> 
>> thelistOut = [(632,
>> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
>> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
>> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]
>>
>
>	This is a list containing three elements. Each element is a tuple
>containing two sub-elements. The first sub-element appears to be an
>integer (I have no idea of the significance of the value at this time).
>The second sub-element is another list containing a single
>sub-sub-element -- that sub-sub-element is a string (file path name).
> 
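
The structure described above can be walked with tuple unpacking. This is only an illustrative sketch in current Python, using the sample data from the post; the names number, paths, and path are made up for the example.

```python
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

# Each element is a (number, [path]) tuple; unpack it while iterating.
for number, paths in thelistOut:
    path = paths[0]  # the inner list holds a single string
    print(number, path)

# Indexing reaches the same values: [0] is the first tuple, [1] its list.
first_number, first_paths = thelistOut[0]
```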
>> RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
>> '172.16.0.1_5']
>> 
>> 
>> 
>> #--------------------------Loop 1 -------------------------
>>        
>>     w = 0
>>     while w < 45:
>>
>	for w in xrange(45):
>        
>>        fileOut = string.split(thelistOut[w][1][0],".log")
>>        fileOutSplitedCommon = string.split(fileOut[0], "main/")
>>        fileOut2D = string.split(fileOutSplitedCommon[1], "/")
>>        fileOut = string.split(fileOut[0],"data-dist")
>>
>	Direct use of the string module is now frowned upon. 


For future reference, why is direct use of the string module frowned upon,
and what does one use instead?
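
For comparison, the string.split(s, sep) calls in the loop have direct equivalents as methods on the string itself; the module-level functions were deprecated in Python 2 and no longer exist in Python 3. A small sketch using one sample path from the post (switch_label is a hypothetical name):

```python
path = '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log'

# string.split(path, ".log") becomes a method call on the string.
fileOut = path.split(".log")
fileOutSplitedCommon = fileOut[0].split("main/")
fileOut2D = fileOutSplitedCommon[1].split("/")

# string.swapcase(s) likewise becomes s.swapcase().
switch_label = fileOut2D[0].swapcase()
```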


>Also, since these are file path names, using operations in the os.path
>module would be more appropriate...
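
As a sketch of what that os.path suggestion could look like for one of the sample paths (the variable names directory, filename, port_id, extension, and switch are hypothetical):

```python
import os.path

path = '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log'

# Split off the file name, then strip its extension.
directory, filename = os.path.split(path)
port_id, extension = os.path.splitext(filename)

# The last directory component plays the role of fileOut2D[0] in the loop.
switch = os.path.basename(directory)
```

Note that os.path.splitext splits at the last dot only, so the dots inside the IP address stay in port_id.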


I'll look into os.path, but what this loop should be doing is matching and
removing the entries from thelistOut (and thelistIn) that are listed in
RSMList. Or, as is being done, not writing them to the new list, outputOut
(graphAddressOut). It's just a matching operation, not really a
path/filename op. This is why I will be changing this to a regex.
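
One way the regex version of that matching step might look, as a sketch only. The pattern assumes the id to compare against RSMList is whatever sits between the last "/" and ".log"; PORT_ID and keep_entries are hypothetical names.

```python
import re

RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
           '172.16.0.1_5']

# Assumed pattern: capture the text between the last "/" and a trailing ".log".
PORT_ID = re.compile(r"([^/]+)\.log$")


def keep_entries(entries, excluded_ids):
    """Return the (number, [path]) entries whose id is not in excluded_ids."""
    excluded = set(excluded_ids)  # exact-match lookups
    kept = []
    for number, paths in entries:
        match = PORT_ID.search(paths[0])
        if match and match.group(1) in excluded:
            continue  # listed in RSMList, so leave it out
        kept.append((number, paths))
    return kept
```

The same function could be called once for thelistOut and once for thelistIn.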


>>        if fileOut2D[1] in RSMList:
>>           w = w + 1
>>           continue
>
>	Confusing logic, having two places where "w" is incremented. Using a
>"for" loop would mean neither increment statement is needed. Actually,
>"w" isn't even needed, replace the while/for with
>
>	for fid in thelistOut:
>		fileOut = fid[1][0]
>
>That [1] gets the second element of the tuple, and the [0] gets the
>string out of that list (why a list of one-element string data?)
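
A sketch of the loop with that change applied, in current Python. The second path below is a made-up variant of the sample data so that one entry matches RSMList, and remaining is a hypothetical name standing in for the output list.

```python
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.1_2.log']),
]
RSMList = ['172.16.0.1_1', '172.16.0.1_2']

remaining = []
for fid in thelistOut:
    fileOut = fid[1][0]  # second tuple element, then the only list item
    fileOut2D = fileOut.split(".log")[0].split("main/")[1].split("/")
    if fileOut2D[1] in RSMList:
        continue  # no counter to increment before skipping
    remaining.append(fileOut2D[1])
```

With no index variable, the hard-coded limit of 45 also disappears: the loop simply runs once per element.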
>
>>        graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
>> "<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "  &nbsp;PORT ID: " +
>> fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 +
>> logUrl + fileOut[1] + extention2 + imgTag2 + tag3 + tag5
>
>	This could be cleaned up too, but I'll ignore it at the moment.
>
>>        outputOut.append(graphAddressOut)
>>        strOut = strOut + graphAddressOut
>> 
>>        w = w + 1
>> 
>> #--------------------------Loop 1 -------------------------
>
>
>	I think what you call a "hash" in PERL is a dictionary in Python:
>
>dct = { key1 : value1, ... , keyn : valuen }
>
>aval = dct[keyx]
>
>	Nothing of the sort used in the code you show above.
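
For someone coming from Perl, a short sketch of the dictionary operations that correspond to everyday hash use. The mapping below is made up for illustration (it reuses numbers from the sample list as values) and is not part of the original script.

```python
# Hypothetical mapping from port id to a number, for illustration only.
totals = {"172.16.0.23_9": 632, "172.16.0.23_13": 145}

totals["172.16.0.23_5"] = 0          # like $totals{"172.16.0.23_5"} = 0;
value = totals["172.16.0.23_9"]      # lookup by key
present = "172.16.0.1_1" in totals   # like exists $totals{"172.16.0.1_1"}

for key in sorted(totals):           # like: for my $key (sort keys %totals)
    print(key, totals[key])
```

By contrast, thelistOut is a list of tuples, so it is indexed by position rather than by key.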



-- 

Karyn Williams
Network Services Manager
California Institute of the Arts
karyn at calarts.edu
http://www.calarts.edu/network


