List problem

User 1 at 2.3
Sun Oct 31 21:32:51 EST 2004


On Sun, 31 Oct 2004 14:28:39 +0200, aleaxit at yahoo.com (Alex Martelli)
wrote:

>User <1 at 2.3> wrote:
>   ...
>> >In general, removing elements from a list while you're iterating through
>> >the list is a bad idea.  Perhaps a better solution is a list
>> >comprehension:
>> 
>> Unless you're using a while loop and iterating in reverse, for
>> example:
>> 
>> a = [0,1,2,3,4]
>> pos_max = len(a) - 1  # Maximum iterable element
>> pos = pos_max         # Current element
>> while pos >= 0:
>>       if a[pos] == 2:
>>               a.remove(2)
>>               pos_max = pos_max - 1
>>       pos = pos - 1
>
>Still a bad idea.
>
>    a[:] = [x for x in a if x != 2]
>
>is more concise, idiomatic, and speedy.  No reason to do low-level
>twiddling with indices and fiddly loops, in cases where a list
>comprehension can do the job so simply.
>
>
>Alex


Interesting piece of code.  In the example I gave, I was really just
trying to simplify a rather messy function I wrote for checking a list
of files for duplicates.  It is messy because it makes two passes: it
first reads each file up to a user-specified maximum, and only if those
initial reads match does it read both files in full and compare them.
I use this approach because I believe it makes no more comparisons
than necessary.  I initially attempted this with a "for" loop, but
ended up with a mess once I started removing items from the list
inside it.  Anyway, here is the original function:


def dupe_check(input_list, init_read):

  # Checks a list of files for duplicates.
  # input_list format is ['c:\\junk\\trash.txt', 'd:\\stuff\\junk.jpg'......]
  # init_read = maximum number of bytes read from each file on the first pass.
  # Returns a list of lists of matching files, plus the percentage of
  # first-pass matches that failed the full comparison.

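  a = list(input_list)  # work on a copy so the caller's list is not emptied
  tot = 0               # number of first-pass matches
  fail = 0              # first-pass matches that differ on a full read
  q = []                # groups of matching files
  pos = 0               # index of the current group in q
  len_a = len(a) - 1    # index of the last remaining file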
  while len_a >= 0:
    file1 = a[len_a]
    dx = read_close(file1, 'rb', init_read)
    q.append([a[len_a]])
    del a[len_a]  # delete by index; remove() would search by value
    len_a = len_a - 1
    len_sub = len_a
    while len_sub >= 0:
      dy = read_close(a[len_sub], 'rb', init_read)
      if dx == dy:
        tot = tot + 1
        if read_close(a[len_sub], 'rb', -1) == read_close(file1, 'rb', -1):
          q[pos].append(a[len_sub])
          a.remove(a[len_sub])
          len_a = len_a - 1
        else:
          fail = fail + 1
      len_sub = len_sub - 1    
    pos = pos + 1

# copy only lists of two or more from q to z

  z = []
  for x in q:
    if len(x) > 1:
      z.append(x)  
  if tot > 0:
    perc_fails =  100.0 * float(fail)/float(tot)
  else:
    perc_fails = 0.0
  return z, perc_fails 





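def read_close(path, mode, max_read):
  # Opens a file, reads up to max_read bytes (-1 reads the whole
  # file), closes the file, then returns the data.

  f = open(path, mode)
  try:
    data = f.read(max_read)
  finally:
    f.close()  # close the file even if the read raises
  return data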
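
For comparison, here is a minimal sketch of the same two-pass idea that
never removes anything from a list while walking it.  It is not the
function above: the name dupe_groups is hypothetical, it assumes a
Python with collections.defaultdict and the "with" statement, and it
does not compute the failure percentage.  Files are first bucketed by
their first init_read bytes, and only buckets holding two or more
files are read in full.

```python
from collections import defaultdict


def read_close(path, mode, max_read):
    """Open a file, read up to max_read bytes (-1 means all), close it."""
    with open(path, mode) as f:
        return f.read(max_read)


def dupe_groups(paths, init_read):
    """Return a list of lists of paths whose contents are identical."""
    # First pass: bucket files by their first init_read bytes.
    by_prefix = defaultdict(list)
    for path in paths:
        by_prefix[read_close(path, 'rb', init_read)].append(path)

    groups = []
    for candidates in by_prefix.values():
        if len(candidates) < 2:
            continue  # a unique prefix cannot have a duplicate
        # Second pass: bucket the surviving candidates by full contents.
        by_content = defaultdict(list)
        for path in candidates:
            by_content[read_close(path, 'rb', -1)].append(path)
        # Keep only groups of two or more files.
        groups.extend([g for g in by_content.values() if len(g) > 1])
    return groups
```

Each file is read at most twice, and the grouping is done by dictionary
lookup rather than pairwise comparison.  The trade-off is that the full
contents of every candidate in a bucket are held in memory at once, so
for large files a digest of the contents may be a better dictionary key.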



