[Tutor] Is there a simpler way to remove from a set?

Leam Hall leamhall at gmail.com
Sat May 8 19:48:56 EDT 2021


On 5/8/21 6:21 PM, dn via Tutor asked lots of good questions, including:

> Are you allowing yourself to be 'railroaded' or 'blinkered' by the spec?

Lots to think about! The good news is that I'm the one coming up with 
the requirement, so the issues are relevant. The bad news is that I'm 
the one coming up with the requirement, so the processes are limited to 
my understanding.

The 'purge' option is desired, but is the second priority. The issue 
originated with my printer not liking Linux; it can't print a plain 
text document. I got into the habit of writing docx files in Libre/Open 
Office, and making backup copies. And archive copies. And backups of the 
archives...

A few months ago I admitted there was an issue. The solution is to 
convert everything to text and put it into version control. The first 
priority of the program is to find all copies of files in the given 
directories, and sort them into two lists: "only one version of this 
file" and "more than one version of this file". The file size is the 
definition of "different version".
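
A minimal sketch of that first priority, not the actual program. It 
assumes that "copies of a file" means files sharing the same base name, 
which is a guess, and the function name sort_by_versions is made up for 
illustration. Two copies count as different versions when their sizes 
differ.

```python
import os
from collections import defaultdict


def sort_by_versions(top_dirs):
    """Return (single, multiple): names seen with one size vs. several sizes."""
    sizes = defaultdict(set)  # base file name -> set of sizes seen so far
    for top in top_dirs:
        for root, _subdirs, files in os.walk(top):
            for name in files:
                path = os.path.join(root, name)
                try:
                    sizes[name].add(os.path.getsize(path))
                except OSError:
                    # Unreadable entry (e.g. a broken symlink): skip it.
                    continue
    single = sorted(n for n, seen in sizes.items() if len(seen) == 1)
    multiple = sorted(n for n, seen in sizes.items() if len(seen) > 1)
    return single, multiple
```

Note that identical size does not guarantee identical content; comparing 
a hash of each file would be stricter, at the cost of reading every file.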

With those lists I can convert the single version files into text and 
put them into version control. Then I figure out which of the "multiple 
versions" is the right version, convert that and then stuff it into 
version control. This process is manual, which is why the "purge" comes 
afterwards. I can get the two lists, make sure everything is in version 
control, and then purge.

The (old version) code walks down the given directories; it uses a queue 
to track which directories have not been checked for files. Does that 
make sense?
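
For anyone following along, here is a rough sketch of that queue-based 
walk. It is not the old code itself; the name walk_with_queue and the 
choice to skip symlinks are assumptions made for the example.

```python
import os
from collections import deque


def walk_with_queue(top_dirs):
    """Yield the path of every regular file found under top_dirs."""
    queue = deque(top_dirs)  # directories not yet checked for files
    while queue:
        current = queue.popleft()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    queue.append(entry.path)  # check this one later
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path
```

The standard library's os.walk() does the same job without a hand-rolled 
queue, so the queue is mostly useful when the traversal order or the 
skipping logic needs to be controlled explicitly.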


On 5/8/21 6:19 AM, Alan Gauld via Tutor wrote:
 > What I asked for was details of how "Python complained"
 > when you tried to do it without building 2 sets.

My apologies for the misunderstanding! When I saw your request for the 
code, I took it to mean my code, not the error code.

I've already edited that bit out, but it seemed that the set got upset 
if the number of items in the set changed during iteration. I couldn't 
remember how to break out of two levels of iteration:

   for dir in dirs:
     for exclude_dir in exclude_dirs:
       if dir.startswith(exclude_dir):
         # Remove it and go to the next 'for dir in dirs'
         # At this point it complained about dirs changing, IIRC.

The combination of not knowing how to break out of multiple levels, and 
the iterator not liking the changes of life, led me to the use of two sets.
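
For the record, removing an item from a set while looping over that same 
set raises "RuntimeError: Set changed size during iteration". Two ways 
around it are sketched below; the function names are made up for the 
example and this is not the code from the program. The first builds a 
new set, the second loops over a copy so the original can be changed. 
In the second, a plain 'break' leaves only the inner loop, which is 
enough to move on to the next directory.

```python
def without_excluded(dirs, exclude_dirs):
    """Return a new set holding the dirs that start with no excluded prefix."""
    prefixes = tuple(exclude_dirs)  # str.startswith() accepts a tuple
    return {d for d in dirs if not d.startswith(prefixes)}


def remove_excluded_in_place(dirs, exclude_dirs):
    """Remove excluded entries from dirs itself by iterating over a snapshot."""
    for d in list(dirs):  # the list copy is iterated, not the set
        for exclude_dir in exclude_dirs:
            if d.startswith(exclude_dir):
                dirs.remove(d)  # safe: dirs is not the thing being iterated
                break           # leave the inner loop; outer loop continues
```

With either one, a set such as {"/data/docs", "/data/docs/archive"} and 
an exclude set of {"/data/docs/archive"} (hypothetical paths) leaves 
just "/data/docs".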

Leam

-- 
Site Reliability Engineer  (reuel.net/resume)
Scribe: The Domici War     (domiciwar.net)
General Ne'er-do-well      (github.com/LeamHall)

