[Tutor] Is there a simpler way to remove from a set?
Leam Hall
leamhall at gmail.com
Sat May 8 19:48:56 EDT 2021
On 5/8/21 6:21 PM, dn via Tutor wrote lots of good questions, to include:
> Are you allowing yourself to be 'railroaded' or 'blinkered' by the spec?
Lots to think about! The good news is that I'm the one coming up with
the requirement, so the issues are relevant. The bad news is that I'm
the one coming up with the requirement, so the processes are limited to
my understanding.
The 'purge' option is desired, but is the second priority. The issue
originates with my printer not liking Linux: it can't print a plain
text document. I got into the habit of writing docx files in Libre/Open
Office, and making backup copies. And archive copies. And backups of the
archives...
A few months ago I admitted there was an issue. The solution is to
convert everything to text and put it into version control. The first
priority of the program is to find all copies of files in the given
directories, and sort them into two lists; "only one version of this
file" and "more than one version of this file". The file size is the
definition of "different version".
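
As a rough illustration of the two-list split described above, here is a
minimal sketch. It assumes that copies are matched by file name, which the
message does not state; the names split_by_versions and records_for are
hypothetical and not taken from the original program.

```python
from collections import defaultdict
from pathlib import Path


def split_by_versions(records):
    """Split (name, size) pairs into single-version and multi-version names.

    Assumption: two files are copies of each other when they share a file
    name, and a different size means a different version.
    """
    sizes_by_name = defaultdict(set)
    for name, size in records:
        sizes_by_name[name].add(size)

    single = sorted(n for n, s in sizes_by_name.items() if len(s) == 1)
    multiple = sorted(n for n, s in sizes_by_name.items() if len(s) > 1)
    return single, multiple


def records_for(paths):
    """Yield (file name, size in bytes) for each path on disk."""
    for path in paths:
        path = Path(path)
        yield path.name, path.stat().st_size
```

Calling split_by_versions(records_for(found_paths)) would give the "only
one version" and "more than one version" lists, where found_paths stands
for whatever the directory walk produced.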
With those lists I can convert the single version files into text and
put them into version control. Then I figure out which of the "multiple
versions" is the right version, convert that and then stuff it into
version control. This process is manual, which is why the "purge" comes
afterwards. I can get the two lists, make sure everything is in version
control, and then purge afterwards.
The (old version) code walks down the given directories; it uses a queue
to track which directories have not been checked for files. Does that
make sense?
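
The queue-based walk might look something like the sketch below. This is
not the original code, only one way to do it with the standard library;
the function name walk_with_queue is hypothetical, and skipping symlinks
is an assumption made to avoid loops.

```python
import os
from collections import deque


def walk_with_queue(start_dirs):
    """Return every regular file path found under start_dirs."""
    pending = deque(start_dirs)  # directories not yet checked for files
    files = []
    while pending:
        current = pending.popleft()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Queue the subdirectory for a later pass.
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    files.append(entry.path)
    return files
```

The queue replaces recursion: each directory is taken off the front,
its files are recorded, and its subdirectories go on the back.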
On 5/8/21 6:19 AM, Alan Gauld via Tutor wrote:
> What I asked for was details of how "Python complained"
> when you tried to do it without building 2 sets.
My apologies for the misunderstanding! When I saw your request for the
code, I took it to mean my code, not the error code.
I've already edited that bit out, but it seemed that the set got upset
if the number of items in the set changed during iteration. I couldn't
remember how to break out of two levels of iteration:
    for dir in dirs:
        for exclude_dir in exclude_dirs:
            if dir.startswith(exclude_dir):
                # Remove it and go to the next 'for dir in dirs'
                # At this point it complained about dirs changing, IIRC.
The combination of not knowing how to break out of multiple levels, and
the iterator not liking the changes of life, led me to the use of two sets.
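
For reference, CPython reports this situation as "RuntimeError: Set
changed size during iteration". Two possible ways around it without
keeping a second set by hand are sketched below. The function names and
the example paths are hypothetical; the prefix check mirrors the
startswith test in the loop above.

```python
def without_excluded(dirs, exclude_dirs):
    """Return a new set holding the dirs that match no excluded prefix."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    prefixes = tuple(exclude_dirs)
    return {d for d in dirs if not d.startswith(prefixes)}


def discard_excluded(dirs, exclude_dirs):
    """Remove excluded dirs from the set in place."""
    for d in list(dirs):  # iterate over a snapshot, not the live set
        for exclude_dir in exclude_dirs:
            if d.startswith(exclude_dir):
                dirs.discard(d)
                break  # leaves the inner loop only; the outer loop moves on


dirs = {"/data/docs", "/data/tmp", "/data/tmp/cache"}
exclude_dirs = {"/data/tmp"}
kept = without_excluded(dirs, exclude_dirs)
discard_excluded(dirs, exclude_dirs)
```

In the second version a plain break is enough: it ends the inner loop
and the outer loop then continues with the next directory, so there is
no need to break out of two levels.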
Leam
--
Site Reliability Engineer (reuel.net/resume)
Scribe: The Domici War (domiciwar.net)
General Ne'er-do-well (github.com/LeamHall)