All permutations from 2 lists

Avi Gross avigross at verizon.net
Thu Mar 3 12:00:39 EST 2022


Larry,

That explanation made more sense and provided context.

I fully agree with you that generating the cross product of multiple lists can be messy and large and best avoided.

As an example, someone on an R forum presented their way of listing the potential solutions to the game WORDLE at any given time, given the current constraints. The details are not important except that their process makes multiple vectors of the characters allowed for letters "one" through "five" and then generates a data.frame of all combinations. Early in the game you know little and the number of combinations can be as high as 26*26*26*26*26 in the English version. Within a few moves you may know more, but even 15*18*... can be large. So you have data.frames with sometimes millions of rows that are then converted row-wise to five-letter words to make a long vector, and then you query a huge dictionary for each word to produce a list of possible words. Now imagine the same game looking instead for 6 letter words and 7 letter words ...

I looked at it and decided it was the wrong approach and, in brief, made a much smaller dictionary containing only the five-letter words, and made a regular expression that looked like:

"^[letters][^letters]S[more][^this]$"

The above has five parts, one per letter position. Each is either a specific letter you know is there (the S in position 3), a sequence of letters in square brackets saying any one of those matches, or the same with a leading caret saying anything except those. You then simply use the R grep() function to search the list of valid 5-letter words using that pattern and in one sweep get them all without creating humongous data structures.
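The same idea can be sketched in Python with the re module. This is only an illustration: the word list and the particular letter classes below are made up, and a real word list would be far longer.

```python
import re

# Hypothetical stand-in for a dictionary already reduced to five-letter words.
words = ["AISLE", "BASIC", "CASTE", "MUSIC", "PASTA", "WASTE"]

# Position 1: one of C, M, P, W.   Position 2: anything except E, I, O.
# Position 3: the known letter S.  Position 4: one of I, T.
# Position 5: anything except A.
pattern = re.compile(r"^[CMPW][^EIO]S[IT][^A]$")

# One pass over the word list; no table of all letter combinations is built.
candidates = [w for w in words if pattern.match(w)]
print(candidates)
```

The work grows with the size of the word list, not with the product of the per-position letter counts.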

What you describe has some similarities, as you searched for an alternate way to do something, and it is now clearer why you did not immediately vocalize exactly what you anticipated. But your solution was not a solution to what anyone trying to help was working on. It was a solution to a different problem, and what people would have needed to know, namely that you were passing a dictionary to a mysterious function, was not stated till now. I would have appreciated it if you had simply stated that you decided to use a different way and, if anyone is curious, here it is.

For the rest of us, I think what we got from the exchange may vary. Some saw it as a natural fit with using something like a nested comprehension, albeit empty lists might need to be dealt with. Others saw a module designed to do such things as an answer. I saw other modules in numpy/pandas as reasonable. Some thought iterators were a part of a solution. The reality is that making permutations and combinations is a fairly common occurrence in computer science and it can be expected that many implement one solution or another.

But looking at your code, I am amused that you seem to already have not individual lists but a dictionary of named lists. Code similar to what you show now could trivially have removed dictionary items that held only an empty list. And as I pointed out, some of the solutions we came up with that could generalize to any number of lists would happily accept such a dictionary and generate all combinations.
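For the curious, here is a minimal sketch of what that could look like using itertools.product from the standard library. The function name and the sample keys are invented for illustration and are not taken from anyone's posted code.

```python
from itertools import product

def combinations_from(named_lists):
    """Yield one dict per combination, ignoring keys whose list is empty."""
    # Drop the entries that hold only an empty list.
    kept = {k: v for k, v in named_lists.items() if v}
    keys = list(kept)
    # product() handles any number of lists, including zero.
    for values in product(*(kept[k] for k in keys)):
        yield dict(zip(keys, values))

# Hypothetical input: a dictionary of named lists, one of them empty.
example = {"color": ["red", "blue"], "size": [], "shape": ["round", "square"]}
combos = list(combinations_from(example))
print(len(combos))
```

Because it is a generator, the combinations are produced one at a time rather than held in memory all at once, though the total count still grows multiplicatively.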

My frustration was not about you asking how to solve a very reasonable problem in Python. It was about the process and what was disclosed and then the expectation that we should have known about things not shared. Certainly sharing too much is a problem too. Your title alone was very concrete asking about 2 lists. It is clear that was not quite your real need.





-----Original Message-----
From: Larry Martell <larry.martell at gmail.com>
To: Avi Gross <avigross at verizon.net>
Cc: python-list at python.org <python-list at python.org>
Sent: Thu, Mar 3, 2022 9:07 am
Subject: Re: All permutations from 2 lists


On Wed, Mar 2, 2022 at 9:42 PM Avi Gross via Python-list
<python-list at python.org> wrote:
>
> Larry,
>
> I waited patiently to see what others will write and perhaps see if you explain better what you need. You seem to gleefully swat down anything offered. So I am not tempted to engage.

But then you gave in to the temptation.

> And it is hard to guess as it is not clear what you will do with this.

In the interests of presenting a minimal example I clearly
oversimplified. This is my use case: I get a dict from an outside
source. The dict contains key/value pairs that I need to use to query
a mongodb database. When the values in the dict are all scalar I can
pass the dict directly into the query, e.g.:
self._db_conn[collection_name].find(query). But if any of the values
are lists that does not work. I need to query with something like the
cross product of all the lists. It's not a true product since if a
list is empty it means no filtering on that field, not no filtering on
all the fields.  Originally I did not know I could generate a single
query that did that. So I was trying to come up with a way to generate
a list of all the permutations and was going to issue a query for each
individually.  Clearly that would become very inefficient if the lists
were long or there were a lot of lists. I then found that I could
specify a list with the "$in" clause, hence my solution.
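
A minimal sketch of that kind of query builder, assuming the rule
stated above that an empty list means "do not filter on this field".
The function name and the sample keys are hypothetical; "$in" is the
MongoDB operator, and nothing here talks to a database.

```python
def build_query(raw):
    """Return a new filter dict; the input dict is left untouched."""
    query = {}
    for key, value in raw.items():
        if isinstance(value, list):
            if value:
                # Non-empty list: match any one of the listed values.
                query[key] = {"$in": value}
            # Empty list: leave the field out, so it is not filtered.
        else:
            # Scalar values pass through unchanged.
            query[key] = value
    return query

# Hypothetical input mixing a scalar, a list, and an empty list.
example = {"status": "open", "owner": ["ann", "bob"], "tag": []}
built = build_query(example)
print(built)
```

The result is a single filter document that could be handed to
find(), instead of one query per combination.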


> def query_lfixer(query):
>     # Wrap each list value in "$in"; note this modifies query in place.
>     for k, v in query.items():
>         if isinstance(v, list):
>             query[k] = {"$in": v}
>     return query
>
> self._db_conn[collection_name].find(query_lfixer(query))
>
>
> So why did so many of us bother?


Indeed - so why did you bother?



