[Tutor] combinations of all rows and cols from a dataframe

ThreeBlindQuarks threesomequarks at proton.me
Fri Mar 31 10:32:17 EDT 2023


I ask people posting questions to put themselves in the position of those being asked and to try to give enough info.

There are many reasons people give just a little info, including not wanting us to see it is homework, or not wanting others near them to notice they are asking for help, and so on.

We now have two somewhat related questions on the table and neither is complete. I doubt the original problem is actually as simple as taking this specific 2x2 object and creating a nested list structure containing exactly these contents. There is a trivial and useless solution to such one-time problems: a print statement of the typed-in answer.

A serious question might look like: given a structure like a Dataframe with M rows and N columns, with both being at least 2 in length (or whatever) then make all combinations ... and make a list of that output.

Then provide one or a few examples that can be used to see if the program written to solve it works. If needed, provide criteria for checking whether an answer that looks right is still wrong in the general case.

As an example, the first question looked like it wanted duplicates that were mirror images of each other (or perhaps just contained the same members in different orders) removed. But maybe it wanted a specific one kept, such as the one in sorted order. One thought was to use a Python set as an intermediate data structure to consolidate duplicates. But that has lots of considerations and may require some care AND it has an obvious flaw in that multiple copies of the same number may be swallowed. I won't say more except that the lack of clearer and even somewhat abstract requirements makes it hard to know what a proper solution might be.
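To make the set idea concrete, here is a minimal sketch under my own assumptions, not a statement of what the original poster wanted. It treats two pairs as duplicates when they are mirror images of each other, and keeps whichever one shows up first. The function name dedupe_mirrors and the sample data are made up for illustration; the tuples are borrowed from the ones quoted later in this thread.

```python
# Hypothetical sample data: each entry is a (left, right) pair of lists.
pairs = [([2], [6, 5]), ([6, 5], [2]), ([2], [5]), ([5], [2])]


def dedupe_mirrors(pairs):
    """Keep the first of any two pairs that are mirror images of each other."""
    seen = set()
    result = []
    for left, right in pairs:
        # Lists are not hashable, so convert each side to a tuple first.
        # A frozenset of the two sides ignores which side came first.
        key = frozenset((tuple(left), tuple(right)))
        if key not in seen:
            seen.add(key)
            result.append((left, right))
    return result


unique = dedupe_mirrors(pairs)

# The flaw mentioned above: a set silently swallows repeated values.
swallowed = set([2, 2, 5])
```

Note that this assumes ([2], [6, 5]) and ([2], [5, 6]) are different pairs, since the order inside each side is kept. If that is not what is wanted, the requirements need to say so.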

If someone had provided some motivation of a real-world problem, who knows? What if I had a question about, say, a group of friends who entered a room in pairs, shown as two columns, and later walked out sometimes alone and sometimes in groups of two or three or perhaps more, and you want to know which groupings could happen? Sounds like all combinations? But what if, on the way out, the guards checked and only allowed out groups that contained no more than one person from one column while allowing any number from the other, as a way to foster some sort of interaction? You could have either only one person from column A or only one from column B. I MADE THIS EXAMPLE UP, but this could be an example that helps people understand what you are hoping to do and maybe create several test cases and see if the algorithm handles them.
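Here is a rough sketch of how that made-up guard rule could be turned into something testable. I am assuming the rule means a group may leave if it has at most one person from column A or at most one person from column B, and that every name is unique. The names and the function allowed_groups are invented for illustration.

```python
from itertools import combinations

# Hypothetical data: two columns of friends, all names unique.
col_a = ["Ann", "Bob"]
col_b = ["Cy", "Di"]


def allowed_groups(col_a, col_b):
    """List every non-empty group the guards would let out.

    Assumed rule: at most one person from column A, or at most one
    person from column B.
    """
    people = col_a + col_b
    groups = []
    for size in range(1, len(people) + 1):
        for group in combinations(people, size):
            from_a = sum(1 for person in group if person in col_a)
            from_b = len(group) - from_a
            if from_a <= 1 or from_b <= 1:
                groups.append(group)
    return groups


groups = allowed_groups(col_a, col_b)
```

With two people per column, the only group that fails the check is all four leaving together. A test case that small is exactly the kind of thing that makes a question answerable.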

As currently stated, no motivation has been given why anyone wants to do this. Sadly, this is not uncommon especially in homework assignments.

But as noted, the goal here is not doing homework for people but rather helping people to learn to help themselves. A narrower question might have been that your algorithm returned doubles, so how do you remove the doubles. Or maybe the algorithm is returning empty lists and you wonder how to change it to avoid that.

I won't go on by trying to solve the problem but want to point out that the subject of these messages is about a Dataframe and wanting combinations of rows and columns. That may be a bit deceptive, as it looks like one possible idea here is not at all about dataframes. It sounds like the question is: given two (possibly more) collections of unrelated data, except perhaps all of the same size, take one collection at a time, connect each item with some kind of partial permutation of the second (or maybe more) collections, and then finally consolidate the results into a list containing sublists as needed.
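Purely as an illustration of that reading, and not as the answer to the unstated problem, here is a sketch using plain lists and no dataframe at all. I am assuming order inside a sublist does not matter, so I use itertools.combinations rather than true permutations. The function name pair_with_subsets and the sample values are hypothetical.

```python
from itertools import combinations


def pair_with_subsets(first, second):
    """Pair each item of `first` with every non-empty subset of `second`."""
    result = []
    for item in first:
        for size in range(1, len(second) + 1):
            for subset in combinations(second, size):
                result.append([[item], list(subset)])
    return result


# Hypothetical values standing in for two columns of a 2x2 table.
rows = pair_with_subsets([2, 3], [6, 5])
```

Running it the other way around, pair_with_subsets(second, first), and then merging the two results is where the duplicate question comes up, which is why the requirements matter.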

If the above makes sense, and perhaps it is nonsense, then it might suggest a different approach in which the data read in as a Dataframe is rapidly copied into other data structures and manipulated.

The volunteers here will work with questions asked even if they are not already perfectly presented, of course, but it can be way more effective if they know what they are being asked up front. And I find that the person asking can often figure out their own answer in the process of trying to explain it well to others. 

- Q

------- Original Message -------
On Friday, March 31st, 2023 at 7:55 AM, Peter Otten <__peter__ at web.de> wrote:


> On 31/03/2023 10:06, Roel Schroeven wrote:
> 
> > Op 30/03/2023 om 13:41 schreef marc nicole:
> > 
> > > @Peter what you did is more or less what i was looking for except that i
> > > see "duplicate" tuples e.g., ([2], [6, 5]) & ([6, 5], [2]) which are
> > > unwanted in the final output
> 
> > There were duplicates in your example output too: you had both [[2],
> > [6]] and [[6], [2]], and [[2], [5]] and [[5], [2]]. That's probably one
> > of the reasons people asked you to state the requirements clearly.
> 
> 
> As Roel says. I gave you a possible solution with no duplicates and
> another with all duplicates, but your reference solution has some
> duplicates.

