[Tutor] Required help in understanding following code

Alan Gauld alan.gauld at yahoo.co.uk
Sat Nov 21 06:26:27 EST 2020


On 21/11/2020 04:00, Shubhangi Patil via Tutor wrote:

> I need guidance to understand following code 
> 
Me too. That's one of the most convoluted list comprehensions
I've seen.

I've tried to reformat it.

> image_data = [{"data": typ,
>                "class": name.split('/')[0],
>                "filename": name.split('/')[1]}
>              for dataset, typ in zip([train_dataset, validation_dataset, test_dataset],
>                                      ["train", "validation", "test"])
>              for name in dataset.filenames]
> 
> image_df = pd.DataFrame(image_data)

But it's still pretty complex, so it might be better to unwind the
comprehension. I'll add comments to explain it.

# create an empty list to receive the dictionaries
# we are about to create.
image_data = []

# pair each dataset with its label, giving tuples of the
# form (dataset, "label")
for dataset, typ in zip([train_dataset, validation_dataset, test_dataset],
                        ["train", "validation", "test"]):

    # now extract file names from each dataset
    for name in dataset.filenames:

        # create a dictionary with the extracted data
        data = {
            "data": typ,    # one of: train, validation, test
            "class": name.split('/')[0],    # first part of filename path(*)
            "filename": name.split('/')[1]  # second part of filename path(*)
        }
        image_data.append(data)  # add it to the list

# create a Pandas data frame using our list of dicts
image_df = pd.DataFrame(image_data)
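
If you want to convince yourself that the unwound loops build the same
list as the comprehension, here is a small sketch you can run without
the real data. The SimpleNamespace objects and the file names are
stand-ins I made up; the only thing assumed about the real datasets is
that they have a 'filenames' attribute. I've left pandas out so it runs
anywhere.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the real datasets: only .filenames is assumed.
train_dataset = SimpleNamespace(filenames=["cats/c1.jpg", "dogs/d1.jpg"])
validation_dataset = SimpleNamespace(filenames=["cats/c2.jpg"])
test_dataset = SimpleNamespace(filenames=["dogs/d2.jpg"])

datasets = [train_dataset, validation_dataset, test_dataset]
labels = ["train", "validation", "test"]

# the original comprehension
comprehension = [{"data": typ,
                  "class": name.split('/')[0],
                  "filename": name.split('/')[1]}
                 for dataset, typ in zip(datasets, labels)
                 for name in dataset.filenames]

# the same thing as two nested loops
unwound = []
for dataset, typ in zip(datasets, labels):
    for name in dataset.filenames:
        unwound.append({"data": typ,
                        "class": name.split('/')[0],
                        "filename": name.split('/')[1]})

print(unwound == comprehension)
print(unwound[0])
```

The first print shows True and the second shows
{'data': 'train', 'class': 'cats', 'filename': 'c1.jpg'}.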

(*) - Note that this is incredibly fragile and should probably
use the os.path module to extract the filename and path.
The solution above relies on a very specific file structure:
every name must look like class/filename, with '/' as the separator.
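
As a rough sketch of the os.path approach, something like the
following would do. The function name split_class_and_file is my own
invention, and I'm assuming the class is the directory immediately
containing the file. Note that this differs slightly from the original,
which takes the first path component rather than the last directory.

```python
import os.path

def split_class_and_file(name):
    """Return (class_name, file_name) for a path like 'cats/c1.jpg'."""
    # split off the final component (the file name)
    directory, file_name = os.path.split(name)
    # the class is assumed to be the last directory in what remains
    class_name = os.path.basename(directory)
    return class_name, file_name

print(split_class_and_file("cats/c1.jpg"))
print(split_class_and_file("data/train/cats/c1.jpg"))
```

Both calls print ('cats', 'c1.jpg'). A name with no directory part
gives an empty class name instead of raising an IndexError.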

> In above code instead of zip files, 

The code above does not use zip files. The zip() function "zips"
two sequences of data together. Experiment in the >>> prompt to
see how it works:

>>> list(zip([1,2,3],['a','b','c']))
[(1, 'a'), (2, 'b'), (3, 'c')]

> I would like to upload files from my computer folder. 

That's what this seems to be doing, although the exact nature
of the data format is not clear. Somehow you would need
to set the 'filenames' attribute of each dataset.
But where the three datasets come from is not clear
in the code snippet you sent.
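
Since I don't know what library produced your datasets, the best I can
offer is a sketch with a home-made stand-in. It assumes your folder
holds one sub-folder per class with the image files inside, and it
builds an object whose only feature is a 'filenames' list of
class/filename strings. The name make_dataset and the folder layout are
my assumptions, not anything taken from your code.

```python
import os
from types import SimpleNamespace

def make_dataset(root):
    """Build a stand-in dataset whose .filenames are 'class/file' names under root."""
    filenames = []
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files at the top level
        for file_name in sorted(os.listdir(class_dir)):
            # always join with '/' so the later split('/') works
            filenames.append(class_name + "/" + file_name)
    return SimpleNamespace(filenames=filenames)
```

You would then call it once per folder, for example
train_dataset = make_dataset("images/train"), where "images/train" is
just a made-up path, and feed the three results to the loop above.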

> Please explain in details above code.

Hopefully my deconstruction and explanation make sense.
But since it is not complete code we can't be sure what
you need to do to meet your aims.

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



