Lining Up and Padding Two Similar Lists

Fri Aug 29 02:02:00 EDT 2008

On Aug 29, 12:29 am, "W. eWatson" <notval... at sbcglobal.net> wrote:
> castironpi wrote:
>
> > This gets you your list.  What do you mean by 'missing member of
> > pairs'?  If you mean, 'set of elements that appear in both' or 'set
> > that appears in one but not both', you can short circuit it at line
> > 14.
>
> (a.dat, a.txt) is a pair. (None, a.txt) has a.dat missing. I just need to
> issue a msg to the user that one member of a file pair is missing. Both
> files need to be present to make sense of the data.
>
> > -warning, spoiler-
>
> It looks like you went beyond the call of duty, but that's fine. It looks
> like I have a few new features to learn about in Python. In particular,
> dictionaries. Thanks.
>
> Actually, the file names are probably in order as I pick them up in XP. I
> would think if someone had sorted the folder, that as one reads the folder
> they are in alpha order, low to high.
>
> --
>                                     W. Watson
>               (121.015 Deg. W, 39.262 Deg. N) GMT-8 hr std. time)
>                Obz Site:  39° 15' 7" N, 121° 2' 32" W, 2700 feet

I don't think that's guaranteed by anything.  I realized that
'dat.sort()' and 'txt.sort()' weren't necessary, since their contents
are moved to a dictionary, which isn't sorted.

both= set( datD.keys() )& set( txtD.keys() )

This will get you the keys (prefixes) that are in both.  Then, for
every prefix that's not in 'both', you can report it.
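
A minimal sketch of that report, assuming 'datD' and 'txtD' map each
prefix to its full file name (that layout is an assumption here, and
the sample names are only illustrative).  It uses print() with a single
argument so it also runs on current Python:

```python
import os.path

# Assumed layout: prefix -> full file name, e.g. 'a' -> 'a.dat'.
dat = ['a.dat', 'c.dat', 'g.dat', 'k.dat', 'p.dat']
txt = ['a.txt', 'b.txt', 'g.txt', 'k.txt', 'r.txt', 'w.txt']
datD = dict((os.path.splitext(x)[0], x) for x in dat)
txtD = dict((os.path.splitext(x)[0], x) for x in txt)

# Prefixes present in both dictionaries.
both = set(datD.keys()) & set(txtD.keys())

# Collect every file whose prefix has no partner, then report it.
unmatched = sorted(name for d in (datD, txtD)
                   for prefix, name in d.items() if prefix not in both)
for name in unmatched:
    print('%s not matched' % name)
```

Sorting the unmatched names is only there to make the report
repeatable, since neither dictionaries nor sets promise an order.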

Lastly, since you suggest you're guaranteed that 'txt' will all share
the same extension, you can do away with the dictionary and use sets
entirely, but only if you can depend on that assumption.

I took a look at this.  It's probably more what you had in mind, and
the dictionaries are overkill.

import os.path
dat= ['a.dat', 'c.dat', 'g.dat', 'k.dat', 'p.dat']
datset= set( [ os.path.splitext( x )[ 0 ] for x in dat ] )
print datset
txt= ['a.txt', 'b.txt', 'g.txt', 'k.txt', 'r.txt', 'w.txt']
txtset= set( [ os.path.splitext( x )[ 0 ] for x in txt ] )
print txtset
both= txtset & datset
for d in datset- both:
	print '%s.dat not matched'% d
for t in txtset- both:
	print '%s.txt not matched'% t

OUTPUT:

set(['a', 'p', 'c', 'k', 'g'])
set(['a', 'b', 'g', 'k', 'r', 'w'])
p.dat not matched
c.dat not matched
r.txt not matched
b.txt not matched
w.txt not matched
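
If the padded pairs themselves are wanted, as in the (None, a.txt)
example quoted above, the same sets can probably be stretched to build
them.  This is only a sketch: 'pair_up' is a made-up helper name, and
it assumes every file ends in exactly '.dat' or '.txt'.

```python
import os.path

def pair_up(dat, txt):
    """Return (dat_name, txt_name) tuples, with None for a missing member."""
    datset = set(os.path.splitext(x)[0] for x in dat)
    txtset = set(os.path.splitext(x)[0] for x in txt)
    pairs = []
    # Walk the union of prefixes in alphabetical order.
    for prefix in sorted(datset | txtset):
        d = prefix + '.dat' if prefix in datset else None
        t = prefix + '.txt' if prefix in txtset else None
        pairs.append((d, t))
    return pairs

dat = ['a.dat', 'c.dat', 'g.dat', 'k.dat', 'p.dat']
txt = ['a.txt', 'b.txt', 'g.txt', 'k.txt', 'r.txt', 'w.txt']
pairs = pair_up(dat, txt)
for pair in pairs:
    print(pair)
```

Any tuple containing None marks a file whose partner is missing, so the
message to the user can be issued while looping over the pairs.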