[Tutor] How to get script to detect whether a file exists?

Steven D'Aprano steve at pearwood.info
Tue Aug 3 01:57:13 CEST 2010


On Tue, 3 Aug 2010 08:42:01 am Richard D. Moores wrote:
> On Mon, Aug 2, 2010 at 14:28, Hugo Arts <hugo.yoshi at gmail.com> wrote:
> > On Mon, Aug 2, 2010 at 8:58 PM, Richard D. Moores <rdmoores at gmail.com> wrote:
> >> OK, here's my attempt: <http://tutoree7.pastebin.com/3YNJLkYc>.
> >>  Better?
> >
> > Much better. But why is that F in the argument list of both
> > functions? It's overwritten immediately with the open() call, so it
> > seems unnecessary to include in the arguments.
>
> I need it for lines 36 and 37, don't I?
>
> Here's the latest incarnation:
> <http://tutoree7.pastebin.com/7tDpX2nT>


One hint is that functions that do similar things should have similar 
signatures. (The signature of a function is the list of arguments it 
takes.) Since create_pickle_file and repickling do similar things, they 
should look similar. The F argument isn't used: it's immediately 
thrown away and replaced, so drop it:

def create_pickle_file(path, D):
    F = open(path, 'wb')
    pickle.dump(D, F)
    F.close()
 
def repickling(path, D):
    F = open(path, 'wb')
    pickle.dump(D, F)

Now, let's look carefully at the bodies of the functions. The first 
shares 2 lines out of 3 with the second. The second shares 100% of its 
body with the first. The *only* difference is a trivial one: in the 
first function, the opened file is explicitly closed, while in the 
second function, the opened file is automatically closed.

In other words, your two functions do EXACTLY the same thing. Let's 
throw one out, and give the survivor a nicer name and some help text 
and documentation (although in this case the function is so simple it 
hardly needs it):

def save(path, D):
    """Open file given by path and save D to it.

    Returns nothing.
    """
    F = open(path, 'wb')
    pickle.dump(D, F)
    F.close()

Notice that the function name says *what* it does ("save") rather than 
*how* it does it ("pickling"). You might change your mind and decide 
later that instead of pickle you want to use an INI file, or JSON, or 
XML, or something else.

Also, I prefer to explicitly close files, although on a small script 
like this it makes no real difference. Feel free to leave that line 
out.
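
If you want the file closed promptly without writing the close() call 
yourself, a with block is one option. This is only a minimal sketch, 
assuming Python 2.6 or later; the name save_with is made up for 
illustration and is not part of your script:

```python
import pickle

def save_with(path, D):
    """Open file given by path and save D to it.

    Returns nothing.
    """
    # The with block closes F when the block ends, even if
    # pickle.dump raises an exception partway through.
    with open(path, 'wb') as F:
        pickle.dump(D, F)
```

It behaves the same as save() as far as the caller is concerned, so 
which one you use is mostly a matter of taste in a script this small.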

Now how do you use it? Let's look at what you do next:

try:
    F_unused = open(path1, 'rb')
    F_used = open(path2, 'rb')
except IOError:
    unused_ints = [x for x in range(1, record_label_num_pages + 1)]
    used_ints = []
    create_pickle_file(path1, 'F_unused', unused_ints)
    create_pickle_file(path2, 'F_used', used_ints)
    print("Pickle files have been created.") 
    print()

There are a few problems there. For starters, if *either* file is 
missing, BOTH get re-created. Surely that's not what you want? You only 
want to re-create the missing file. Worse, if an IOError does occur, the 
variables F_used and F_unused may never get set, so when you try to use 
those variables later, you'll have a problem.
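
One way to treat each file independently is to put the "does it exist?" 
check inside a small helper and call it once per file. The following is 
a rough sketch, not a drop-in replacement: load_or_create is a 
hypothetical name, and os.path.exists() can give a stale answer if 
something else creates or deletes the file between the check and the 
open.

```python
import os
import pickle

def load_or_create(path, default):
    """Return the pickled contents of path.

    If path does not exist, create it holding default and return default.
    """
    if not os.path.exists(path):
        # Only this file is missing, so only this file gets created.
        with open(path, 'wb') as f:
            pickle.dump(default, f)
        return default
    with open(path, 'rb') as f:
        return pickle.load(f)
```

Called once with path1 and once with path2, it creates only the file 
that is actually missing, and the caller always gets data back rather 
than a file variable that may not exist.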

I'm not sure why you need to store both used and unused numbers. Surely 
you can calculate one from the other? So here are my thoughts...


import pickle
import sys
from time import sleep

record_label_name = "ABC_Classics"
record_label_num_pages = 37

used_ints_pickle_filename = record_label_name + "_used_ints.pkl"
path = '/p31working/Pickles/' + used_ints_pickle_filename

# These are the available page numbers.
pool = range(1, record_label_num_pages + 1)

def save(path, D):
    """Open file given by path and save D to it.

    Returns nothing.
    """
    F = open(path, 'wb')
    pickle.dump(D, F)
    F.close()

def load(path):
    """Open file given by path if it exists, and return its contents.
    If it doesn't exist, save and return the default contents.
    """
    try:
        f = open(path, 'rb')  # pickle data is binary, so read in binary mode
    except IOError:
        # File *probably* doesn't exist. Consider better error checking.
        data = []
        save(path, data)
    else:
        data = pickle.load(f)
        f.close()
    return data

used_page_numbers = load(path)
unused_page_numbers = [n for n in pool if n not in used_page_numbers]
if not unused_page_numbers:
    print("All pages checked.")
    print("Program will now close.")
    sleep(2.1)
    sys.exit()


Now that you have a list of unused page numbers, continue on with 
the rest of your program.
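
As for the "better error checking" mentioned in the comment inside 
load(): one possibility is to look at the error number on the exception 
and only treat "no such file" as meaning "nothing saved yet". This is a 
sketch under that assumption. The name load_checked is invented for the 
example, and unlike load() it does not write the default back to disk.

```python
import errno
import pickle

def load_checked(path):
    """Return the pickled contents of path, or [] if path does not exist.

    Any other I/O error (permissions, path is a directory, ...) is re-raised.
    """
    try:
        f = open(path, 'rb')
    except IOError as e:
        if e.errno != errno.ENOENT:
            # Not a "file not found" error, so don't hide it.
            raise
        return []
    try:
        return pickle.load(f)
    finally:
        f.close()
```

That way a typo in the directory permissions or a full disk shows up as 
a traceback instead of silently looking like an empty list.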



-- 
Steven D'Aprano

