[Tutor] How to refactor a simple, straightforward script into a "proper" program?

Alan Gauld alan.gauld at yahoo.co.uk
Sat Jan 4 18:43:16 EST 2020


On 04/01/2020 07:10, boB Stepp wrote:
> I now believe that I have a finalized version2.py that I am satisfied
> with. 

I've just joined this thread, but I believe the intent was to produce an
industrial standard module?

A few comments...

> <final version2.py>

Terrible name for a module! :-)
Rename it to express what it does.

> ============================================================================
> #!/usr/bin/env python3
> """Calculate the number of pages per day needed to read a book by a
> given date."""

Give some examples. eg. If a book has 400 pages and there is one day it
will be 400, if 20 days, 20 pages/day. Something like that.

> 
> from datetime import date
> 
> 
> def get_input(str_converter, msg):
>     """Prompt user with a message and return user input with the
> correct data type."""

That's what it says it does but....

>     while True:
>         try:
>             return str_converter(input(msg))
>         except ValueError:
>             if str_converter == int:
>                 print("\nPlease enter a whole number!")
>             elif str_converter == date.fromisoformat:
>                 print("\nPlease enter a valid date in the following
> format yyyy-mm-dd!")

And if its not one of those types? I get a ValueError raised then it
does nothing. It just returns None. That's not very friendly.
Maybe you should mention that. And describe the type limitations
in the doc string. Or rename the function to get_number_input()

This sounds like a reusable function, but in fact its pretty tied into
the problem at hand.

> def get_input_params():
>     """Collect all needed input parameters."""

What? For anything? Wow! a magic function that I can use to get any
input parameters I ever need. Fantastic. Except.... That's not
really what it does.

Again this is not reusable code, its just a bit of the main function
broken out. Thats OK but you need to tell people that its problem
specific and maybe indicate its not reusable by renaming it
_get_input_params() or something?


>     while True:
>         str_converters = [int, int, date.fromisoformat]
>         input_msgs = [
>             "How many pages are there in your book?  ",
>             "How many pages have you read?  ",
>             "What is your date to finish the book (yyyy-mm-dd)?  ",
>         ]
>         num_pages_in_book, num_pages_read, goal_date = map(
>             get_input, str_converters, input_msgs
>         )
>         input_params = num_pages_in_book, num_pages_read, goal_date
>         if validate_input_params(*input_params):
>             return input_params
>         print()

Frankly this isn't helping with DRY, you could just as well put the
innards into main(). It is all specific to the main function, there is
only one call to the function. So you might as well put the code where
it is needed.

Personally, I think 3 calls to get_input() with the different
arguments would be nicer too. Its not repeated code because you are
passing different values. Its like using print(). Just because we call
print() multiple times doesn't mean we should build some kind of output
function that stores all of the output strings and then displays them,
with pauses at key points etc. It doable (via yield etc) but not good
code. We just call print() with different values.

> def validate_input_params(num_pages_in_book, num_pages_read, goal_date):
>     """Verify input parameters for logical correctness."""

Needs much more work to describe what goes in, what comes out and what
exceptions might be thrown.... see comments below on that topic.


>     valid_input = True
>     print()
>     if num_pages_in_book <= 0:
>         print("The number of pages in a book must be a positive integer!")
>         valid_input = False
>     if num_pages_read < 0:
>         print("The number of pages read must be a whole number!")
>         valid_input = False
>     if num_pages_read == num_pages_in_book and num_pages_in_book > 0:
>         print(
>             "You have already read the entire book!  Why are you
> running this program?"
>         )

Note, that last clause returns True. So even though it makes no sense
you consider these valid inputs? Maybe that needs to be explained in the
doc-string too.

But...
Validation functions really shouldn't have lots of prints in them.
Separate logic from presentation. This should either return a boolean.
Or you can throw suitable errors as found. Let the caller decide what
to tell the user. This function should only validate the inputs.

>     if num_pages_read > num_pages_in_book:
>         print("You cannot have read more pages than there are in the book!")
>         valid_input = False
>     if goal_date < date.today():
>         print("You cannot set a goal date in the past!")
>         valid_input = False
>     return valid_input
> 

Personally I'd opt to raise ValueErrors for te fails with the strings
attached. Then return True if the function passes (so you can use
it in a test).

> def calc_pages_per_day(num_pages_in_book, num_pages_read, goal_date):
>     """Return number of pages to read each day to attain goal."""

I'd remove the calc from the name. Call it after what it returns.
I'd rather read

ppd = pages_per_day(bp, rp, due)

than

ppd = calc_pages_per_day(bp,rp,due)

Maybe its just me but the calc seems implied by the function call.

>     COUNT_TODAY = 1
>     days_to_goal = (goal_date - date.today()).days
>     pages_per_day = (num_pages_in_book - num_pages_read) /
> (days_to_goal + COUNT_TODAY)
>     return pages_per_day

> def display_result(pages_per_day):
>     """Display how many pages per day the reader must read to finish
> the book."""
>     if 0 < pages_per_day <= 1:
>         word_to_display = "page"

Would you really say page if it was 0.5 pages per day?
I'd only use 'page' if it was exactly 1.

But since its a float you need to use an epsilon value.
So

e = 0.0001
if 1-e < pages_per_day < 1+e:
    word_to_display = 'page'
else: word_to_display = 'pages'

>     else:
>         word_to_display = "pages"
>     print(
>         "You must read {0:.2n} {1} per day to reach your goal.".format(
>             pages_per_day, word_to_display
>         )
>     )

> def main():
>     """Run program and display results."""
>     print(
>         'Welcome to "HOW MANY PAGES???"\n'
>         "Follow on-screen directions to determine how many pages to
> read each day to "
>         "finish your book.\n\n"
>     )

Why the mixed quotes? Make them all single quotes?
Or use one pair of triple quotes?

>     while True:
>         input_params = get_input_params()
>         display_result(calc_pages_per_day(*input_params))
>         run_again = input("Enter 'q' to quit or anything else to continue:  ")
>         if run_again.lower().startswith("q"):
>             break
> 
> if __name__ == "__main__":
>     main()

In summary:
1) expand the doc-strings to describe input/output/errors
   and limitations. Think about using the type inference
   style notation for showing input and return types.
2) if a function isn't generally reusable make it obvious
   by putting an underscore in front of the name. (or
   make it generally reusable...usually much harder :-)
3) separate presentation and logic.
4) use exceptions more

Finally, consider wider usage. eBooks don't have consistent
numbers of pages - it depends on font-size set by the user
etc. So maybe an option to use lines or characters instead
of pages? Or even percent as reported by a Kindle?

The calculations are the same.
The validation might need extra work.
The output display would be different - needing a unit added?


-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



More information about the Tutor mailing list