Strategy/ Advice for How to Best Attack this Problem?

Wed Apr 1 19:51:47 EDT 2015

On 04/01/2015 09:43 AM, Saran A wrote:
> On Tuesday, March 31, 2015 at 9:19:37 AM UTC-4, Dave Angel wrote:
>> On 03/31/2015 07:00 AM, Saran A wrote:
>>
>>   > @DaveA: This is a homework assignment. .... Is it possible that you
>> could provide me with some snippets or guidance on where to place your
>> suggestions (for your TO DOs 2,3,4,5)?
>>   >
>>
>>
>>> On Monday, March 30, 2015 at 2:36:02 PM UTC-4, Dave Angel wrote:
>>
>>>>
>>>> It's missing a number of your requirements.  But it's a start.
>>>>
>>>> If it were my file, I'd have a TODO comment at the bottom stating known
>>>> changes that are needed.  In it, I'd mention:
>>>>
>>>> 1) your present code is assuming all filenames come directly from the
>>>> commandline.  No searching of a directory.
>>>>
>>>> 2) your present code does not move any files to success or failure
>>>> directories
>>>>
>>
>> In function validate_files()
>> Just after the line
>>                   print('success with %s on %d reco...
>> you could move the file, using shutil.  Likewise after the failure print.
>>
>>>> 3) your present code doesn't calculate or write to a text file any
>>>> statistics.
>>
>> You successfully print to sys.stderr.  So you could print to some other
>> file in the exact same way.
>>
>>>>
>>>> 4) your present code runs once through the names, and terminates.  It
>>>> doesn't "monitor" anything.
>>
>> Make a new function, perhaps called main(), with a loop that calls
>> validate_files(), with a sleep after each pass.  Of course, unless you
>> fix TODO#1, that'll keep looking for the same files.  No harm in that if
>> that's the spec, since you moved the earlier versions of the files.
>>
>> But if you want to "monitor" the directory, let the directory name be
>> the argument to main, and let main do a dirlist each time through the
>> loop, and pass the corresponding list to validate_files.
>>
>>>>
>>>> 5) your present code doesn't check for zero-length files
>>>>
>>
>> In validate_and_process_data(), instead of checking filesize against
>> ftell, check it against zero.
>>
>>>> I'd also wonder why you bother checking whether the
>>>> os.path.getsize(file) function returns the same value as the os.SEEK_END
>>>> and ftell() code does.  Is it that you don't trust the library?  Or that
>>>> you have to run on Windows, where the line-ending logic can change the
>>>> apparent file size?
>>>>
>>>> I notice you're not specifying a file mode on the open.  So in Python 3,
>>>> your sizes are going to be specified in unicode characters after
>>>> decoding.  Is that what the spec says?  It's probably safer to
>>>> explicitly specify the mode (and the file encoding if you're in text).
>>>>
>>>> I see you call strip() before comparing the length.  Could there ever be
>>>> leading or trailing whitespace that's significant?  Is that the actual
>>>> specification of line size?
>>>>
>>>> --
>>>> DaveA
>>>
>>>
>>
>>> I ask this because I have been searching fruitlessly through for some time and there are so many permutations that I am bamboozled by which is considered best practice.
>>>
>>> Moreover, as to the other comments, those are too specific. The scope of the assignment is very limited, but I am learning what I need to look out or ask questions regarding specs - in the future.
>>>
>>
>>
>> --
>> DaveA
>
> @DaveA
>
> My most recent commit (https://github.com/ahlusar1989/WGProjects/blob/master/P1version2.0withassumptions_mods.py) has more annotations and comments for each file.

Perhaps you don't realize how github works.  The whole point is it 
preserves the history of your code, and you use the same filename for 
each revision.

Or possibly it's I that doesn't understand it.  I use git, but haven't 
actually used github for my own code.

>
> I have attempted to address the functional requirements that you brought up:
>
> 1) Before, my present code was assuming all filenames come directly from the commandline.  No searching of a directory. I think that I have addressed this.
>

Have you even tried to run the code?  It quits immediately with an 
exception since your call to main() doesn't pass any arguments, and main 
requires one.

 > def main(dirslist):
 >     while True:
 >         for file in dirslist:
 >         	return validate_files(file)
 >         	time.sleep(5)

In addition, you aren't actually doing anything to find what the files 
in the directory are.  I tried to refer to dirlist, as a hint.  A 
stronger hint:  look up  os.listdir()

And that list of files has to change each time through the while loop, 
that's the whole meaning of scanning.  You don't just grab the names 
once, you look to see what's there.

The next thing is that you're using a variable called 'file', while 
that's a built-in type in Python.  So you really want to use a different 
name.

Next, you have a loop through the magical dirslist to get individual 
filenames, but then you call the validate_files() function with a single 
file, but that function is expecting a list of filenames.  One or the 
other has to change.

Next, you return from main after validating the first file, so no others 
will get processed.

> 2) My present code does not move any files to success or failure directories (requirements for this assignment1). I am still wondering if and how I should use shututil() like you advised me to. I keep receiving a syntax error when declaring this below the print statement.

You had a function to do that in your very first post to this thread. 
Have you forgotten it already?  As for syntax errors, you can't expect 
any help on that when you don't show any code nor the syntax error 
traceback.  Remember that when a syntax error is shown, it's frequently 
the previous line or two that actually was wrong.

>
> 3) You correctly reminded me that my present code doesn't calculate or write to a text file any statistics or errors for the cause of the error.

 > #I wrote an error class in case the I needed such a specification for 
a notifcation - this is not in the requirements
 > class ErrorHandler:
 >
 >     def __init__(self):
 >         pass
 >
 >     def write(self, string):
 >
 >         # write error to file
 >         fname = " Error report (" + time.strftime("%Y-%m-%d %I-%M%p") 
+ ").txt"
 >         handler = open(fname, "w")
 >         handler.write(string)
 >         handler.close()

This looks like Java code.  With no data attributes, and only one 
method(function), what's the reason you don't just use a function?

original_stderr = sys.stderr # stored in case want to revert
sys.stderr = ErrorHandler()

This is just bogus.  There are rare times when a programmer might want 
to replace stderr, but that's just unnecessarily confusing in a program 
like this one.  When you want to write printable data to another file, 
just use that file's handle as the argument to the file= argument of 
print().  You could use an instance of ErrorHandler as a file handle. 
Or do it one of a dozen other straightforward ways.

While we're at it, why do you keep creating new files for your error 
report?  And overwriting all the old messages with whatever new ones 
come in the same minute?

> (Should I use the copy or copy2 method in order provide metadata? If so, should I wrap it into a try and except logic?)

No idea what you mean here.  What metadata?  And what do copy and copy2 
have to do with anything here?

>
> 4) Before, my present code runs once through the names, and terminates.  It doesn't "monitor" anything. I think I have addressed this with the  main function - correct?
>
Not even close.

> 5) Before, my present code doesn't check for zero-length files  - I have added a comment there in case that is needed)
>
> I realize appreciate your invaluable feedback. I have grown a lot with this assignment!
>
> Sincerely,
>
> Saran
>

-- 
DaveA