my excel file is not updated to add new data

dn PythonList at DancesWithMice.info
Mon Jun 26 19:54:33 EDT 2023


Marc,

Apologies for writing in English (the language of this Discussion List). 
I have attempted to keep the code in French...

Thanks for this question. If it's OK with you, I would like to use it as 
an example-problem (and 'solution') challenge for some future Python 
Users' Group meeting (suitably translated, etc)...


On 27/06/2023 05.46, small marcc via Python-list wrote:

> This code creates the path to the Excel file where the data will be written. It checks if the file already exists, and if so, reads the existing data into a DataFrame. Otherwise, it creates a new empty DataFrame. Then it concatenates the existing data with the new data and writes the result to the Excel file. Finally, it displays a message that the backup was successful and closes the information window.
> 
> I do not understand why it does not work: the new data is not displayed in Excel when I am in Python/tkinter, even though my data from Python should appear in Excel.
> I use the following code to write data to an excel file using pandas:
> 
> Please help me. However, when I run this code, the new data is not showing in the Excel file. I don't understand why this isn't working. Can you help me solve this problem?

Is this a Python problem, or a problem with the way the solution has 
been coded?

There are several steps in the solution-description. It is troubling (to 
both of us) that you are unable to locate the source of the error. Why? 
Because it is not noticed until the final stage "new data is not displayed"!

NB am ignoring that the line defining "df", and everything thereafter, 
appears to be incorrectly indented - when viewed in my email-client. 
That said, the sequence of events seems muddled.


Please review the code. Compare it to the verbal/English-language 
description of the solution. Can each sentence in the solution be easily 
identified in the code?

Words like "and", "otherwise", "then", and "finally" indicate that there 
are multiple steps in the process. Let's split the separate sentences 
and phrases:

- creates the path to the Excel file where the data will be written
- checks if the file already exists
[establish DataFrame]
   - if so, reads the existing data into a DataFrame
   - Otherwise, it creates a new empty DataFrame
- concatenates the existing data with the new data
- writes the result to the Excel file
- displays a message that the backup was successful
- closes the information window.

NB addition of a task-label summarising the two DataFrame alternative 
situations.


Now, we can outline the Python code by writing a bunch of function 
definitions:

def create_data_path():
def check_for_file():
def establish_data_frame():
   def read_existing_data():
   def create_new_data_frame():
def accept_new_data():
def save_to_file():
def report():
def terminate():

NB my IDE would grumble and complain if I did this 'for real'. In a 'new 
code' situation, you may like to add a pass-statement inside each, and 
vertical-spacing to make the distracting-complaints 'go away'. 
Alternately and assuming the function-name doesn't say it all, add a 
docstring (you're going to add one anyway, right?) and expand upon the 
function-name with some explanation of what the function will do. In 
this situation, you could add the existing code.
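
For example, a minimal stubbed-out version might look like this (purely 
illustrative - the docstring wording is mine, not taken from your code):

def create_data_path():
     """Build the full path to the Excel file from the file-name."""
     pass

def establish_data_frame():
     """Read existing data if the file exists, else create an empty DataFrame."""
     pass

...and so on for the remaining outline-functions.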

NBB the two sub-tasks have been indented here for continuity of flow. 
They may/may not be nested within establish_data_frame() when you detail 
the content of each function - not important to this outline.
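
To make the DataFrame-establishment step concrete, here is one possible 
sketch (only a sketch: it assumes pandas, that the new data arrives as 
its own DataFrame, and the parameter names are my invention - adapt to 
whatever your program actually does):

import os
import pandas as pd

def establish_data_frame(chemin_fichier_excel):
     # if the file already exists, read it; otherwise start empty
     if os.path.exists(chemin_fichier_excel):
         return pd.read_excel(chemin_fichier_excel)
     return pd.DataFrame()

def save_to_file(df_existant, df_nouveau, chemin_fichier_excel):
     # concatenate existing and new data, then write everything back out
     df_total = pd.concat([df_existant, df_nouveau], ignore_index=True)
     df_total.to_excel(chemin_fichier_excel, index=False)

[for non-French speakers:
df_existant = existing data
df_nouveau = new data
df_total = combined data
]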

NBBB I've quickly drafted function-names for the sake of this 
discussion. I find it easier to name things when it is time to use them 
(in this case, call the function). Fortunately, a competent IDE will 
facilitate the refactoring of names ("symbols"). Yes, doing it that way 
costs (me) little extra effort - but it saves straining my brain 
up-front. That said, investing time in 'designing' a good name, will 
also tend to clarify and sharpen one's thinking. So, apparently I follow 
both paths... YMMV!


Now that the solution has been outlined, one can consider what output 
you expect from each function, what input it will require to produce 
that, and the transformation in-between. For example:

def create_data_path( nom_fichier ):
     # copy existing code into here
     return chemin_fichier_excel

(pausing at the one function, for now...)

[for non-French speakers:
nom_fichier = file_name
chemin_fichier_excel = data path including file-name
]
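
As a sketch (assuming, purely for illustration, that the file lives in 
the user's home directory - substitute whatever directory your program 
really uses):

import os

def create_data_path(nom_fichier):
     # dossier (folder) is assumed, here, to be the user's home directory
     dossier = os.path.expanduser("~")
     chemin_fichier_excel = os.path.join(dossier, nom_fichier)
     return chemin_fichier_excel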


How has this design-process improved things?

1 clearer thinking: we have a simple and more straightforward idea of 
what should happen
2 "separation of concerns": each function does one job (and should do it 
well!)
3 testability: we can now ensure that the (each) function performs 
correctly (in isolation)
4 the problem of locating the source of an error is simplified (in fact, 
the "testability" advantage may solve this problem - or in a new-design 
situation, avoid it*)


Testing the function is carried-out by calling the function with a 
variety of example input-data, and ensuring the result is what you 
expect/want/require. Thus:

chemin = create_data_path( "testpython.xlsx" )
assert "chemin\testpython.xlsx" == chemin, "Not equal, but should be"

[for non-French speakers:
chemin = path
]


(an assert statement says: if the condition is False, raise an 
AssertionError carrying the optional message, and stop processing)
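
For instance, a failing assert at the interactive prompt looks like this:

>>> assert 1 + 1 == 3, "arithmetic has failed us"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError: arithmetic has failed us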

similarly:

chemin = create_data_path( "testpython.xlsx" )
assert "path_to_project\niaiserie.xlsx" != chemin, "Should not be equal"

[for non-French speakers:
niaiserie = some random name/silliness
]

Such tests should not be written as one-off code (the way one might add 
debug-printing - and then take it out again). If you ever need to copy 
such functions into other projects (eg create_data_path()), then you'll 
want to copy the tests as well, to ensure everything works in the 
new scenario!

Accordingly, create a new .py file named "test_" plus the name of the 
existing code-file. Inside that, import the code-file. Then code the 
tests in the test-file, to run each function in-turn (and several times, 
with different data), to assure each function in the code-file.
(yes, if you see virtue in this approach, there are some very helpful 
libraries to ease testing and automate it (better to have the machine do 
the work, rather than you having to (keep) doing it!), eg pytest 
(py.test) - worth some investigation time?)
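
For example, if the production code lived in (say) sauvegarde.py - that 
file-name is my invention - a companion test_sauvegarde.py might begin:

# test_sauvegarde.py -- run with:  pytest
import sauvegarde

def test_create_data_path_appends_file_name():
     chemin = sauvegarde.create_data_path("testpython.xlsx")
     assert chemin.endswith("testpython.xlsx"), "Not equal, but should be"

def test_create_data_path_rejects_other_name():
     chemin = sauvegarde.create_data_path("testpython.xlsx")
     assert not chemin.endswith("niaiserie.xlsx"), "Should not be equal"

pytest will discover and run every test_* function automatically.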


What has been achieved? By the time you've tested the first few 
functions, you will know whether or not existing files are being 
(correctly) used as 'base data'. Continuing 'down', you will discover if 
new data is being correctly added/updating existing data. Further on, 
you will find if the output is being correctly written to that file.

Sooner or later, you will have answered the "where is the error?" 
problem 'here'! You will also have noticed any similar errors which may 
be 'hiding' in the code, unnoticed (so far!)...
(and all without needing to ask someone-else for help)


* not entirely true. In this manner, each function can be 
relatively-well proven with "unit tests". There is however, an 
additional source of error: do the functions 'play well together'? 
Ensuring this happens is the objective of another type of test: 
"integration testing". For example, does check_for_file() work with the 
output from the preceding stage: create_data_path()? Yes, these sorts of 
tests could be included at the bottom of the 'unit test' script, or 
written in a separate test-file...
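
A sketch of one such integration test (using the draft function-names 
from the outline, and assuming check_for_file() returns a boolean):

def test_path_feeds_into_file_check():
     # does check_for_file() accept what create_data_path() produces?
     chemin = sauvegarde.create_data_path("testpython.xlsx")
     existe = sauvegarde.check_for_file(chemin)
     assert isinstance(existe, bool), "Expected a True/False answer"

[for non-French speakers:
existe = exists
]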


Not forgetting, the "mainline" of the production program[me] needs to be 
written! It will call each function in-turn, feeding specific data into 
(at least) the first, eg which Excel file? Thereafter, likely feeding 
output from earlier functions into later processing. Until, 'the end'. Fin!

NB the "mainline" will be a collection and compression of the 
imperatives which could be copied from any "integration testing".

NBB what I call the "mainline" (a throw-back to my training, so long ago 
it is lost in the mists of time...) fits into a Python idiom:

if __name__ == "__main__":
     print( "\nMy wonderful program\n" )  # you'll do this in the GUI
     intermediate_result_1 = function1()
     intermediate_result_2 = function2( intermediate_result_1 )
     ...

NB it seems long-winded (and my IDE often grumbles about this too!) but 
I often prefer creating 'intermediate results' rather than writing 
'earlier' functions as parameters to 'later' functions. If anything 
goes wrong (how could it, after all that testing?) it's easier to answer 
that famous question: "where is the error?".
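
To show the contrast (purely illustrative):

# nested calls - harder to see which step produced a bad value
final_result = function3( function2( function1() ) )

# intermediate results - each step can be inspected, printed, or asserted upon
intermediate_result_1 = function1()
intermediate_result_2 = function2( intermediate_result_1 )
final_result = function3( intermediate_result_2 )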

That said: there are exceptions to every 'rule' - and YMMV!

-- 
Regards,
=dn

