[Tutor] Python text file read/compare

Alan Gauld alan.gauld at yahoo.co.uk
Tue Oct 4 04:40:34 EDT 2022


On 03/10/2022 21:14, samira khoda wrote:

> Thank you very much for your feedback.I just started working on the file
> again. Unfortunately I am still not getting anywhere with the modifications
> I made to the codes.  I don't know when I can create the new file and write
> to it and close it. 

Can you write a program to simply split your input file(s) after every N
lines? i.e. something like:

open the input
open output
linecount = 0
for line in input
    linecount += 1
    if linecount > N
        close output
        open new output
        linecount = 1
    write line to output

If you can get that to work then you can replace the line

if linecount > N

with

if zerotimestamp(line)

And write a function to determine if the timestamp
indicates a new file is needed.

But get the basic structure in place first, don't
try to solve both problems at once.
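
As a rough sketch only, the outline above might look something like this
in Python. The function name split_every_n and the part_1.txt, part_2.txt
naming pattern are placeholders of mine, not anything from your code, and
I open the first output file lazily rather than before the loop:

```python
def split_every_n(in_path, n, out_pattern="part_{}.txt"):
    """Copy in_path into numbered files holding at most n lines each.

    Returns the list of output file names that were created.
    """
    created = []
    out = None
    linecount = 0
    try:
        with open(in_path) as infile:
            for line in infile:
                # Start a new output file for the first line
                # and again once the current file holds n lines.
                if out is None or linecount >= n:
                    if out is not None:
                        out.close()
                    name = out_pattern.format(len(created) + 1)
                    out = open(name, "w")
                    created.append(name)
                    linecount = 0
                out.write(line)
                linecount += 1
    finally:
        # Make sure the last output file is closed.
        if out is not None:
            out.close()
    return created
```

You would call it with something like split_every_n("hfx-test-.txt", 1000)
and then check that the pieces contain what you expect before moving on to
the timestamp test.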

> Basically as I mentioned before I need to find where
> the timestamps jump down to zero or the lowest number then start time so I
> can split the file and make a new file.***For your information the
> timestamps are in milliseconds***

I think you will need to explain how that works. Your sample
data below is not clear. You need to show us a sample where the
timestamp is not zero then one that is. And explain which field
defines that.


> 
> kyk=( 'hfx-test-.txt'  , 'r')

Note that kyk is just a tuple of 2 strings. Nothing here calls open(),
so no file has been opened.


> 
> line_numbers=[]
> i=0
> current_time=""
> count=1
> 
> for line in kyk:

So this loop will just yield the 2 strings 'hfx-test-.txt' and 'r',
not the lines of the file.

>     if i < 488:
>         i=0
>         line_numbers.append(current_time)
>         current_time=""
>     else:
>         current_time+=line
>         i+=1

This doesn't make much sense to me. Where does the magic
number 488 fit in? First time through the loop i will
always be zero so you will always start the list with
an empty string in line_numbers. (BTW line_numbers is a
very misleading name since it appears to be a list of times?)

And because i is always 0 you never go into the else part
so it never gets increased. So the for loop executes twice
storing 2 empty strings in the line_numbers list.

Even if you opened the file and iterated over it you would
still just get a blank string for each line in the file.
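
To see the difference, compare looping over the tuple with looping over
a real file object. This is only a small illustration; demo.txt is a
made-up file that the example creates in a temporary directory so it can
run on its own:

```python
import os
import tempfile

# A tuple of two strings: the loop sees the strings, not the file contents.
kyk = ('hfx-test-.txt', 'r')
tuple_items = [item for item in kyk]
print(tuple_items)

# Create a small throwaway file so there is something to read.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'w') as f:
    f.write('first line\nsecond line\n')

# open() returns a file object that yields one line per iteration.
with open(demo_path) as f:
    file_lines = [line for line in f]
print(file_lines)
```

The first print shows the two strings from the tuple, the second shows
the two lines stored in the file.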

> **************this is where It does not write to the file*****
> 
> *opf=open("kyk_txt_new_file.txt","w")*
> 
> *    if current_time <start_time:*
> 
> *        splitlines(True).add(line_number);*
> 
> *    opf.write("this is the timestamps after the line splits.")*
> 
> *    kyk.close()*


And this makes even less sense.
Where is start_time coming from?
What is splitlines()? Where is it defined?
And the only thing you write to the file is the
message; you never write any data.

And you close kyk but not opf? (kyk is a tuple, which has
no close() method, so that call would fail anyway.)
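
One way to avoid forgetting a close() is the with statement, which closes
both files when the block ends. A minimal sketch; copy_lines is a
placeholder name of mine and it simply copies every line, so you would
still need to add your own test for which lines to write:

```python
def copy_lines(in_path, out_path):
    """Write every line of in_path to out_path; return how many were written."""
    count = 0
    with open(in_path) as src, open(out_path, "w") as opf:
        for line in src:
            opf.write(line)   # write the data itself, not just a message
            count += 1
    # Both files are closed here, even if an error occurred inside the block.
    return count
```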

> *Here are the first and last few lines of my text file data.  *
> 
> time stamps
> -1.75, 1.08, 10.35, -0.10, -0.01, -0.01, 23.19, *488*
> -1.75, 1.12, 10.39, -0.10, -0.01, -0.01, 23.20, *521*
> 
> 9.65, -1.31, -1.95, -0.11, -0.06, -0.02, 22.05, *15339436*
> 9.56, -1.32, -1.97, -0.10, -0.00, -0.01, 22.05, *15339495*

Does a zero timestamp appear in any of those lines?
This is just a set of numbers. Are they all timestamps?
If so, why is the last number much bigger than the rest?
And is the 488 at the end of line 1 significant in being
the same as the magic number in your code?


> 
> *I was also provided with the * pseudocode * below which I am trying to
> follow if that helps to guide me along the way.*

The pseudo code makes some sense - although a for loop
would be simpler than the while. And it does not appear
to be doing what you describe as the required task.

But your code is not even close to what it does.

> -> load sourceFile (a copy of the raw data file)

I think that should say open sourcefile, not load...

> line_number = 0 
> start_time = 0
> split_numbers = []
> not_done = true
> 
> while(not_done):
>        ->read line from sourcefile
>        ->split line on ','
>        ->convert last item on line to unsigned long and store in
> current_time
> 
>        if current_time < start_time:
>               split_numbers.add(line_number)
>               start_time = current_time
>        if end_of_file:
>               not_done = false;
> 
> for s in split_numbers:
>        ->create newfile
>        for i = 0, i < s, i++:
>               ->read line from sourcefile
>               ->write line to newfile
> 
>        ->close newfile
> 
> ->Close sourcefile
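
For what it's worth, here is one way the general idea could be sketched
in Python using a single for loop. This is my interpretation, not a
translation of the pseudocode: I am assuming the last comma-separated
field is the timestamp, that the asterisks in your sample are just email
markup, and that a new file should start whenever a timestamp is lower
than the previous one. The function names and the split_1.txt naming
pattern are invented for the example:

```python
def last_field_as_int(line):
    """Return the last comma-separated field as an int, or None if not a number."""
    field = line.rsplit(",", 1)[-1].strip().strip("*")
    try:
        return int(field)
    except ValueError:
        return None


def split_on_timestamp_drop(in_path, out_pattern="split_{}.txt"):
    """Start a new output file each time the timestamp goes down.

    Lines without a usable timestamp (headers, blank lines) are copied to
    the current output file and do not trigger a split.
    Returns the list of output file names that were created.
    """
    created = []
    out = None
    previous = None
    try:
        with open(in_path) as infile:
            for line in infile:
                current = last_field_as_int(line)
                dropped = (current is not None and previous is not None
                           and current < previous)
                if out is None or dropped:
                    if out is not None:
                        out.close()
                    name = out_pattern.format(len(created) + 1)
                    out = open(name, "w")
                    created.append(name)
                out.write(line)
                if current is not None:
                    previous = current
    finally:
        if out is not None:
            out.close()
    return created
```

If the real rule is different (for example, only a drop to exactly zero
counts), only the line that sets dropped needs to change.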

-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos