[Tutor] text processing lines variable content
Mark Lawrence
breamoreboy at gmail.com
Wed Feb 6 13:07:10 EST 2019
On 06/02/2019 16:33, ingo janssen wrote:
> For parsing the output of the Voro++ program and writing the data to a
> POV-Ray include file I created a bunch of functions.
>
> def pop_left_slice(inputlist, length):
>     outputlist = inputlist[0:length]
>     del inputlist[:length]
>     return outputlist
That's going to be a lot of work slicing and dicing the input lists.
Perhaps a chunked recipe like this
https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.chunked
would be better.
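
As a rough sketch of the idea using only the standard library: chunked
itself yields equal-sized pieces, so for fields of varying length a plain
iterator with itertools.islice may fit better. The take helper and the
sample line below are made up for illustration.

```python
from itertools import islice


def take(fields, n):
    """Return the next n items from the iterator `fields` as a list."""
    return list(islice(fields, n))


# Made-up sample line: label, point (3 values), vertex count, vertices.
line = "1 0.5 0.25 0.125 3 (1,2,3) (4,5,6) (7,8,9)"
fields = iter(line.split())

label = take(fields, 1)
point = take(fields, 3)
count = int(take(fields, 1)[0])
vertices = take(fields, count)
```

The iterator remembers its position, so nothing is copied or deleted from
the original list.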
>
> this is used by every function to chop off the required part of the input
> line.
> Two examples of the functions that process a chopped-off slice of the line
> and append the data to the appropriate list.
>
> def f_vector(outlist):
>     x,y,z = pop_left_slice(line,3)
>     outlist.append(f"<{x},{y},{z}>,")
>
> def f_vector_array(outlist, length):
>     rv = pop_left_slice(line, length)
>     rv = [f'<{i[1:-1]}>' for i in rv]  # i format is: '(1.234,2.345,3.456)'
>     rv = ",".join(rv)
>     outlist.append(f" //label: {lbl}\n array[{length}]"+"{\n "+rv+"\n }\n")
>
> Every line can contain up to 21 data chunks. Within one file each line
> contains the same number of chunks, but it varies between files. The
> types of chunks vary and their position varies. I know beforehand how a
> line in a file is constructed. I'd like to adapt the order in which the
> functions are applied, but how?
I suspect that you're trying to overcomplicate things. What's wrong
with a simple if/elif chain, a switch based on a dict, or similar?
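
A minimal sketch of the dict-based switch, assuming each file's layout can
be written down as a list of field names. The parser names, the state dict
and the sample line are hypothetical, not taken from your code.

```python
from itertools import islice


def parse_label(fields, state):
    return next(fields)


def parse_vector(fields, state):
    x, y, z = islice(fields, 3)
    return f"<{x},{y},{z}>"


def parse_count(fields, state):
    # Remember the count so a later array parser knows how much to read.
    state["count"] = int(next(fields))
    return state["count"]


def parse_vector_array(fields, state):
    items = islice(fields, state["count"])
    # Each item looks like '(1.234,2.345,3.456)'; strip the parentheses.
    return ",".join(f"<{item[1:-1]}>" for item in items)


PARSERS = {
    "label": parse_label,
    "vector": parse_vector,
    "count": parse_count,
    "vector_array": parse_vector_array,
}


def parse_line(line, layout):
    """Apply the parsers named in `layout`, in order, to one input line."""
    fields = iter(line.split())
    state = {}
    return [PARSERS[name](fields, state) for name in layout]


# The layout is the only thing that changes from file to file.
layout = ["label", "vector", "count", "vector_array"]
result = parse_line("7 0.5 1.5 2.5 2 (1,2,3) (4,5,6)", layout)
```

Reordering the fields for another file then means editing the layout list
rather than the loop body.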
>
> for i, line in enumerate(open("vorodat.vol",'r')):
>     points = i+1
enumerate takes a start argument, e.g. enumerate(open("vorodat.vol"), start=1),
so you shouldn't need the above line.
>     line = line.strip()
>     line = line.split(" ")
>     lbl = f_label(label)
>     f_vector(point)
Presumably the above is points?
>     f_value(radius)
>     v=f_number(num_vertex)
>     f_vector_array(rel_vertex,v)
>     f_vector_array(glob_vertex,v)
>     f_value_array(vertex_orders,v)
>     f_value(max_radius)
>     e=f_number(num_edge)
>     f_value(edge_dist)
>     ...etc
>
> I thought about putting the functions in a dict and then create a list
> with the proper order, but can't get it to work.
Please show us your code and exactly why it didn't work.
>
> A second question: all this works for small files with hundreds of
> lines, but some have 100000. Then I can get at most 22 lists with 100000
> items. Not fun. I tried writing the data to a file "out of sequence",
> not fun either. What would be the way to do this?
> I thought about writing each data chunk to a proper temporary file
> instead of putting it in a list first. This would require at most 22 temp
> files and then a merge of the files into one.
I'm not absolutely sure what you're saying here, but would something
like the SortedList from
http://www.grantjenks.com/docs/sortedcontainers/ help?
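
In case it helps, the temporary-file idea from the question could be
sketched with the standard library alone. This assumes one
whitespace-separated field per section and made-up section names;
write_sections is a hypothetical helper, not part of any library.

```python
import io
import shutil
import tempfile


def write_sections(lines, names, out):
    """Stream each field to its own temp file, then merge them into `out`."""
    temps = {name: tempfile.TemporaryFile(mode="w+") for name in names}
    try:
        for line in lines:
            # Only one line is held in memory at a time.
            for name, value in zip(names, line.split()):
                temps[name].write(value + "\n")
        for name in names:
            out.write(f"// {name}\n")
            temps[name].seek(0)
            shutil.copyfileobj(temps[name], out)
    finally:
        for handle in temps.values():
            handle.close()


# Usage with an in-memory output; a real run would pass an open file.
out = io.StringIO()
write_sections(["1 a", "2 b"], ["label", "value"], out)
merged = out.getvalue()
```

Memory use then stays roughly constant no matter how many lines the input
has, at the cost of some extra disk traffic.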
>
> TIA,
>
> ingo
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence