[TriPython] A question of recursion and nested parsing

Calloway, Chris cbc at unc.edu
Fri Sep 15 10:07:44 EDT 2017


In this case, machine memory might be the problem, in which case David’s suggestion is more optimal than recursion, as it allows for data streaming.

Think SAX parser vs DOM parser. They both have their use cases.

Very much appreciating seeing an actual Python discussion on this list. (

-- 
Sincerely,
 
Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530
 

On 9/15/17, 9:38 AM, "TriZPUG on behalf of Chris Rossi" <trizpug-bounces+cbc=unc.edu at python.org on behalf of chris at archimedeanco.com> wrote:

    For grins I just did the first exercise on that link I sent, and it was
    kind of fun.  Actually doing the exercise helps significantly with
    understanding why it works.  You'll probably need to at least work up to
    Part 3, though, because you're going to need data structures for the
    parsing you're trying to do.
    
    As an aside, I tend to leave things recursive unless circumstances force my
    hand.  Premature optimization is the root of all evil, machine time is
    cheaper than programmer time, and all of that jazz.  But it is a good
    mental exercise.
    
    Chris
    
    
    On Fri, Sep 15, 2017 at 9:17 AM, David Handy <david at handysoftware.com>
    wrote:
    
    >    This is hilarious. While I was typing and editing my response beginning
    >    with "Since no one else is answering ...", three other people sent
    >    replies!
    >
    >
    >
    >    I like the link Chris Rossi sent about how to turn a recursive algorithm
    >    into an iterative one. My advice I gave using a Python list as a stack
    > is
    >    an example of the "accumulator" that the linked article talks about, for
    >    the case where a more simple accumulator such as an integer isn't enough
    >    information. An example would be if you are writing code to parse an XML
    >    document, the accumulator/stack could be the list of nested tags you
    > have
    >    encountered so far during processing.
    >
    >
    >
    >    David H
    >
    >
    >
    >
    >
    >    On Friday, September 15, 2017 8:57am, "Chris Rossi"
    >    <chris at archimedeanco.com> said:
    >
    >    > _______________________________________________
    >    > TriZPUG mailing list
    >    > TriZPUG at python.org
    >    > https://mail.python.org/mailman/listinfo/trizpug
    >    > http://tripython.org is the Triangle Python Users Group
    >    > http://blog.moertel.com/posts/2013-05-11-recursive-to-iterative.html
    >    >
    >    > On Fri, Sep 15, 2017 at 8:56 AM, Will Spearman <
    > willspearman at gmail.com>
    >    > wrote:
    >    >
    >    > > Sharing it as a github gist is helpful. Try this:
    >    > > [1]https://gist.github.com/willspearman/
    > 97d9ee58b9391b0cdcd09d601017a2
    >    > > f8
    >    > > On Fri, Sep 15, 2017 at 8:48 AM Bob Gailer <[2]bgailer at gmail.com>
    >    > > wrote:
    >    > >
    >    > > ** **On Sep 14, 2017 9:33 AM, "Ken MacKenzie"
    >    > <[1][3]ken at mack-z.com>
    >    > > wrote:
    >    > > ** **>
    >    > > ** **> ** **So I have a bit of code I have created that works to
    >    > solve
    >    > > the
    >    > > ** **problem,
    >    > > ** **> ** **however as an academic exercise, I realized I could not
    >    > > reason of a
    >    > > ** **way to
    >    > > ** **> ** **solve it without recursion.
    >    > >
    >    > > ** **My understanding of computer science says that recursion is
    > never
    >    > > ** **necessary. There is always a way to solve the problem without
    >    > > recursion.
    >    > > ** **> ** **The situation, I have a file with many possible
    >    > delimiters
    >    > > and
    >    > > ** **those
    >    > > ** **> ** **delimiters have levels so to speak so as I am kind of
    >    > > parsing a
    >    > > ** **tree, and
    >    > > ** **> ** **in this case a tree that is lists of lists of lists of x
    >    > > n...
    >    > > ** **> ** **Excuse the lack of real commenting in this sample.** But
    >    > > curious
    >    > > ** **what
    >    > > ** **> ** **other ways I could solve it, or if there are ways I
    >    > could
    >    > > perhaps
    >    > > ** **make it
    >    > > ** **> ** **perform faster.** It performs fast enough but I realize
    >    > > the
    >    > > file
    >    > > ** **could get
    >    > > ** **> ** **quite long.
    >    > > ** **It would be very helpful if you would give us a sample input
    > and
    >    > > the
    >    > > ** **corresponding output. Your description above does not help.
    >    > > ** **Also your code has lots of** in it that I presume are either
    >    > > spaces
    >    > > or
    >    > > ** **tabs.
    >    > > ** **I would appreciate it if you would post the code that is
    > actually
    >    > > ** **executable.
    >    > > ** **My guess is that there's a much simpler way to do this.
    >    > > ** **> ** **=======
    >    > > ** **> ** **parse_1 = ["\n", "\x00", "\x01", "\x02", "\x03", "\x04",
    >    > > "\x05",
    >    > > ** **"\x06",
    >    > > ** **> ** **"\x07"]
    >    > > ** **> ** **parse_2 = [chr(255), chr(254), chr(253), chr(252)]
    >    > > ** **> ** **all_parsers = parse_1 + parse_2
    >    > > ** **> ** **def recursive_parse(in_var, parse_char):
    >    > > ** **> ** **** ** if isinstance(in_var, list):
    >    > > ** **> ** **** ** ** ** return [recursive_parse(x, parse_char) for x
    >    > > in
    >    > > in_var]
    >    > > ** **> ** **** ** elif isinstance(in_var, str):
    >    > > ** **> ** **** ** ** ** return in_var.split(parse_char)
    >    > > ** **> ** **def nested_parse(str_in):
    >    > > ** **> ** **** ** big_list = []
    >    > > ** **> ** **** ** temp = []
    >    > > ** **> ** **** ** parse_list = all_parsers
    >    > > ** **> ** **** ** max_lvl = len(parse_list) - 1
    >    > > ** **> ** **** ** for lvl, psr_c in enumerate(parse_list):
    >    > > ** **> ** **** ** ** ** # print("temp:", temp[0][:20])
    >    > > ** **> ** **** ** ** ** if lvl == 0:
    >    > > ** **> ** **** ** ** ** ** ** temp.append(str_in.split(psr_c))
    >    > > ** **> ** **** ** ** ** else:
    >    > > ** **> ** **** ** ** ** ** ** pre_lvl = lvl - 1
    >    > > ** **> ** **** ** ** ** ** ** new_lvl =
    >    > recursive_parse(temp[pre_lvl],
    >    > > psr_c)
    >    > > ** **> ** **** ** ** ** ** ** temp.append(new_lvl)
    >    > > ** **> ** **** ** big_list = temp[max_lvl]
    >    > > ** **> ** **** ** return big_list
    >    > > ** **>
    >    > > ** **> _______________________________________________
    >    > > ** **> TriZPUG mailing list
    >    > > ** **> [2][4]TriZPUG at python.org
    >    > > ** **> [3][5]https://mail.python.org/mailman/listinfo/trizpug
    >    > > ** **> [4][6]http://tripython.org is the Triangle Python Users
    > Group
    >    > > ** **>
    >    > >
    >    > > References
    >    > >
    >    > > ** **Visible links
    >    > > ** **1. mailto:[7]ken at mack-z.com
    >    > > ** **2. mailto:[8]TriZPUG at python.org
    >    > > ** **3. [9]https://mail.python.org/mailman/listinfo/trizpug
    >    > > ** **4. [10]http://tripython.org/
    >    > > _______________________________________________
    >    > > TriZPUG mailing list
    >    > > [11]TriZPUG at python.org
    >    > > [12]https://mail.python.org/mailman/listinfo/trizpug
    >    > > [13]http://tripython.org is the Triangle Python Users Group
    >    > >
    >    > > References
    >    > >
    >    > > Visible links
    >    > > 1. https://gist.github.com/willspearman/
    > 97d9ee58b9391b0cdcd09d601017a2
    >    > > f8
    >    > > 2. mailto:bgailer at gmail.com
    >    > > 3. mailto:ken at mack-z.com
    >    > > 4. mailto:TriZPUG at python.org
    >    > > 5. https://mail.python.org/mailman/listinfo/trizpug
    >    > > 6. http://tripython.org/
    >    > > 7. mailto:ken at mack-z.com
    >    > > 8. mailto:TriZPUG at python.org
    >    > > 9. https://mail.python.org/mailman/listinfo/trizpug
    >    > > 10. http://tripython.org/
    >    > > 11. mailto:TriZPUG at python.org
    >    > > 12. https://mail.python.org/mailman/listinfo/trizpug
    >    > > 13. http://tripython.org/
    >    > >
    >    > > _______________________________________________
    >    > > TriZPUG mailing list
    >    > > TriZPUG at python.org
    >    > > https://mail.python.org/mailman/listinfo/trizpug
    >    > > http://tripython.org is the Triangle Python Users Group
    >    > >
    >    > >
    >    > [1]http://blog.moertel.com/posts/2013-05-11-recursive-to-
    > iterative.html
    >    > On Fri, Sep 15, 2017 at 8:56 AM, Will Spearman
    >    > <[2]willspearman at gmail.com>
    >    > wrote:
    >    >
    >    > ** **Sharing it as a github gist is helpful. Try this:
    >    > **
    >    >
    >    **[1][3]https://gist.github.com/willspearman/
    > 97d9ee58b9391b0cdcd09d601017a2f8
    >    > ** **On Fri, Sep 15, 2017 at 8:48 AM Bob Gailer
    >    > <[2][4]bgailer at gmail.com> wrote:
    >    >
    >    > ** ** **** **On Sep 14, 2017 9:33 AM, "Ken MacKenzie"
    >    > <[1][3][5]ken at mack-z.com>
    >    > ** ** **wrote:
    >    > ** ** **** **>
    >    > ** ** **** **> ** **So I have a bit of code I have created that works
    > to
    >    > solve
    >    > ** ** **the
    >    > ** ** **** **problem,
    >    > ** ** **** **> ** **however as an academic exercise, I realized I
    > could
    >    > not
    >    > ** ** **reason of a
    >    > ** ** **** **way to
    >    > ** ** **** **> ** **solve it without recursion.
    >    >
    >    > ** ** **** **My understanding of computer science says that recursion
    > is
    >    > never
    >    > ** ** **** **necessary. There is always a way to solve the problem
    >    > without
    >    > ** ** **recursion.
    >    > ** ** **** **> ** **The situation, I have a file with many possible
    >    > delimiters
    >    > ** ** **and
    >    > ** ** **** **those
    >    > ** ** **** **> ** **delimiters have levels so to speak so as I am kind
    >    > of
    >    > ** ** **parsing a
    >    > ** ** **** **tree, and
    >    > ** ** **** **> ** **in this case a tree that is lists of lists of
    > lists
    >    > of x
    >    > ** ** **n...
    >    > ** ** **** **> ** **Excuse the lack of real commenting in this
    > sample.**
    >    > But
    >    > ** ** **curious
    >    > ** ** **** **what
    >    > ** ** **** **> ** **other ways I could solve it, or if there are ways
    > I
    >    > could
    >    > ** ** **perhaps
    >    > ** ** **** **make it
    >    > ** ** **** **> ** **perform faster.** It performs fast enough but I
    >    > realize the
    >    > ** ** **file
    >    > ** ** **** **could get
    >    > ** ** **** **> ** **quite long.
    >    > ** ** **** **It would be very helpful if you would give us a sample
    >    > input and
    >    > ** ** **the
    >    > ** ** **** **corresponding output. Your description above does not
    > help.
    >    > ** ** **** **Also your code has lots of** in it that I presume are
    >    > either spaces
    >    > ** ** **or
    >    > ** ** **** **tabs.
    >    > ** ** **** **I would appreciate it if you would post the code that is
    >    > actually
    >    > ** ** **** **executable.
    >    > ** ** **** **My guess is that there's a much simpler way to do this.
    >    > ** ** **** **> ** **=======
    >    > ** ** **** **> ** **parse_1 = ["\n", "\x00", "\x01", "\x02", "\x03",
    >    > "\x04",
    >    > ** ** **"\x05",
    >    > ** ** **** **"\x06",
    >    > ** ** **** **> ** **"\x07"]
    >    > ** ** **** **> ** **parse_2 = [chr(255), chr(254), chr(253), chr(252)]
    >    > ** ** **** **> ** **all_parsers = parse_1 + parse_2
    >    > ** ** **** **> ** **def recursive_parse(in_var, parse_char):
    >    > ** ** **** **> ** **** ** if isinstance(in_var, list):
    >    > ** ** **** **> ** **** ** ** ** return [recursive_parse(x, parse_char)
    >    > for x in
    >    > ** ** **in_var]
    >    > ** ** **** **> ** **** ** elif isinstance(in_var, str):
    >    > ** ** **** **> ** **** ** ** ** return in_var.split(parse_char)
    >    > ** ** **** **> ** **def nested_parse(str_in):
    >    > ** ** **** **> ** **** ** big_list = []
    >    > ** ** **** **> ** **** ** temp = []
    >    > ** ** **** **> ** **** ** parse_list = all_parsers
    >    > ** ** **** **> ** **** ** max_lvl = len(parse_list) - 1
    >    > ** ** **** **> ** **** ** for lvl, psr_c in enumerate(parse_list):
    >    > ** ** **** **> ** **** ** ** ** # print("temp:", temp[0][:20])
    >    > ** ** **** **> ** **** ** ** ** if lvl == 0:
    >    > ** ** **** **> ** **** ** ** ** ** ** temp.append(str_in.split(psr_
    > c))
    >    > ** ** **** **> ** **** ** ** ** else:
    >    > ** ** **** **> ** **** ** ** ** ** ** pre_lvl = lvl - 1
    >    > ** ** **** **> ** **** ** ** ** ** ** new_lvl =
    >    > recursive_parse(temp[pre_lvl],
    >    > ** ** **psr_c)
    >    > ** ** **** **> ** **** ** ** ** ** ** temp.append(new_lvl)
    >    > ** ** **** **> ** **** ** big_list = temp[max_lvl]
    >    > ** ** **** **> ** **** ** return big_list
    >    > ** ** **** **>
    >    > ** ** **** **> _______________________________________________
    >    > ** ** **** **> TriZPUG mailing list
    >    > ** ** **** **> [2][4][6]TriZPUG at python.org
    >    > ** ** **** **> [3][5][7]https://mail.python.
    > org/mailman/listinfo/trizpug
    >    > ** ** **** **> [4][6][8]http://tripython.org is the Triangle Python
    >    > Users Group
    >    > ** ** **** **>
    >    >
    >    > ** ** **References
    >    >
    >    > ** ** **** **Visible links
    >    > ** ** **** **1. mailto:[7][9]ken at mack-z.com
    >    > ** ** **** **2. mailto:[8][10]TriZPUG at python.org
    >    > ** ** **** **3. [9][11]https://mail.python.
    > org/mailman/listinfo/trizpug
    >    > ** ** **** **4. [10][12]http://tripython.org/
    >    > ** ** **_______________________________________________
    >    > ** ** **TriZPUG mailing list
    >    > ** ** **[11][13]TriZPUG at python.org
    >    > ** ** **[12][14]https://mail.python.org/mailman/listinfo/trizpug
    >    > ** ** **[13][15]http://tripython.org is the Triangle Python Users
    > Group
    >    >
    >    > References
    >    >
    >    > ** **Visible links
    >    > ** **1.
    >    >
    >    [16]https://gist.github.com/willspearman/97d9ee58b9391b0cdcd09d601017a2
    > f8
    >    > ** **2. mailto:[17]bgailer at gmail.com
    >    > ** **3. mailto:[18]ken at mack-z.com
    >    > ** **4. mailto:[19]TriZPUG at python.org
    >    > ** **5. [20]https://mail.python.org/mailman/listinfo/trizpug
    >    > ** **6. [21]http://tripython.org/
    >    > ** **7. mailto:[22]ken at mack-z.com
    >    > ** **8. mailto:[23]TriZPUG at python.org
    >    > ** **9. [24]https://mail.python.org/mailman/listinfo/trizpug
    >    > ** 10. [25]http://tripython.org/
    >    > ** 11. mailto:[26]TriZPUG at python.org
    >    > ** 12. [27]https://mail.python.org/mailman/listinfo/trizpug
    >    > ** 13. [28]http://tripython.org/
    >    >
    >    > _______________________________________________
    >    > TriZPUG mailing list
    >    > [29]TriZPUG at python.org
    >    > [30]https://mail.python.org/mailman/listinfo/trizpug
    >    > [31]http://tripython.org is the Triangle Python Users Group
    >    >
    >    > References
    >    >
    >    > Visible links
    >    > 1. http://blog.moertel.com/posts/2013-05-11-recursive-to-
    > iterative.html
    >    > 2. mailto:willspearman at gmail.com
    >    > 3. https://gist.github.com/willspearman/
    > 97d9ee58b9391b0cdcd09d601017a2f8
    >    > 4. mailto:bgailer at gmail.com
    >    > 5. mailto:ken at mack-z.com
    >    > 6. mailto:TriZPUG at python.org
    >    > 7. https://mail.python.org/mailman/listinfo/trizpug
    >    > 8. http://tripython.org/
    >    > 9. mailto:ken at mack-z.com
    >    > 10. mailto:TriZPUG at python.org
    >    > 11. https://mail.python.org/mailman/listinfo/trizpug
    >    > 12. http://tripython.org/
    >    > 13. mailto:TriZPUG at python.org
    >    > 14. https://mail.python.org/mailman/listinfo/trizpug
    >    > 15. http://tripython.org/
    >    > 16.
    >    https://gist.github.com/willspearman/97d9ee58b9391b0cdcd09d601017a2f8
    >    > 17. mailto:bgailer at gmail.com
    >    > 18. mailto:ken at mack-z.com
    >    > 19. mailto:TriZPUG at python.org
    >    > 20. https://mail.python.org/mailman/listinfo/trizpug
    >    > 21. http://tripython.org/
    >    > 22. mailto:ken at mack-z.com
    >    > 23. mailto:TriZPUG at python.org
    >    > 24. https://mail.python.org/mailman/listinfo/trizpug
    >    > 25. http://tripython.org/
    >    > 26. mailto:TriZPUG at python.org
    >    > 27. https://mail.python.org/mailman/listinfo/trizpug
    >    > 28. http://tripython.org/
    >    > 29. mailto:TriZPUG at python.org
    >    > 30. https://mail.python.org/mailman/listinfo/trizpug
    >    > 31. http://tripython.org/
    >    >
    >
    > _______________________________________________
    > TriZPUG mailing list
    > TriZPUG at python.org
    > https://mail.python.org/mailman/listinfo/trizpug
    > http://tripython.org is the Triangle Python Users Group
    >
    >
    



More information about the TriZPUG mailing list