[Tutor] need help generating table of contents

Albert-Jan Roskam sjeik_appie at hotmail.com
Mon Aug 27 05:12:23 EDT 2018


From: Tutor <tutor-bounces+sjeik_appie=hotmail.com at python.org> on behalf of Peter Otten <__peter__ at web.de>
Sent: Friday, August 24, 2018 3:55 PM
To: tutor at python.org
<snip>
> The following reshuffle of your code seems to work:
> 
> print('\r\n** Table of contents\r\n')
> pattern = r'/Title \((.+?)\).+?/Page ([0-9]+)(?:\s+/Count ([0-9]+))?'
> 
> def process(triples, limit=None, indent=0):
>     for index, (title, page, count) in enumerate(triples, 1):
>         title = indent * 4 * ' ' + title
>         print(title.ljust(79, ".") + page.zfill(2))
>         if count:
>             process(triples, limit=int(count), indent=indent+1)
>         if limit is not None and limit == index:
>             break
> 
> process(iter(re.findall(pattern, toc, re.DOTALL)))

Hi Peter, Cameron,

Thanks for your replies! The code above indeed works as intended, but I don't really understand *why*.
I would like to give a name to the condition "if limit is not None and limit == index". What would be the most descriptive name? I often use "is_*" names for boolean variables. Would "is_deepest_nesting_level" be a good name?

Also, I don't understand why iter() is required here, and why finditer() is not an alternative.
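
To make the two questions above concrete, here is a minimal sketch of what seems to be going on. It assumes a made-up bookmark text (the real file is not shown in this thread), collects the rows in a list instead of printing them, and uses the hypothetical name "is_last_child" for the condition, on the reading that "limit" is the parent's /Count and "index" counts the siblings consumed so far. The point it tries to show: all recursive calls must draw from one shared iterator, so a plain list (which restarts from the beginning in every for-loop) does not work, while finditer() should be usable if each match is converted with .groups().

```python
import re

# Hypothetical bookmark text; the layout of the real file is an assumption.
toc = """\
[/Title (Chapter 1) /Page 1 /Count 2 /OUT pdfmark
[/Title (Section 1.1) /Page 2 /OUT pdfmark
[/Title (Section 1.2) /Page 5 /OUT pdfmark
[/Title (Chapter 2) /Page 9 /OUT pdfmark
"""
pattern = r'/Title \((.+?)\).+?/Page ([0-9]+)(?:\s+/Count ([0-9]+))?'


def process(triples, limit=None, indent=0, out=None):
    """Collect (indent, title) pairs instead of printing them."""
    out = [] if out is None else out
    for index, (title, page, count) in enumerate(triples, 1):
        out.append((indent, title))
        if count:
            # The children are the next int(count) items of the SAME stream.
            process(triples, limit=int(count), indent=indent + 1, out=out)
        # Hypothetical name: true once all children of the parent are consumed.
        is_last_child = limit is not None and index == limit
        if is_last_child:
            break
    return out


matches = re.findall(pattern, toc, re.DOTALL)  # a list of 3-tuples

# 1. Shared iterator: every nested for-loop continues where the caller stopped.
shared = process(iter(matches))

# 2. Plain list: each nested for-loop starts again at "Chapter 1",
#    which has a count, so the function recurses until Python gives up.
try:
    process(matches)
    list_recursed_forever = False
except RecursionError:
    list_recursed_forever = True

# 3. finditer() already returns an iterator, but it yields match objects,
#    so unpack them with .groups(); a missing count is then None, not ''.
from_finditer = process(m.groups() for m in re.finditer(pattern, toc, re.DOTALL))

for indent, title in shared:
    print(indent * 4 * ' ' + title)
```

If this reading is right, iter() is not needed for its own sake; what matters is that the argument is an iterator rather than a re-iterable sequence, and both iter(list) and a generator over finditer() satisfy that.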

I wrote the bookmarks file myself, and the code above is part of a shell script that compiles a large PDF using OpenOffice command-line calls, Ghostscript, git, pdftk and Python. Since I only need to edit that one file, the human-readable TOC and the PDF bookmarks will always be consistent.

Thanks again!

Albert-Jan

