ANN: Dogelog Runtime, Prolog to the Moon (2021)

Chris Angelico rosuav at gmail.com
Mon Sep 20 14:09:22 EDT 2021


On Tue, Sep 21, 2021 at 3:58 AM Mostowski Collapse <bursejan at gmail.com> wrote:
>
> I read the following, and you should also know:
>
> > Python's [] is implemented as an array, not a linked list.
> > Although resizing is O(n), appending to it is amortized O(1),
> > because resizes happen very rarely.
> https://stackoverflow.com/a/5932364/502187
>
> The list type doesn't have an O(1) operation to remove
> an element during sweep. The list type, not like its name
> would suggest, in Python is an array.
>
> These arrays are not so expensive when you append()
> an element. Because they are allocated with some excess
> capacity. And they grow exponentially.
>
> So amortisized you can append() a lot of elements to
> a Python list, which is an array. But you cannot poke so
> cheaply holes into it. So if you have this scenario:
>
> Before:
>      - [ A1, .., An , B, C1, .., Cm ]
>
> After:
>      - [ A1, .., An , C1, .., Cm ]
>
> You have to copy C1,..,Cm one position down. On the other
> hand, when scanning the single list, removing the
> element is just pointer swizzling.
>
> The code is here, the positive if-then-else branch keeps
> the element, the negative if-then-else branch drops the
> element. Thats quite standard algorithm for linked lists:
>
>      /* pointer swizzling */
>     while temp is not None:
>         term = temp
>         temp = term.tail
>         if (term.flags & MASK_VAR_MARK) != 0:
>             term.flags &= ~MASK_VAR_MARK
>             if back is not None:
>                 back.tail = term
>             else:
>                 trail = term
>             back = term
>         else:
>             term.instantiated = NotImplemented
>             term.tail = None
>             count -= 1
>
> https://github.com/jburse/dogelog-moon/blob/main/devel/runtimepy/drawer/machine.py#L163
>
> There is nothing wrong with implementing a single list
> in Python. Only you never saw that one maybe. If you would
> indeed use Python lists which are arrays, you would
> maybe get a much slower sweep_trail() and this would
> be seen. But currently its not seen. It happens that 1000000
> of elements are sweeped, if you would do this with copy
> inside a Python list which are arrays, it would get much
> more expensive, and the extremly cheap Prolog garbage
> collection, as it stands now, wouldn't be that cheap anymore.
>
> You can try yourself. My sweep_trail() needs frequent resize,
> which would be O(n) each, so that sweep_trail becomes O(n^2).
> Which the current implementation its only O(n).
>

How about, instead: Use a Python list, and instead of removing
individual items one by one, filter out the ones you don't want, using
a list comprehension? That would be O(n) to completely remove all the
ones you don't want, instead of O(n) for each individual removal.

Also, have you actually benchmarked a version that uses Python's
lists, or are you simply assuming that the removals will be slow?
Implementing your own singly-linked list was clearly suboptimal, but
have you tried writing simpler code and seeing if it is also faster?

ChrisA


More information about the Python-list mailing list