ANN: Dogelog Runtime, Prolog to the Moon (2021)

Mostowski Collapse janburse at fastmail.fm
Mon Sep 20 08:21:54 EDT 2021


In general the algorithm I am using is from
a paper by Matts Carlson from SICStus Prolog.
Its this paper here:

Garbage Collection for Prolog Based on WAM
January 1986
Karen Appleby, Mats Carlsson, Seif Haridi, Dan Sahlin
https://www.researchgate.net/publication/279463524

But since you guys are so facinated with
the Prolog garbage collection aspect, which
is not the locus where you can do a lot

of optimizations, feel free to investigate
and come up with a solution. It will not
change the performance of Dogelog runtime,

but it could be an academic experiment
neverthless. There is nothing wrong with the
simgle linked list as it stands, since

it has O(n) sweep_trail(). It uses a litte
more storage than an array would do.

Mostowski Collapse wrote:
> What would be maybe possible, is to
> scan the trail from the other side, and
> use a first pass to determine the new
> 
> size, and use a second pass to fill a new
> array with the remaining elments. This would
> be two times O(n), so it would be linear
> 
> and not quadratic O(n^2) as when you scan
> from the top and poke holes. But then something
> else doesn't work so easily. Namely:
> 
>     def adjust_mark(temp):
>         while temp is not None:
>          if (temp.flags & MASK_VAR_MARK) != 0:
>              return temp
>          else:
>              temp = temp.tail
>      return temp
> 
> https://github.com/jburse/dogelog-moon/blob/main/devel/runtimepy/drawer/machine.py#L151 
> 
> 
> This is needed to adjust the choice points.
> If you have an array instead of a linked
> listed as I have now, you would need
> 
> to adjust array indexes instead pointers
> into linked list elements. I havent come
> up with an array solution yet for the trail,
> 
> since I dont see any utility in investing too
> much time with the Prolog garbage collection of
> Dogelog runtime. It is only a small share
> 
> of the execution time:
> 
> Mostowski Collapse schrieb am Freitag, 17. September 2021 um 10:58:57 
> UTC+2:
>  > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>  > % Standard Python Version, Warm Run
>  > % ?- time(fibo(23,X)).
>  > % % Wall 3865 ms, gc 94 ms, 71991 lips
>  > % X = 46368.
>  >
>  > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>  > % GraalVM Python Version, Warm Warm Run
>  > % ?- time(fibo(23,X)).
>  > % % Wall 695 ms, gc 14 ms, 400356 lips
>  > % X = 46368.
> 
> Mostowski Collapse wrote:
>> I read the following, and you should also know:
>>
>>> Python's [] is implemented as an array, not a linked list.
>>> Although resizing is O(n), appending to it is amortized O(1),
>>> because resizes happen very rarely.
>> https://stackoverflow.com/a/5932364/502187
>>
>> The list type doesn't have an O(1) operation to remove
>> an element during sweep. The list type, not like its name
>> would suggest, in Python is an array.
>>
>> These arrays are not so expensive when you append()
>> an element. Because they are allocated with some excess
>> capacity. And they grow exponentially.
>>
>> So amortisized you can append() a lot of elements to
>> a Python list, which is an array. But you cannot poke so
>> cheaply holes into it. So if you have this scenario:
>>
>> Before:
>>       - [ A1, .., An , B, C1, .., Cm ]
>>
>> After:
>>       - [ A1, .., An , C1, .., Cm ]
>> You have to copy C1,..,Cm one position down. On the other
>> hand, when scanning the single list, removing the
>> element is just pointer swizzling.
>>
>> The code is here, the positive if-then-else branch keeps
>> the element, the negative if-then-else branch drops the
>> element. Thats quite standard algorithm for linked lists:
>>
>>       /* pointer swizzling */
>>      while temp is not None:
>>          term = temp
>>          temp = term.tail
>>          if (term.flags & MASK_VAR_MARK) != 0:
>>              term.flags &= ~MASK_VAR_MARK
>>              if back is not None:
>>                  back.tail = term
>>              else:
>>                  trail = term
>>              back = term
>>          else:
>>              term.instantiated = NotImplemented
>>              term.tail = None
>>              count -= 1
>>
>> https://github.com/jburse/dogelog-moon/blob/main/devel/runtimepy/drawer/machine.py#L163 
>>
>>
>> There is nothing wrong with implementing a single list
>> in Python. Only you never saw that one maybe. If you would
>> indeed use Python lists which are arrays, you would
>>
>> maybe get a much slower sweep_trail() and this would
>> be seen. But currently its not seen. It happens that 1000000
>> of elements are sweeped, if you would do this with copy
>>
>> inside a Python list which are arrays, it would get much
>> more expensive, and the extremly cheap Prolog garbage
>> collection, as it stands now, wouldn't be that cheap anymore.
>>
>> You can try yourself. My sweep_trail() needs frequent resize,
>> which would be O(n) each, so that sweep_trail becomes O(n^2).
>> Which the current implementation its only O(n).
>>
>> Peter J. Holzer schrieb am Montag, 20. September 2021 um 13:49:49 UTC+2:
>>> On 2021-09-20 04:33:39 +1000, Chris Angelico wrote:
>>>> On Mon, Sep 20, 2021 at 3:19 AM Mostowski Collapse 
>>>> <janb... at fastmail.fm> wrote:
>>>>> Question to Chris Angelico: If I stay with my
>>>>> sweep_trail(), which is the periodically scanning,
>>>>> I can use a single linked list.
>>>>>
>>>>> On the other hand if I would use the trigger
>>>>> from Python, I possibly would need a double linked
>>>>> list, to remove an element.
>>>>>
>>>>> Chris Angelico, is there a third option, that I have
>>>>> overlooked? Single linked list uses less space
>>>>> than double linked list, this why I go with scan.
>>>>>
>>>>
>>>> I don't know. I don't understand your code well enough to offer advice
>>>> like that, because *your code is too complicated* and not nearly clear
>>>> enough.
>>>>
>>>> But however it is that you're doing things, the best way is almost
>>>> always to directly refer to objects. Don't fiddle around with creating
>>>> your own concept of a doubly-linked list and a set of objects; just
>>>> refer directly to the objects.
>>> And almost certainly: Just use the builtin list type if you need a list.
>>> Don't build one yourself.
>>>> Let Python be Python, don't try to build your own language on top of
>>>> it.
>>> Well, he's writing a Prolog interpreter, so building his own language on
>>> top of Python is sort of the point. I think a better way to put it is
>>> "Don't try to write Python as if it was C". A C operation may be
>>> compiled to a single machine instruction which is executed in a fraction
>>> of a nanosecond. A Python instruction (in CPython) always includes at
>>> least the interpreter overhead and often several method lookups and 
>>> method
>>> calls. You want to amortize that overhead over as much work as possible.
>>>
>>> hp
>>>
>>> -- 
>>> _ | Peter J. Holzer | Story must make more sense than reality.
>>> |_|_) | |
>>> | | | h... at hjp.at | -- Charles Stross, "Creative writing
>>> __/ | http://www.hjp.at/ | challenge!"
> 



More information about the Python-list mailing list