[Cython] Type Inference: Inter Procedural Analysis

Mon Jan 8 01:21:34 EST 2018

Hey!

Thank you for the reply.

Interesting. Could you elaborate on what you found missing or badly
designed? Would be interesting to know for us.

I'll elaborate a bit on what I had in mind while implementing inter
procedural analysis, and what I found to be a bit inflexible. Also, I am
attaching a progress report I wrote as part of the coursework, which
elaborates my overall strategy in a bit more depth.

Right now, I'm storing incoming and outgoing callsites of a function at the
function scope level. I think it makes more sense to store this information
at the AST node, however, that results in some limitations later on during
the Type Inference stage, as only information about the scope is passed to
the Inference System. Furthermore, I think I needed to add a transform
between the infer_types and the analyse_expressions stage, which are done
in a single Transform, but I think that's because my way of doing things
was hackish, and could be done in a lot better way.

After storing the callsites information at the scope level, I made a
separate Inter Procedural Inferer, that handles recursive functions
separately from non recursive functions. The non recursive case is handled
by traversing the call graph to the first function with no incoming nodes
(I have still not handled cycles in the graph, except recursion), and
starting type inference from there, and traversing down the graph until all
the descendants of this ancestor function have been inferred. The recursive
case is handled a bit separately, where I take all the return statements in
the function before the first recursive call, infer their type if it's
consistent, and then re-run type inference on the whole function. If the
result is consistent, i.e. all the return nodes have the same type, then
the type is inferred. Otherwise, the code falls back to PyObjects.

My current code breaks the compiler in certain cases, and I'm working on
fixing that. Furthermore, I think I'll need to mark return statements in
the FlowControl stage too in addition to the assignment statements, as they
can contain expressions too which I'll need to infer.

My overall plan had been to change as little of the core code as possible
to get a solution up and running, so as not to break anything. I think I
have commented out about 4,5 lines in the original repo only.

1) The functionality looks really nice. Since you weren't accustomed with
the code base before, it's understandable that things aren't perfectly
integrated with the existing architecture. That can be cleaned up.

2) I was surprised to see that you didn't git-clone the existing repository
but created a new one from a source copy instead. But that's probably ok
for getting started because (I think) you wrote the code experimentally and
didn't focus that much on ready-to-merge commits anyway. Also, you
accidentally added .pyc and .so files. Those shouldn't be under version
control. It would probably be best to start over from a fresh clone and
apply your changes as patch.

3) The commits are a bit difficult to follow because the commit comments
are essentially free of information. It would help if you had used them to
explain the steps you took and what your intentions were.

That's the repository I was working on. However, the code base is pretty
hacky now, and the commits aren't really consistent with their
descriptions. I was just developing experimentally, as I was not really
familiar with the code base. I'll fork the repo, and make some commits with
comments, and clean up the code. Once I've done that, and my code is a bit
more understandable, I'll

4) Is there a reason why you didn't merge the Graph building with the
control flow analysis in FlowControl.py?

My overall strategy was to change/edit as little of the core code as
possible. I'll merge it in FlowControl in the updated commits

5) I can't see any test code, but since you implemented this in multiple
iterations, I'm sure that you had test code on your side that you tried to
compile. Could you add some examples that show how this change improves
things? There are hints on writing tests in the hacker guide:

I'll add some of the test files I used locally in the tests in the new
commits.

Lastly, the Type Inference System I implemented is pretty simple, and might
have limitations that I'm not aware of. However, I would love to work on
this further, to fix/improve on the whole system. I do not know about
Incremental Type Inference, but if that's the way to go and my above
strategy is overly simplistic, I'll be happy to work on that too.

Usama

On Tue, Jan 2, 2018 at 8:23 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:

> Hi!
>
> Thanks for working on this!
>
> usama hameed schrieb am 29.12.2017 um 19:31:
> > I recently suggested implementing Inter-Procedural Analysis to infer
> > function types and made the following Github issue
> > <https://github.com/cython/cython/issues/1893>, and I was advised to
> > communicate on this channel.
> >
> > I went through the code base, and have implemented a rudimentary type
> > inference system with inter procedural analysis of function types and
> > arguments, and have handled recursive cases. However, the code base needs
> > to be cleaned up a lot and is quite buggy right now. Furthermore, I am
> > pretty sure a lot of edge cases need to be handled, i.e. closures etc.
>
> I guess you are referring to this repository:
>
> https://github.com/usama54321/Cython/commits/master
>
>
> > The reason I am sending out this email is to get some suggestions. Right
> > now, the code I have written is pretty hacky, since the current code base
> > of the project does not accommodate much flexibility to perform inter
> > procedural analysis.
>
> Interesting. Could you elaborate on what you found missing or badly
> designed? Would be interesting to know for us.
>
> Here are a couple of comments on your changes:
>
> 1) The functionality looks really nice. Since you weren't accustomed with
> the code base before, it's understandable that things aren't perfectly
> integrated with the existing architecture. That can be cleaned up.
>
> 2) I was surprised to see that you didn't git-clone the existing repository
> but created a new one from a source copy instead. But that's probably ok
> for getting started because (I think) you wrote the code experimentally and
> didn't focus that much on ready-to-merge commits anyway. Also, you
> accidentally added .pyc and .so files. Those shouldn't be under version
> control. It would probably be best to start over from a fresh clone and
> apply your changes as patch.
>
> 3) The commits are a bit difficult to follow because the commit comments
> are essentially free of information. It would help if you had used them to
> explain the steps you took and what your intentions were.
>
> 4) Is there a reason why you didn't merge the Graph building with the
> control flow analysis in FlowControl.py?
>
> 5) I can't see any test code, but since you implemented this in multiple
> iterations, I'm sure that you had test code on your side that you tried to
> compile. Could you add some examples that show how this change improves
> things? There are hints on writing tests in the hacker guide:
>
> https://github.com/cython/cython/wiki/HackerGuide#getting-started
>
> Specifically, look at the "*infer*.pyx" file tests in tests/run/. I think
> it would be best to add a new one.
>
>
> > I found an enhancement suggestion
> > <https://github.com/cython/cython/wiki/enhancements-typeinference> on
> the
> > GitHub project, and was wondering whether this should be done first in
> > order to make a more flexible type inference system before trying to
> > properly implement inter procedural analysis into the project.
>
> Type inference was implemented long ago and has been improved a couple of
> times since then. It's not perfect, but it's actually quite good and can
> further be improved in gradual steps. Inter-procedural analysis seems like
> one such improvement.
>
>
> > I just started on this as part of a university course project, but I want
> > to continue working on this. I am not really familiar with the project's
> > development ecosystem, and it would be really helpful if I'm given some
> > guidance.
>
> It would certainly be great to have this feature added. Could you explain
> some of your design decisions? That would help me understand what you did
> and why, so that I can start giving advice on where to go from here.
>
> Generally speaking, I think it would be good if we could make it reuse more
> of the existing infrastructure for type inference and control flow
> analysis. I would only want to diverge from those if you could convince me
> that this feature is fundamentally independent from what's there, but that
> would surprise me. Correct me if I'm wrong, but what I would expect is
> basically an incremental type inference that (partially) re-infers
> functions when their dependencies (i.e. the functions that they call)
> change their return type. Was the incremental part here something that you
> found difficult?
>
> Stefan
> _______________________________________________
> cython-devel mailing list
> cython-devel at python.org
> https://mail.python.org/mailman/listinfo/cython-devel
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20180108/faa9c80d/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: report2.odt
Type: application/vnd.oasis.opendocument.text
Size: 42030 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/cython-devel/attachments/20180108/faa9c80d/attachment-0001.odt>