[Numpy-discussion] ENH: Proposal to add atleast_nd function

Oscar Benjamin oscar.j.benjamin at gmail.com
Thu Feb 18 09:39:16 EST 2021


On Thu, 18 Feb 2021 at 10:11, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
>
>
> On Wed, Feb 17, 2021 at 9:26 PM Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
>>
>> On Wed, 17 Feb 2021 at 10:36, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>> >
>> > On Wed, Feb 17, 2021 at 12:26 AM Stefan van der Walt <stefanv at berkeley.edu> wrote:
>> >>
>> >> Ralf has been working towards this idea, but having a well-organised namespace of utility functions outside of the core NumPy API would be helpful in allowing expansion and experimentation, without making the current situation worse (where we effectively have to support things forever).  As an example, take Cartesian product [0] and array combinations [1], which have been requested several times on StackOverflow, but there's nowhere to put them.
>> >
>> > This is a good point. If we could put it in `numpy.lib` without it bleeding into the main namespace, saying yes here would be easier. Maybe we can give it a conditional yes based on that namespace reorganization?
>>
>> As an aside is this numpy.lib idea explained anywhere?
>
>
> It isn't, but it's relatively straightforward and can be done without thinking about the issues around our other namespaces. Basically:
> - today `numpy.lib` is a public but fairly useless namespace, because its contents get star-imported to the main namespace; only subsubmodules like `numpy.lib.stride_tricks` are separate

Okay, that's a bit different from what I was thinking of for sympy.
The problem for sympy is that everything is either in the top-level
sympy namespace or is just directly imported from the module where it
is defined. That means that there is no proper separation between
public and private apart from being in the top-level namespace which
is already bloated on the one hand and incomplete on the other since
we obviously can't put *everything* there.

Even something as simple as deleting a no longer needed internal
function or renaming an "internal" module is potentially problematic
in sympy. I was thinking about having a sympy.public module (and
submodules) and documenting that as the expected public interface for
importing *anything* from sympy. Potentially that could be called
sympy.lib which would seem consistent with numpy although having the
same name could be problematic if the intent is not necessarily the
same.
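
To make the idea concrete, here is a minimal sketch of such a public
facade. All names in it (mypkg, mypkg._core, mypkg.public, solve,
_helper) are hypothetical and only for illustration, and the package is
built in memory so that the snippet is self-contained. It is not how
sympy is actually laid out.

```python
import sys
import types

# Hypothetical layout, built in memory instead of on disk:
#   mypkg/_core.py   -> implementation module, free to change
#   mypkg/public.py  -> the only documented place to import from
core = types.ModuleType('mypkg._core')
exec(
    "def solve(x):\n"
    "    return _helper(x) + 1\n"
    "\n"
    "def _helper(x):\n"
    "    return 2 * x\n",
    core.__dict__,
)

public = types.ModuleType('mypkg.public')
public.solve = core.solve      # explicit re-export of one name
public.__all__ = ['solve']     # the documented public interface

pkg = types.ModuleType('mypkg')
pkg.__path__ = []              # mark the top-level module as a package
pkg._core = core
pkg.public = public
sys.modules.update({
    'mypkg': pkg,
    'mypkg._core': core,
    'mypkg.public': public,
})

# A star-import from the facade only sees what __all__ lists.
namespace = {}
exec('from mypkg.public import *', namespace)
exported = sorted(n for n in namespace if not n.startswith('__'))
print(exported)
```

With this arrangement _helper could be renamed or deleted, or _core
reorganised, without touching anything that users are told to import.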

> - we want to stop this star-importing, which required some tedious work of fixing up how we handle __all__ dicts in addition to making exports explicit
> - then, we would like to use `numpy.lib` as a namespace for utilities and assorted functionality that people seem to want, but does not meet the bar for the main namespace and doesn't fit in our other decent namespace (fft, linalg, random, polynomial, f2py, distutils).
> - TBD if there should be subsubmodules under `numpy.lib` or not
> - it should be explicitly documented that this is a "lower bar namespace" and that we discourage other array/tensor libraries from copying its API
>
> We had a good discussion about this in the community meeting yesterday. Sebastian volunteered to sort out the star-import issue.

I already removed all the star-imports from sympy which was somewhat tedious.

Sebastian, you might be interested in the script I wrote below. It
extracts all of the star-imported names from a module and formats the
__all__ and import lines for the __init__.py file. I used it to create
e.g. this:
https://github.com/sympy/sympy/blob/master/sympy/__init__.py#L51-L491
I think that flake8 spots if the import list and the __all__ get out
of sync so it's not so hard to maintain later on.
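
The kind of consistency check meant here can be sketched with the
standard ast module. This is only an illustration of the idea, not how
flake8 implements it; the function name check_all_in_sync and the
example source are made up. It assumes __all__ is assigned a plain list
literal at module level and that there are no remaining star-imports.

```python
import ast

def check_all_in_sync(source):
    """Compare __all__ with the names imported in an __init__.py source.

    Returns (missing_from_all, missing_imports) as two sorted lists.
    """
    tree = ast.parse(source)
    imported = set()
    declared = set()
    for node in tree.body:
        if isinstance(node, ast.ImportFrom):
            # Record the name each import binds in the module namespace.
            for alias in node.names:
                imported.add(alias.asname or alias.name)
        elif isinstance(node, ast.Assign):
            targets = [t.id for t in node.targets if isinstance(t, ast.Name)]
            if '__all__' in targets:
                # Assumes a literal list or tuple of strings.
                declared.update(ast.literal_eval(node.value))
    return sorted(imported - declared), sorted(declared - imported)

example = '''
__all__ = ['real', 'imag', 'mgrid']
from .type_check import real, imag, isreal
from .index_tricks import ogrid
'''

missing_from_all, missing_imports = check_all_in_sync(example)
print(missing_from_all)   # imported but not listed in __all__
print(missing_imports)    # listed in __all__ but never imported
```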

You just tell the script which package the __init__.py belongs to and
which submodules to import, like:

$ my/fmt_imports.py numpy.lib type_check index_tricks
__all__ = [
    'iscomplexobj', 'isrealobj', 'imag', 'iscomplex', 'isreal', 'nan_to_num',
    'real', 'real_if_close', 'typename', 'asfarray', 'mintypecode',
    'asscalar', 'common_type',

    'ravel_multi_index', 'unravel_index', 'mgrid', 'ogrid', 'r_', 'c_', 's_',
    'index_exp', 'ix_', 'ndenumerate', 'ndindex', 'fill_diagonal',
    'diag_indices', 'diag_indices_from',
]
from .type_check import (iscomplexobj, isrealobj, imag, iscomplex, isreal,
        nan_to_num, real, real_if_close, typename, asfarray, mintypecode,
        asscalar, common_type)

from .index_tricks import (ravel_multi_index, unravel_index, mgrid, ogrid, r_,
        c_, s_, index_exp, ix_, ndenumerate, ndindex, fill_diagonal,
        diag_indices, diag_indices_from)


The script is:

#!/usr/bin/env python

from __future__ import print_function
from importlib import import_module

import __future__
future_imports = dir(__future__)

def main(pkgname, *submodules):
    imports = find_imports(pkgname, submodules)
    pretty_all(imports, submodules)
    pretty_imports(imports, submodules)

def find_imports(pkgname, submodules):
    imports = {}
    for modname in submodules:
        modpath = pkgname + '.' + modname
        mod = import_module(modpath)
        mall = getattr(mod, '__all__', None)
        if mall is None:
            namespace = {}
            exec('from %s.%s import *' % (pkgname, modname), namespace)
            mall = sorted(namespace)
            mall = [n for n in mall if not n.startswith('_')]
            mall = [n for n in mall if n not in future_imports]
        imports[modname] = mall
    return imports

def pretty_imports(imports, submodules):
    for modname in submodules:
        print(pretty_import(modname, imports[modname]))
        print()

def pretty_all(imports, submodules):
    lines = ['__all__ = [']
    for modname in submodules:
        strings = ["'%s'" % name for name in imports[modname]]
        if not strings:
            # Nothing exported by this submodule; avoid IndexError below.
            continue
        line = 4*' ' + strings[0] + ','
        for s in strings[1:]:
            new_line = line + ' ' + s + ','
            if len(new_line) <= 78:
                line = new_line
            else:
                lines.append(line)
                line = '    ' + s + ','
        lines.append(line)
        lines.append('')
    lines.append(']')
    print('\n'.join(lines))

def pretty_import(modname, names):
    line = 'from .%s import ' % modname
    names_str = ', '.join(names)
    if len(line + names_str) <= 78:
        line = line + names_str
    else:
        lines = [line + '(']
        for n, name in enumerate(names):
            space = n != 0
            comma = n != len(names) - 1
            new_line = lines[-1]
            if space:
                new_line += ' '
            new_line += name
            if comma:
                new_line += ','
            else:
                new_line += ')'
            if len(new_line) <= 78:
                lines[-1] = new_line
            else:
                next_line = 8 * ' ' + name
                if comma:
                    next_line += ','
                else:
                    next_line += ')'
                lines.append(next_line)
        line = '\n'.join(lines)
    return line

if __name__ == "__main__":
    import sys
    main(*sys.argv[1:])
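
One thing to watch for with the fallback branch in find_imports (used
when a module has no __all__): a star-import also picks up any public
name that the module itself imported. A small sketch of that behaviour,
using a hypothetical in-memory module called demo_mod:

```python
import sys
import types

# Hypothetical module without __all__ that imports os for its own use.
mod = types.ModuleType('demo_mod')
exec(
    "import os\n"
    "\n"
    "def visible():\n"
    "    pass\n"
    "\n"
    "def _hidden():\n"
    "    pass\n",
    mod.__dict__,
)
sys.modules['demo_mod'] = mod

# Same technique as the fallback in find_imports: star-import into an
# empty namespace and keep the names without a leading underscore.
namespace = {}
exec('from demo_mod import *', namespace)
names = sorted(n for n in namespace if not n.startswith('_'))
print(names)  # ['os', 'visible']
```

So for modules without __all__ the generated lists may need pruning by
hand before they go into the __init__.py.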

--
Oscar

