Flatten... or How to determine sequenceability?
Alex Martelli
aleaxit at yahoo.com
Fri May 25 12:08:35 EDT 2001
"Noel Rappin" <noelrap at yahoo.com> wrote in message
news:mailman.990802119.12058.python-list at python.org...
> I'm writing code that needs to flatten a multi-dimension list or tuple
into
> a single dimension list.
>
> [1, [2, 3], 4] => [1, 2, 3, 4]
>
> As part of the algorithm, I need to determine whether each object in the
> list is itself a sequence or whether it is an atom. Using the types
module
> would cause me to miss any sequence-like object that isn't actually the
> basic type. So, what's the best (easiest, most foolproof) way to
determine
> whether a Python object is a sequence?
Hmmm -- good question. Checking whether x[:] raises an exception
is reasonable, but a mapping that accepts slice objects would fool
this test. I _suspect_ the most-foolproof test today might be:
def is_sequence(x):
try:
for _ in x:
return 1
else: return 1 # empty sequences are sequences too
except: return 0
However, if I understand correctly, this would break in 2.2,
since mappings then also become usable in a for statement,
not just sequences. This would then incorrectly identify
_any_ mapping as "a sequence", too. What *IS* there that
you can do ONLY to a sequence, ANY sequence, but NOT to
any mapping...? zip(), which seems like it could serve,
apparently does NOT test its arguments for "sequencehood"
(well, depending how you define that, I guess...): it
does a PySequence_GetItem, and fails iff that fails with
other than an IndexError.
>>> class x:
... def __getitem__(self, k): return k
...
>>> a=x()
>>> zip(a,'x')
[(0, 'x')]
map() and friends would be slow on long sequences, AND do
require a sequence to have a length to accept it, too.
I'm almost tempted to propose a tiny extension module, since
the C-API level *DOES* expose an "is-a-sequence" test...:
#include "Python.h"
static PyObject *is_seq(PyObject* self, PyObject* args) {
int i, rc;
PyObject *arg;
if(!PyArg_ParseTuple(args, "O", &arg)) return 0;
return Py_BuildValue("i",PySequence_Check(arg));
}
static PyMethodDef sq_module_functions[] =
{{ "is_seq", (PyCFunction)is_seq, METH_VARARGS },
{ 0, 0 }};
voidinitsq(void) {
PyObject* sq_module = Py_InitModule("sq", sq_module_functions);
}
and of course the setup.py to go with it:
from distutils.core import setup, Extension
sq_ext=Extension('sq', sources=['sq.c'])
setup(name="sq", ext_modules=[sq_ext])
but that doesn't seem fair...
...plus, it ALSO gets thrown by above-exemplified
object a from class x...!
So, maybe zip IS best after all, something like:
def is_sequence(x):
try: zip(x,'')
except: return 0
else: return 1
Oh BTW -- all of these methods will see a string or
unicode object as a 'sequence', because it IS one by
Python's rules (you can index and slice it, loop on
it with for, &c). Most often in such tasks one
wants to consider strings as atoms, but that will
require some further special-casing, anyway.
Alex
More information about the Python-list
mailing list