[Python-ideas] Making PyStructSequence expose _fields (was Re: namedtuple base class)

Andrew Barnert abarnert at yahoo.com
Mon Jan 13 02:16:20 CET 2014


See http://bugs.python.org/issue20230 for the issue and patch. Thanks to Ethan Furman for telling me to post it there instead of here.


----- Original Message -----
> From: Andrew Barnert <abarnert at yahoo.com>
> To: Andrew Barnert <abarnert at yahoo.com>; "python-ideas at python.org" <python-ideas at python.org>
> Cc: 
> Sent: Sunday, January 12, 2014 4:32 PM
> Subject: Re: [Python-ideas] Making PyStructSequence expose _fields (was Re: namedtuple base class)
> 
> Here's a quick patch:
> 
> diff -r bc5f257f5cc1 Lib/test/test_structseq.py
> --- a/Lib/test/test_structseq.py	Sun Jan 12 14:12:59 2014 -0800
> +++ b/Lib/test/test_structseq.py	Sun Jan 12 16:31:15 2014 -0800
> @@ -28,6 +28,16 @@
>          for i in range(-len(t), len(t)-1):
>              self.assertEqual(t[i], astuple[i])
>  
> +    def test_fields(self):
> +        t = time.gmtime()
> +        self.assertEqual(t._fields,
> +                         ('tm_year', 'tm_mon', 'tm_mday', 'tm_hour', 'tm_min',
> +                          'tm_sec', 'tm_wday', 'tm_yday', 'tm_isdst'))
> +        st = os.stat(__file__)
> +        self.assertIn("st_mode", st._fields)
> +        self.assertIn("st_ino", st._fields)
> +        self.assertIn("st_dev", st._fields)
> +
>      def test_repr(self):
>          t = time.gmtime()
>          self.assertTrue(repr(t))
> diff -r bc5f257f5cc1 Objects/structseq.c
> --- a/Objects/structseq.c	Sun Jan 12 14:12:59 2014 -0800
> +++ b/Objects/structseq.c	Sun Jan 12 16:31:15 2014 -0800
> @@ -7,6 +7,7 @@
>  static char visible_length_key[] = "n_sequence_fields";
>  static char real_length_key[] = "n_fields";
>  static char unnamed_fields_key[] = "n_unnamed_fields";
> +static char _fields_key[] = "_fields";
>  
>  /* Fields with this name have only a field index, not a field name.
>     They are only allowed for indices < n_visible_fields. */
> @@ -14,6 +15,7 @@
>  _Py_IDENTIFIER(n_sequence_fields);
>  _Py_IDENTIFIER(n_fields);
>  _Py_IDENTIFIER(n_unnamed_fields);
> +_Py_IDENTIFIER(_fields);
>  
>  #define VISIBLE_SIZE(op) Py_SIZE(op)
>  #define VISIBLE_SIZE_TP(tp) PyLong_AsLong( \
> @@ -327,6 +329,7 @@
>      PyMemberDef* members;
>      int n_members, n_unnamed_members, i, k;
>      PyObject *v;
> +    PyObject *_fields;
>  
>  #ifdef Py_TRACE_REFS
>      /* if the type object was chained, unchain it first
> @@ -389,6 +392,23 @@
>      SET_DICT_FROM_INT(real_length_key, n_members);
>      SET_DICT_FROM_INT(unnamed_fields_key, n_unnamed_members);
>  
> +    _fields = PyTuple_New(desc->n_in_sequence);
> +    if (!_fields)
> +        return -1;
> +    for (i = 0; i != desc->n_in_sequence; ++i) {
> +        PyObject *field = PyUnicode_FromString(members[i].name);
> +        if (!field) {
> +            Py_DECREF(_fields);
> +            return -1;
> +        }
> +        PyTuple_SET_ITEM(_fields, i, field);
> +    }
> +    if (PyDict_SetItemString(dict, _fields_key, _fields) < 0) {
> +        Py_DECREF(_fields);
> +        return -1;
> +    }
> +    Py_DECREF(_fields);
> +
>      return 0;
>  }
>  
> @@ -417,7 +437,8 @@
>  {
>      if (_PyUnicode_FromId(&PyId_n_sequence_fields) == NULL
>          || _PyUnicode_FromId(&PyId_n_fields) == NULL
> -        || _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL)
> +        || _PyUnicode_FromId(&PyId_n_unnamed_fields) == NULL
> +        || _PyUnicode_FromId(&PyId__fields) == NULL)
>          return -1;
>  
>      return 0;
> 
> 
> 
> 
> ----- Original Message -----
>>  From: Andrew Barnert <abarnert at yahoo.com>
>>  To: "python-ideas at python.org" <python-ideas at python.org>
>>  Cc: 
>>  Sent: Sunday, January 12, 2014 4:17 PM
>>  Subject: [Python-ideas] Making PyStructSequence expose _fields (was Re: namedtuple base class)
>> 
>>  I don't think the proposed NamedTuple ABC adds anything on top of duck
>>  typing on _fields (or on whichever other method you need, and possibly
>>  checking for Sequence). As Raymond Hettinger summarized it nicely,
>>  namedtuple is a protocol, not a type.
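
The duck-typing check described above can be sketched in Python. This is only an illustration: the helper name looks_like_namedtuple is hypothetical, and "a Sequence whose _fields is a tuple of strings matching its length" is one possible reading of "duck typing on _fields", not an established API.

```python
from collections import namedtuple
from collections.abc import Sequence


def looks_like_namedtuple(obj):
    """Hypothetical helper: duck-type on _fields instead of using an ABC."""
    fields = getattr(obj, "_fields", None)
    return (
        isinstance(obj, Sequence)
        and isinstance(fields, tuple)
        and all(isinstance(name, str) for name in fields)
        and len(fields) == len(obj)
    )


Point = namedtuple("Point", ["x", "y"])  # example type, not from the thread
print(looks_like_namedtuple(Point(1, 2)))  # True
print(looks_like_namedtuple((1, 2)))       # False: a plain tuple has no _fields
```
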
>> 
>>  But I think one of the ideas that came out of that discussion is worth
>>  pursuing on its own: giving a _fields member to every structseq type.
>> 
>>  Most of the namedtuple-like classes in the builtins/stdlib, like
>>  os.stat_result, are implemented with PyStructSequence. Since 3.3, that's
>>  been a public, documented protocol. A structseq type is already a tuple,
>>  and it stores all the information needed to expose the fields to Python;
>>  it just doesn't expose them in any way. Making it do so is easy: either
>>  add _fields to the type __dict__ at type creation, or add a getter that
>>  generates it on the fly from tp_members.
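
As a rough Python-level analogue of the "generate it on the fly" option, the named visible fields can be recovered from what a structseq type already stores. This is a sketch under assumptions, not the patch's approach: the helper name visible_field_names is hypothetical, and it relies on CPython keeping each named field as a member descriptor in the type's __dict__ in declaration order.

```python
import os
import time
import types


def visible_field_names(structseq_type):
    """Hypothetical helper: names of the named fields inside the sequence.

    Assumes CPython, where every named field is a member descriptor in the
    type's __dict__ and the dict preserves declaration order.
    """
    members = [
        name
        for name, value in vars(structseq_type).items()
        if isinstance(value, types.MemberDescriptorType)
    ]
    # Unnamed fields are only allowed inside the visible part of the
    # sequence, so the named visible fields are the first
    # n_sequence_fields - n_unnamed_fields members.
    n_named_visible = (
        structseq_type.n_sequence_fields - structseq_type.n_unnamed_fields
    )
    return tuple(members[:n_named_visible])


print(visible_field_names(time.struct_time))
print(visible_field_names(os.stat_result))
```

For a type with unnamed fields, such as os.stat_result, the result lists only the named visible fields, so it is not guaranteed to line up position by position with the sequence itself.
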
>> 
>>  Of course a structseq can do more than a namedtuple. In particular, using a
>>  structseq via its _fields would mean that you miss its "non-sequence"
>>  fields, like st_mtime_ns. But then that's already true for using a
>>  structseq as a sequence, or just looking at its repr, so I don't think
>>  that's a problem. (The "visible fields" are visible for a reason…)
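
The split between visible and "non-sequence" fields can be seen with os.stat_result as it exists today. A small sketch, which stats the current directory only because it is a convenient path:

```python
import os

st = os.stat(os.curdir)

# Only the visible fields take part in the tuple behaviour.
print(len(st) == os.stat_result.n_sequence_fields)
# The type knows about more fields than the sequence exposes.
print(os.stat_result.n_fields > os.stat_result.n_sequence_fields)
# st_mtime_ns is reachable by attribute, but not by index.
print(isinstance(st.st_mtime_ns, int))
```
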
>> 
>>  And this still wouldn't mean that _fields is part of the "named tuple
>>  protocol" described in the glossary, just that it's part of structseq
>>  types as well as collections.namedtuple types.
>> 
>>  And this wouldn't give structseq an on-demand __dict__ so you can just
>>  call vars(s) instead of OrderedDict(zip(s._fields, s)).
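
The OrderedDict(zip(s._fields, s)) idiom mentioned above works on any object that follows the _fields convention. A minimal sketch using a collections.namedtuple type (Point is a made-up example, since structseq types would only support this with the proposed change):

```python
from collections import OrderedDict, namedtuple

Point = namedtuple("Point", ["x", "y"])  # hypothetical example type
p = Point(3, 4)

# Pair each field name with the value at the same position.
as_dict = OrderedDict(zip(p._fields, p))
print(as_dict)
```
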
>> 
>>  Still, it seems like a clear win. A small patch, a bit of extra storage on
>>  each structseq type object (not on the instances), and now you can reflect
>>  on the most common kind of C named tuple types the same way you do on the
>>  most common kind of Python named tuple types.
>> 
> 

