PEP 308: A PEP Writer's Experience - PRO

Andrew Dalke adalke at mindspring.com
Wed Feb 12 22:14:53 EST 2003


Erik Max Francis:
> > > I cannot recall ever seeing
> > >
> > >       if (something ? this : that) ...
> > >
> > > in either C or C++ code

> In the portion you [David Eppstein] clipped, I acknowledged that I'm
> sure that that syntax is used _occasionally_, my point was just that it is
> so rare as to be totally irrelevant, especially given that the guy I was
> pointing that out to was someone who favored frequency-based
> arguments for rejecting a conditional operator.

And my - at least implicit - argument has been that without analysis
it is hard to tell if a gut feeling has any basis in reality.  Yours does not.
Here's an analysis of the Python codebase.

[dalke at pw600a Python-2.3a1]$ egrep 'if.*\(.*\?.*\)' */*.c | fgrep -v "'?'" |
fgrep -v '"?"'
Modules/almodule.c:     if ((port = alOpenPort(name, dir, config ?
config->config : NULL)) == NULL)
Modules/almodule.c:     if ((port = ALopenport(name, dir, config ?
config->config : NULL)) == NULL)
Modules/imgfile.c:      if ( len != xsize * ysize * (zsize == 1 ? 1 : 4) ) {
Objects/fileobject.c:   if (!PyArg_ParseTuple(args, f->f_binary ? "s#" :
"t#", &s, &n))
Objects/typeobject.c:           if (!(i == 0 ? isalpha(*p) : isalnum(*p)) &&
*p != '_') {
Python/bltinmodule.c:   else if (!PyArg_UnpackTuple(args, (op==Py_LT) ?
"min" : "max", 1, 1, &v))
[dalke at pw600a Python-2.3a1]$ wc */*.c | tail -1
 190681  579181 4740236 total
[dalke at pw600a Python-2.3a1]$

None is your exact construct, but two come close, and both of those
are in more complex code.  So there's a ballpark estimate of one use
every 100,000 lines, with a large uncertainty, which calls for the
analysis of another code base.
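
For anyone without egrep handy, the pipeline above can be approximated
in Python.  This is only a rough sketch and not the script behind the
numbers in this post: the name count_if_ternary and the sample lines are
made up for illustration, and Python's re module stands in for egrep.

```python
import re

# Same pattern as the egrep call: "if", then "(", "?", ")" in that order.
IF_TERNARY = re.compile(r"if.*\(.*\?.*\)")

def count_if_ternary(lines):
    """Count lines that match the pattern, skipping quoted '?' literals."""
    hits = 0
    for line in lines:
        # Mirrors the two `fgrep -v` filters: drop lines that contain a
        # question mark as a character or string literal.
        if "'?'" in line or '"?"' in line:
            continue
        if IF_TERNARY.search(line):
            hits += 1
    return hits

# Hypothetical sample input: one real hit, one filtered line, one non-match.
sample = [
    "if ( len != xsize * ysize * (zsize == 1 ? 1 : 4) ) {",
    "if (c == '?') return (0);",
    "x = (a ? b : c);",
]
print(count_if_ternary(sample))
```

Like the grep version, this works line by line, so a condition split
across several source lines can be missed or only partly matched.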

Here's another estimate using the Linux kernel.

[dalke at pw600a linux-2.2.14]$ egrep 'if.*\(.*\?.*\)' */*.c | fgrep -v "'?'" |
fgrep -v '"?"'
ipc/sem.c:      if (ipcperms(&sma->sem_perm, alter ? S_IWUGO : S_IRUGO))
ipc/shm.c:      if (ipcperms(&shp->u.shm_perm, shmflg & SHM_RDONLY ? S_IRUGO
: S_IRUGO|S_IWUGO))
mm/mmap.c:                              if ((prev ? prev->vm_next :
mm->mmap) != vma)
[dalke at pw600a linux-2.2.14]$ wc */*.c | tail -1
  49186  163022 1218334 total
[dalke at pw600a linux-2.2.14]$

1 use in 50,000 lines, if you accept that it is close enough to your form.
Notice that I do not count uses of ?: inside a function call.  I am only
looking for ?: used in the boolean expression at the top level of the if
statement.

Going one level deeper into the kernel tree, I found the following:

drivers/block/ataflop.c:        if (drive >= (MACH_IS_FALCON ? 1 : 2))
return( 0 );
drivers/block/ide-pci.c:                if ((dev->class >> 8) !=
PCI_CLASS_STORAGE_IDE || (dev->class & (port ? 4 : 1)) != 0) {
drivers/cdrom/sonycd535.c:      if ((status[0] & ((ignore_status_bit7 ? 0x7f
: 0xff) & 0x8f)) != 0)
drivers/char/n_tty.c:   } else if (tty->read_cnt >= (amt ? amt : 1))
*drivers/isdn/isdn_common.c:                                     if (*p ==
'-' ? *s <= *++p && *s >= last : *s == *p)
*drivers/net/atarilance.c:                       if (((o) <
RIEBL_RSVD_START) ? (o)+PKT_BUF_SZ > RIEBL_RSVD_START \
*drivers/net/bagetlance.c:                       if (((o) <
RIEBL_RSVD_START) ? (o)+PKT_BUF_SZ > RIEBL_RSVD_START \
drivers/net/comx-hw-comx.c:             if (off + len >= (hw->firmware ?
hw->firmware->len : 0) || len == 0) {
drivers/net/zlib.c:    if (*p == (Byte)(m < 2 ? 0 : 0xff))
drivers/scsi/aic7xxx.c:        if ( ((internal50_present ? 1 : 0) +
drivers/scsi/seagate.c:  if (target == (controller_type == SEAGATE ? 7 : 6))
drivers/scsi/seagate.c:      if (!((temp = DATA) & (controller_type ==
SEAGATE ? 0x80 : 0x40)))
*drivers/scsi/ultrastor.c:    if (config.slot ? inb(config.icm_address - 1)
== 2 :
*drivers/scsi/ultrastor.c:    if (config.slot ? inb(config.icm_address - 1)
:
drivers/sound/msnd_pinnacle.c:  if ((file ? file->f_mode : dev.mode) &
FMODE_READ) {
drivers/sound/msnd_pinnacle.c:  if ((file ? file->f_mode : dev.mode) &
FMODE_WRITE) {
drivers/sound/msnd_pinnacle.c:  if ((file ? file->f_mode : dev.mode) &
FMODE_WRITE) {
drivers/sound/msnd_pinnacle.c:  if ((file ? file->f_mode : dev.mode) &
FMODE_READ) {
drivers/sound/sb_audio.c:               if (devc->speed * devc->channels <=
(devc->major == 3 ? 23000 : 13000))
fs/ext2/truncate.c:     if (tmp != (dind_bh ? le32_to_cpu(*p) : *p)) {
fs/ext2/truncate.c:     if (tmp != (tind_bh ? le32_to_cpu(*p) : *p)) {
*net/ipv4/ipconfig.c:            if (user_dev_name[0] ? !strcmp(dev->name,
user_dev_name) :

That's 22 hits in 1,100,947 lines of C code, or roughly one every 50,000
lines.  Note that of these, 6 (marked with a "*" before the filename) are
of your exact form
  if (something ? this : that)
or about 1 every 200,000 lines.
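
The split between the "*" hits and the rest could be automated with a
small heuristic.  The sketch below is my own assumption about how one
might do it, not how the marks above were made: the hypothetical helper
ternary_depth reports how many parentheses are open, counting the one
belonging to the if itself, when the first "?" appears.  Depth 1 is the
exact form; anything deeper is a ?: nested inside the condition.

```python
import re

IF_OPEN = re.compile(r"\bif\s*\(")

def ternary_depth(line):
    """Return the paren depth of the first '?' in an if condition.

    Returns None if the line has no "if (" or if the condition closes
    before any '?' is seen.  String and character literals are not
    parsed, so this is a heuristic only.
    """
    m = IF_OPEN.search(line)
    if m is None:
        return None
    depth = 1  # the opening paren of the if itself
    for ch in line[m.end():]:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return None  # condition ended without a '?'
        elif ch == "?":
            return depth
    return None

print(ternary_depth("if (config.slot ? inb(config.icm_address - 1)"))
print(ternary_depth("if (drive >= (MACH_IS_FALCON ? 1 : 2))"))
```

The first call gives 1 (a "*" line) and the second gives 2 (a nested
use), matching how those two hits are marked in the listing.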

Let's assume every occurrence would correspond to an occurrence in Python
code.  I earlier used a factor of 5 to convert from lines of C code to lines
of Python code.  As such, I can estimate that

  if (if something: this else: that):

(or any other ternary if/else expression form) will occur about once every
40,000 lines of Python code and other uses of embedded if expressions in
an if statement will occur about once every 10,000 lines.  Here are the
other statistics I generated, for comparison:

  rate if/else could be used == 1 every 400 LOC
  rate it would be used == 1 every 1,000 LOC
  rate where short circuiting is needed == 1 every 5,000 LOC
  rate if/else expression used in an if statement == 1 every 10,000 LOC
  rate where "something and this or that"
      would not suffice (larger uncertainty here)  == 1 every 25,000 LOC
  rate used as "if (if something: this else: that)" == 1 every 40,000 LOC
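
The last-but-one and last rows follow from the kernel counts by plain
division.  A minimal sketch of that arithmetic, using only the figures
given in this post (the variable and function names are mine):

```python
C_LINES = 1100947   # lines of C counted in the kernel tree
HITS_ANY = 22       # every hit in the listing
HITS_EXACT = 6      # hits marked with "*"
C_TO_PYTHON = 5     # conversion factor from lines of C to lines of Python

def lines_per_hit(total_lines, hits):
    """Average number of lines between hits."""
    return total_lines / hits

c_any = lines_per_hit(C_LINES, HITS_ANY)
c_exact = lines_per_hit(C_LINES, HITS_EXACT)

print(round(c_any), round(c_exact))                              # 50043 183491
print(round(c_any / C_TO_PYTHON), round(c_exact / C_TO_PYTHON))  # 10009 36698
```

The post rounds 183,491 up to 200,000 before dividing by 5, which is
where 40,000 rather than about 37,000 comes from.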

It's remarkably easy to do this sort of analysis using grep and
a reasonably large C/C++ codebase.  Several appropriate codebases
are available to you.

Yet you continue to argue based on gut feelings, and do not
back them up with any sort of rigour.

> I don't doubt that _someone_, _somewhere_, will misuse a Python
> conditional operator in this way.  I'm just saying that it will be so
> rare in actual practice that it's not worth worrying about.

What level would indicate a sufficiently high level of misuse to
warrant its exclusion from a future Python?  You have said that

   if (something ? this : that) ...

"is bad form in any language".  If you accept my analysis, then
the equivalent of that will occur in Python code about once in
every 40 uses.  That means it will be used in bad form (using your
definition of bad form) about 2-3% of the time.

By my definition it will be higher because I include additional
uses of an if/else expression inside the if statement (more than just
the top level, which would be a 10% misuse rate) and I include
cases where the if/else expression will be used in lieu of better,
more appropriate forms, as in "x = (if obj: obj else: default)"
instead of "x = obj or default".
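
To make the two idioms concrete, here is a runnable sketch.  The
"(if C: x else: y)" spelling is only a proposal and will not run, so the
sketch assumes the "x if C else y" spelling that present-day Python
accepts; the helper names and_or and cond are hypothetical.

```python
def and_or(something, this, that):
    # The "something and this or that" idiom.
    return something and this or that

def cond(something, this, that):
    # A real conditional expression (assumed spelling, see above).
    return this if something else that

# The idiom agrees with the conditional while `this` is truthy ...
print(and_or(True, "yes", "no"), cond(True, "yes", "no"))   # yes yes

# ... but falls through to `that` when `this` is falsy.
print(and_or(True, 0, 99), cond(True, 0, 99))               # 99 0

# When only a default is wanted, plain `or` already does the job.
obj, default = None, "fallback"
print(obj or default, cond(obj, obj, default))              # fallback fallback
```

Because the helpers take already-evaluated arguments, this shows the
falsy-value problem but not short-circuiting.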

So again I ask you, what rate of misuse is low enough as to
not be a problem?  1%?  10%?  50%?  0.01%?  Apparently
it's above 3%.
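
The percentages come straight from the rates in the table.  A small
sketch of the division, with names of my own choosing:

```python
USE_EVERY = 1_000          # LOC per use of an if/else expression
IN_IF_EVERY = 10_000       # LOC per use inside an if statement
EXACT_FORM_EVERY = 40_000  # LOC per "if (if something: this else: that)"

def share_of_uses(loc_per_event):
    """Fraction of all if/else expression uses that are this kind of use."""
    return USE_EVERY / loc_per_event

print(f"{share_of_uses(EXACT_FORM_EVERY):.1%}")  # 2.5%
print(f"{share_of_uses(IN_IF_EVERY):.1%}")       # 10.0%
```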

>  Say,
> weren't you the guy who was making your case with frequency
> arguments?

Yes.  You might try the same as well.

                    Andrew
                    dalke at dalkescientific.com





