[Python-ideas] IntFlags

Andrew Barnert abarnert at yahoo.com
Fri Mar 6 17:41:00 CET 2015


On Mar 6, 2015, at 1:42, Neil Girdhar <mistersheik at gmail.com> wrote:

> On Fri, Mar 6, 2015 at 4:28 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>> On Mar 5, 2015, at 20:26, Neil Girdhar <mistersheik at gmail.com> wrote:
>> 
>>> Even if you constrain yourself to the BitFlags rather than the more general BitFields, I strongly disagree with the interface that people are proposing involving & and ~ operators.  In general, good interface design reflects the way we think about objects — not their underlying representation.
>> 
>> But sometimes the object really is "an integer used as a set of bits in some C structure/protocol field/well-known API". 
> 
> You can always get that integer by casting to integer. 
>  
>> 
>> For example, if we were designing os.open or mmap or whatever as a Pythonic interface, it wouldn't have a "flags" value that or's together multiple integers. We'd probably have separate keyword-only arguments for the less common flags, etc. But they weren't designed from scratch; they were designed to closely mirror the POSIX APIs. And that doesn't just mean simpler implementation, it means people who are familiar with those APIs know how to use them. It means the vast volumes of tutorials and sample code for opening file handles or mapping memory written for C applies to Python. And so on. So, the interface makes sense.
> 
> I disagree that there is any need to follow the style of the "vast volumes of tutorials and sample code in C" when designing Python libraries.  The goal is for the Python code to be as natural as possible.  Member access, and building constants using | are natural.  Using &~ to clear a bit is not natural; it is a coincidence of implementation that distracts from what is happening.

For the vast majority of libraries, I agree. An XML parser or audio decoder has no need to follow cryptic C API standards.

But libraries that are designed for close-to-the-metal access can be an exception--again, consider os.open, which automatically gives you access to every *nix platform's platform-specific features.

And wrapping C libraries that don't have much of a Python userbase can be another example. If enough people start using such a library, someone will write and document a higher-level Pythonic interface, but until that happens, an interface that closely matches the documentation, StackOverflow help, sample code, etc. that people can already find is a huge help. Consider PyGame. Much of it is still sparsely documented, but because it wraps the SDL APIs, you can almost always figure out what you need to do, which is part of the reason it's so popular while higher-level wrappers are not. (The other part of the reason is that it wraps almost all of the functionality of SDL, and nothing else can claim that, and again that's probably because it's a thin wrapper.) Or consider PyWin32: it has almost no documentation, and it's not at all Pythonic, but because you can look up a function on MSDN and directly use the C documentation, it's useful for all those areas of the Win32 API (and third-party COM libraries, etc.) that don't have higher-level wrappers.

And there are plenty of protocols, file formats, etc. for which the documentation is written for C (or is just a C implementation, as with the predecessor to RTSP that I forget the name of) as well.

In an ideal world, everything you wanted would have a high-level, Pythonic API--in fact, everything would be designed for Python in the first place. In the real world, you're better off with a C API than with no API at all.

>> So, one very good use for something like IntFlags is to allow people to keep using that C sample code (with trivial, easy-to-understand changes), but get better debugging, etc. when they do so--e.g., when you introspect an mmap object, it would be great if it could tell you that it was opened with PROT_READ | PROT_EXEC, instead of telling you "5", which you have to manually convert to bits and reverse-lookup in the docs or the module dict.
> 
> Yes, totally agree. 
>> 
>> Not allowing people to use C-style operations if they use named bits means that someone who wants the advantages of named bits has to rewrite their familiar C-style code. Sure, maybe the result will be more readable (although that's arguable; the suggested alternatives are pretty verbose--especially since people keep suggesting mutating-only APIs...), but it means many people will stick with plain ints rather than rewrite, and those who do rewrite will end up with code that doesn't look like the familiar code that everyone knows how to read from C.
> 
> I totally agree with you that there should not be only mutating functions.  I agree that | should be used for combining bit fields or flags.  However, the people who are "familiar with C" (including me) are frankly dying :)

People have been saying that for a couple decades now, but there's still tons of functionality--not just system-level stuff, but APIs for high-level things like audio fingerprinting or animating sprites or streaming video or extending a Python interpreter--that only exists in C (or sometimes C++ or ObjC), or with very thin wrappers for higher-level languages. And that's still going to be true for a long time to come.

More importantly, if C really were dead and irrelevant, there would be no need for this proposal; again, the only reason you ever care about packing flags into an int in the first place is for compatibility with C or C-style code. When you don't need that, just use a namedtuple or a set or keyword arguments or whatever in the first place.
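
A minimal sketch of the two cases, assuming Python 3.6 or later, where the standard library provides enum.IntFlag (it did not exist when this message was written). OpenFlags and options are hypothetical names used only for illustration; the os.O_* constants are the real ones that os.open accepts.

```python
import enum
import os


class OpenFlags(enum.IntFlag):
    """Hypothetical named wrapper around a few real os.open flags."""
    WRONLY = os.O_WRONLY
    CREAT = os.O_CREAT
    TRUNC = os.O_TRUNC


# Case 1: a C-style API wants an int, so the flags are packed into one.
flags = OpenFlags.WRONLY | OpenFlags.CREAT

# The result is still an int with the value the C-style API expects...
print(int(flags) == (os.O_WRONLY | os.O_CREAT))
# ...but its repr shows flag names instead of a bare number
# (the exact format varies between Python versions).
print(repr(flags))

# Case 2: no C-style API involved, so there is no reason to pack bits.
# A plain frozenset of option names (hypothetical) does the job.
options = frozenset({"write", "create"})
print("create" in options)
```

Because the combined value is an int subclass, it can be handed to any function that expects the raw integer.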

> Pandering to the past really gets you nowhere.  Try to be a bit idealistic so that new Python code is natural, succinct, and human-readable — rather than the C values of reflecting the underlying representation in spite of the human being.
>  
>>>   The fact is that a BitSet's main operations are setting and clearing individual bits.  It is as if the BitFlags are a namedtuple with Boolean elements whose underlying storage happens to be an integer.
>> 
>> In the case where you don't really care that the underlying storage is an integer, why use an integer in the first place? Why not use a namedtuple, or a set, or whatever else is appropriate? In the very rare case where you need to store a million of these things (and can't store them even more compactly with array or NumPy or similar), you can go get a third-party lib; the vast majority of the time, there's no advantage to using an integer.
> 
> The main reason is so that you can cast it to "int" and produce something that some API requires. 
>> 
>> Except, of course, when the underlying representation is the whole point, because you're dealing with an API that's written in terms of integers.
> right. 
>>>   Therefore, the interface that makes the most sense is member access:
>>> 
>>> my_bit_flags.some_bit = True
>>> my_bit_flags.some_bit = False
>>> 
>>> I don't see the justification for writing these as
>>> 
>>> my_bit_flags |= TheBitFlagsClass.some_bit
>>> my_bit_flags &= ~TheBitFlagsClass.some_bit
>>> 
>>> The second line is particularly terrible because it exposes you to making mistakes like:
>>> 
>>> my_bit_flags &= TheBitFlagsClass.some_bit
>>> my_bit_flags |= ~TheBitFlagsClass.some_bit
>>> 
>>> — both of which are meaningless.
>> 
>> No they're not. Put some real names instead of toy names there:
>> 
>>     readable = m.prot
>>     readable &= ProtFlags.Readable
>> 
>> Now it's true iff m.prot includes the Readable flag.
>> 
>> Of course usually you'd write this in a single line without mutation:
>> 
>>     readable = m.prot & ProtFlags.Readable
> 
> We both know that the most readable version is just member access, like you would on any object:
> 
> readable = m.prot.readable
> 
> This usage of & to filter is unnecessarily complicated.  The fact that the machine does so is no reason for the programmer to write it so.

Right, so someone should write a higher-level library that wraps up mmap so you don't have to use it. But no one has done so yet, and if you want to use it without waiting another couple decades until someone gets around to it, you're using the C-style API.
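
For reference, the C-style test and clear operations from the quoted exchange can be sketched as follows. This assumes Python 3.6 or later for enum.IntFlag. ProtFlags is the hypothetical class from the discussion, its bit values are assumptions chosen to mirror the usual PROT_READ, PROT_WRITE and PROT_EXEC values, and prot stands in for the hypothetical m.prot attribute, which real mmap objects do not have.

```python
import enum


class ProtFlags(enum.IntFlag):
    """Hypothetical flags; values assumed to mirror PROT_READ/WRITE/EXEC."""
    Readable = 1
    Writable = 2
    Executable = 4


# Stand-in for the hypothetical m.prot attribute from the thread.
prot = ProtFlags.Readable | ProtFlags.Executable

# Non-mutating test: the result is truthy iff the bit is set.
readable = prot & ProtFlags.Readable
writable = prot & ProtFlags.Writable
print(bool(readable), bool(writable))

# Non-mutating clear: mask the bit off with & and ~, as in C.
without_exec = prot & ~ProtFlags.Executable
print(without_exec == ProtFlags.Readable)
```

Both operations return new values and leave prot itself unchanged, which is the immutable style of use argued for above.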

>> But that just goes to show that the primary interface of bit flags is an immutable one; trying to force people to use mutating methods like set_bit and clear_bit is just getting in people's way. (And try to come up with a good name for the non-mutating operation that's obvious and reads like English and isn't approaching the ridiculous Apple level of verbosity you get in Cocoa methods like "bitSetWithBitClear:".)
> 
> I agree with you here.  I think you should also have | so that you can build constants the way you're used to, although I'm not sure about & since I don't see when you would use it in preference to member access.

OK, if you have | and &, you automatically have |= and &=. There's no way to implement the former without automatically getting the latter. So if that's your suggestion, it's not possible in the first place, so you have to choose whether we get both or neither.
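
A small sketch of why that is, using a hypothetical Bits class that defines only __or__ and __and__. When a class has no __ior__ or __iand__, Python evaluates x |= y as x = x | y, so the augmented forms work anyway by rebinding the name.

```python
class Bits:
    """Hypothetical immutable flag holder that defines only | and &."""

    def __init__(self, value):
        self.value = value

    def __or__(self, other):
        return Bits(self.value | other.value)

    def __and__(self, other):
        return Bits(self.value & other.value)


flags = Bits(0b001)
original = flags  # keep a second reference to the first object

# No __ior__ is defined, so this is evaluated as flags = flags | Bits(0b100).
flags |= Bits(0b100)
print(bin(flags.value))

# The first object was not mutated; only the name "flags" was rebound.
print(bin(original.value))

# Likewise, &= falls back to __and__.
flags &= Bits(0b100)
print(bin(flags.value))
```

So with this default fallback, an immutable type that supports | and & also supports |= and &= without any extra code.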

>>> It also makes it hard to convert code to and from the alternative implementation that uses a namedtuple.  It should be easy to do that in my opinion.
>  
>>> 
>>> Best,
>>> 
>>> Neil
>>> 
>>> On Thu, Mar 5, 2015 at 12:57 PM, Serhiy Storchaka <storchaka at gmail.com> wrote:
>>>> On 05.03.15 19:29, Neil Girdhar wrote:
>>>>> Have you looked at my IntFields generalization of IntFlags?  It seems
>>>>> that many of your examples (permissions, e.g.) are better expressed with
>>>>> fields than with flags.
>>>> 
>>>> It looks too complicated for such a simple case. And it has an interface incompatible with plain int.
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Python-ideas mailing list
>>>> Python-ideas at python.org
>>>> https://mail.python.org/mailman/listinfo/python-ideas
>>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>> 
> 