syntax difference

Bart bc at freeuk.com
Tue Jun 26 08:47:48 EDT 2018


On 26/06/2018 12:39, Chris Angelico wrote:
> On Tue, Jun 26, 2018 at 9:30 PM, Bart <bc at freeuk.com> wrote:
>> On 19/06/2018 11:33, Steven D'Aprano wrote:
>>>
>>> On Tue, 19 Jun 2018 10:19:15 +0100, Bart wrote:
>>
>>
>>> * Integer sets (Pascal-like sets)
>>>
>>> Why do you need them if you have real sets?
>>
>>
>> I tried Python sets for the first time. They seemed workable but rather
>> clunky to set up. But here is one problem on my CPython:
>>
>>     x = set(range(10_000_000))
>>
>> This used up 460MB of RAM (the original 100M I tried exhausted the memory).
>>
>> The advantage of Pascal-style sets is that that same set will occupy only
>> 1.25MB, as it is a bit-map.
>>
>> While sets will not usually be that big, there might be lots of small sets
>> and they all add up.
> 
> Cool. Make me a bitset that can represent this Python set:
> 
> {-5, -4, 6, 10, 1.5, "spam", print}

Why? It's a set of integer values with a huge range of applications.

Here's the set of characters allowed in a C identifier (not using Python 
syntax):

   cident = {'A'..'Z', 'a'..'z', '0'..'9', '_'}

The characters allowed in a hex constant:

   {'0'..'9', 'A'..'F', 'a'..'f'}

A set representing every Unicode character, except those which can be C 
identifiers:

   {0..1_114_111} - cident

The latter takes only 136KB as a bit-map, rather than the 64MB or so
that the equivalent Python set seemed to use.
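
For illustration, the three sets above can be approximated in Python by
using a plain int as a bit-map, one bit per code point. This is only a
sketch: span() and contains() are names made up for this example, not
built-ins, and an int is just one of several possible representations.

```python
# Sketch only: a Python int used as a bit-map, one bit per code point.
# span() and contains() are hypothetical helpers, not standard-library names.

def span(lo, hi):
    """Return a bit-map with bits ord(lo)..ord(hi) set (inclusive)."""
    lo, hi = ord(lo), ord(hi)
    return ((1 << (hi - lo + 1)) - 1) << lo

def contains(bits, ch):
    """True if the bit for character ch is set."""
    return (bits >> ord(ch)) & 1 == 1

# Characters allowed in a C identifier, and in a hex constant.
cident = span('A', 'Z') | span('a', 'z') | span('0', '9') | span('_', '_')
hexchars = span('0', '9') | span('A', 'F') | span('a', 'f')

# Every Unicode code point (0..1_114_111) except the C identifier characters.
UNICODE_BITS = 0x110000
everything = (1 << UNICODE_BITS) - 1
not_cident = everything & ~cident        # set difference

# One bit per code point: 1_114_112 / 8 = 139_264 bytes = 136 * 1024.
bitmap_bytes = UNICODE_BITS // 8
```

Membership is then a shift and a mask, e.g. contains(cident, '_') is
true and contains(not_cident, 'A') is false. The int object itself
carries some extra overhead on top of the raw 139,264 bytes.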

I don't know whether there is a direct equivalent in Python (I thought 
somebody would point it out), apart from ways to construct similar 
functionality with bit-arrays (but then, every language can have such 
sets if you take the DIY approach).
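
A minimal sketch of what that DIY approach might look like, assuming a
fixed range of non-negative integers and a bytearray as storage. The
class name IntSet and its methods are invented for this example; it is
not an existing Python type.

```python
# Sketch of the DIY approach: a fixed-range integer set backed by a bytearray.
# IntSet and its methods are hypothetical names chosen for this example.

class IntSet:
    def __init__(self, size, items=()):
        self.size = size                        # valid members: 0..size-1
        self.bits = bytearray((size + 7) // 8)  # one bit per possible member
        for i in items:
            self.add(i)

    def add(self, i):
        if not 0 <= i < self.size:
            raise ValueError("value out of range for this set")
        self.bits[i >> 3] |= 1 << (i & 7)

    def __contains__(self, i):
        return 0 <= i < self.size and bool(self.bits[i >> 3] & (1 << (i & 7)))

    def __sub__(self, other):
        # Set difference; both operands must cover the same range.
        if self.size != other.size:
            raise ValueError("sets must have the same range")
        result = IntSet(self.size)
        result.bits = bytearray(a & ~b & 0xFF
                                for a, b in zip(self.bits, other.bits))
        return result

    def __len__(self):
        # Number of members, counted bit by bit.
        return sum(bin(b).count("1") for b in self.bits)

digits = IntSet(128, range(ord('0'), ord('9') + 1))
odd_digits = IntSet(128, (ord(c) for c in "13579"))
even_digits = digits - odd_digits

# Storage for a range of 10 million values: 1_250_000 bytes.
big = IntSet(10_000_000, range(0, 10_000_000, 1000))
```

Here the storage depends only on the range, not on how many members the
set holds, which is the trade-off against a hash-based set: big.bits is
1,250,000 bytes whether the set holds ten values or ten million.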

Steven asked why I need them when there are 'real' sets, and I answered 
that.


-- 
bart


