processing a Very Large file

DJTB usenet at terabytemusic.cjb.net
Wed May 18 08:38:15 EDT 2005


Tim Peters wrote:

> 
>>        tuple_size = int(splitres[0])+1
>>        path_tuple = tuple(splitres[1:tuple_size])
>>        conflicts = Set(map(int,splitres[tuple_size:-1]))
> 
> Do you really mean to throw away the last value on the line?  That is,
> why is the slice here [tuple_size:-1] rather than [tuple_size:]?
> 

Thanks, you saved me from another bug-hunting hell...
(In a previous test version, split returned a '\n' as the last item in the
list...)
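
For anyone reading this later, here is a minimal sketch of how that trailing '\n' can appear and why the [tuple_size:] slice is safe once it is gone. The sample line and the parse_line helper are made up for illustration, and the built-in set (Python 2.4 and later) stands in for sets.Set:

```python
# Hypothetical input line: a count, that many path items, then conflict ids.
# Note the trailing space before the newline.
line = "2 a b 5 7 \n"

# Splitting on a single space keeps the newline as its own last item.
parts_sep = line.split(' ')   # ['2', 'a', 'b', '5', '7', '\n']

# Splitting on arbitrary whitespace drops it, so no item needs discarding.
parts_ws = line.split()       # ['2', 'a', 'b', '5', '7']


def parse_line(line):
    """Parse one line into (path_tuple, conflicts), keeping the last value."""
    splitres = line.split()
    tuple_size = int(splitres[0]) + 1
    path_tuple = tuple(splitres[1:tuple_size])
    # [tuple_size:] rather than [tuple_size:-1]: nothing is thrown away.
    conflicts = set(map(int, splitres[tuple_size:]))
    return path_tuple, conflicts
```

With split(' ') the old [tuple_size:-1] slice happened to remove the '\n'; with split() the same slice would silently drop a real conflict id.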

> 
> You could manually do something akin to Python's "string interning" to
> store ints uniquely, like:
> 
>     int_table = {}
>     def uniqueint(i):
>         return int_table.setdefault(i, i)
> 
> Then, e.g.,
> 
> >>> uniqueint(100 * 100) is uniqueint(100 * 100)
> True
> >>> uniqueint(int("12345")) is uniqueint(int("12345"))
> True
> 
> Doing Set(map(uniqueint, etc)) would then feed truly shared int
> (and/or long) objects to the Set constructor.
> 

I've implemented this and it does seem to work, thanks.
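
A rough sketch of what this looks like when building the conflict sets, not the actual code from my program. The conflicts_from helper and the sample values are invented for illustration, and the built-in set is used in place of sets.Set:

```python
int_table = {}


def uniqueint(i):
    """Return the one shared int object stored for the value i."""
    # setdefault stores i on first sight and returns the stored object after.
    return int_table.setdefault(i, i)


def conflicts_from(fields):
    """Build a set of shared int objects from a list of numeric strings."""
    return set(map(uniqueint, map(int, fields)))


# Two hypothetical lines whose conflict lists overlap in one value.
a = conflicts_from(["100000", "200000"])
b = conflicts_from(["200000", "300000"])

# Both sets hold a reference to the same object for the shared value.
shared_a = next(x for x in a if x == 200000)
shared_b = next(x for x in b if x == 200000)
```

The table itself keeps every distinct int alive, so this only pays off when the same values recur across many sets.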

Stan.


