processing a Very Large file
DJTB
usenet at terabytemusic.cjb.net
Wed May 18 08:38:15 EDT 2005
Tim Peters wrote:
>
>> tuple_size = int(splitres[0])+1
>> path_tuple = tuple(splitres[1:tuple_size])
>> conflicts = Set(map(int,splitres[tuple_size:-1]))
>
> Do you really mean to throw away the last value on the line? That is,
> why is the slice here [tuple_size:-1] rather than [tuple_size:]?
>
Thanks, you saved me from another bug-hunting hell...
(In a previous test version, split returned a '\n' as the last item in the
list, which is why the slice stopped at -1...)
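
For the archive, a minimal sketch of the parsing with the slice fixed. The
sample line and the name parse_line are made up for illustration, and the
built-in set stands in for sets.Set so the snippet runs on a current Python.
It assumes each line is "count, that many path items, then the conflict ints":

```python
def parse_line(line):
    # split() with no argument discards all whitespace, including the
    # trailing newline, so no dummy last item needs to be sliced off.
    splitres = line.split()
    tuple_size = int(splitres[0]) + 1
    path_tuple = tuple(splitres[1:tuple_size])
    # Slice to the end of the list: [tuple_size:-1] would drop a real value.
    conflicts = set(map(int, splitres[tuple_size:]))
    return path_tuple, conflicts

# Hypothetical input line with a trailing space before the newline.
sample = "2 a b 10 20 30 \n"

# split(" ") keeps the newline as a separate last item, which is the
# situation the old [tuple_size:-1] slice was compensating for.
last_item_is_newline = sample.split(" ")[-1] == "\n"

path_tuple, conflicts = parse_line(sample)
```

With split(" ") the '\n' shows up as the last item, so the -1 was harmless
there; with plain split() it silently ate the last conflict.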
>
> You could manually do something akin to Python's "string interning" to
> store ints uniquely, like:
>
> int_table = {}
> def uniqueint(i):
>     return int_table.setdefault(i, i)
>
> Then, e.g.,
>
> >>> uniqueint(100 * 100) is uniqueint(100 * 100)
> True
> >>> uniqueint(int("12345")) is uniqueint(int("12345"))
> True
>
> Doing Set(map(uniqueint, etc)) would then feed truly shared int
> (and/or long) objects to the Set constructor.
>
I've implemented this and it does seem to work, thanks.
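
Roughly what it looks like, as a sketch only: conflict_set and the sample
fields are hypothetical names and data, and the built-in set is used in place
of sets.Set so it runs on a current Python. uniqueint is Tim's function
unchanged:

```python
int_table = {}

def uniqueint(i):
    # Return the first int object seen with this value, so equal values
    # parsed from different lines end up as one shared object.
    return int_table.setdefault(i, i)

def conflict_set(fields):
    # fields is a list of strings holding the conflict numbers (assumed).
    return set(map(uniqueint, map(int, fields)))

a = conflict_set(["12345", "67890"])
b = conflict_set(["12345", "99999"])

# Pull the element equal to 12345 out of each set and compare identity.
shared_a = next(x for x in a if x == 12345)
shared_b = next(x for x in b if x == 12345)
same_object = shared_a is shared_b
```

The table itself costs memory (one dict entry per distinct value), so this
only pays off when the same numbers recur across many sets.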
Stan.