base conversion

Robert Roy rjroy at takingcontrol.com
Sat Oct 7 10:18:44 EDT 2000


On Fri, 6 Oct 2000 15:38:13 -0400, "Joel Lucsy"
<jjlucsy at concentric.net> wrote:

>Yes, but standard compression routines like zlib or lzo don't work well on
>short sequences. I'd rather treat the string as a decimal rather than a
>string and simply (well, it seems simple) convert the base. Perhaps a
>different method would work, but my brain just doesn't want to descramble
>how to do it.

If you are going to leave the value as a decimal then why bother at
all? Just use it. The only benefit to representing the number in a
different base is if you are going to store it as a string. That said,
you are not going to gain any radical changes in string length. It is
the nature of the alphabets that we use. Reasonably you can expect to
represent an integer as a base 36 string using 0-9 plus a-z. This is
the largest base that C library routines such as strtol accept.
Anything beyond that is really tricky since you then have to rely on
case and punctuation characters. Note that to effect a length
reduction of half you would need a 100 character alphabet, since each
character would then have to stand for two decimal digits. Possible
but VERY unclean. You can however get a safe reduction of about 35%
by using base 36.
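
As a rough check on those figures: a number that takes N decimal digits
needs about N * log(10)/log(b) characters in base b. The sketch below
just evaluates that ratio for a few alphabet sizes. The helper name
length_ratio is made up for illustration, and the result is an estimate
that ignores rounding up to a whole character.

```python
import math

def length_ratio(base):
    """Estimated length in `base`, relative to the decimal length."""
    return math.log(10) / math.log(base)

# Alphabet sizes: hex, 0-9a-z, 0-9a-zA-Z, and the 100 needed to halve.
for base in (16, 36, 62, 100):
    saved = 1.0 - length_ratio(base)
    print(base, "saves roughly %.0f%%" % (saved * 100))
```

Going from 36 to 62 characters buys only a few more percent, which is
the diminishing return mentioned above.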


Here is some really quick and dirty code to show this. You can extend
the nums array even further but you will see that you rapidly hit the
point of diminishing returns.

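l = 5273030383518181846929222846464626075757
nums = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def makeBase(x, base=len(nums), table=nums):
    # Split off the least significant digit, then recurse on the rest.
    d, m = divmod(x, base)
    if d:
        return makeBase(d, base, table) + table[m]
    else:
        return table[m]

# Decimal form and its length, for comparison.
print(10, l, len(str(l)))

# Same number in a few larger bases, with the resulting string length.
for b in [16, 24, 36, 48, len(nums)]:
    s = makeBase(l, base=b)
    print(b, s, len(s))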
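
If the string is going to be stored, you will also want to get the
number back out of it. Here is a minimal sketch of the reverse
direction, assuming the same 62 character table with no duplicate
characters. The names DIGITS, to_base and from_base are only for this
example, and from_base does not check that every character is a valid
digit for the given base.

```python
DIGITS = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

def to_base(x, base=len(DIGITS), table=DIGITS):
    """Encode a non-negative integer (iterative form of the same idea)."""
    if x == 0:
        return table[0]
    out = []
    while x:
        x, m = divmod(x, base)
        out.append(table[m])
    # Digits were collected least significant first.
    return ''.join(reversed(out))

def from_base(s, base=len(DIGITS), table=DIGITS):
    """Decode a string produced by to_base back into an integer."""
    value = 0
    for ch in s:
        value = value * base + table.index(ch)
    return value

n = 5273030383518181846929222846464626075757
for b in (16, 36, len(DIGITS)):
    s = to_base(n, b)
    assert from_base(s, b) == n   # round trip must give the number back
    print(b, s, len(s))
```

For bases up to 36 with the lowercase alphabet, the built-in
int(s, base) does the same decoding.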




>
>----- Original Message -----
>From: "Allan M. Wind" <allanwind at mediaone.net>
>To: "Joel Lucsy" <jjlucsy at concentric.net>
>Cc: <python-list at python.org>
>Sent: Friday, October 06, 2000 3:28 PM
>Subject: Re: base conversion
>
>
>> On 2000-10-06 14:26:18, Joel Lucsy wrote:
>>
>> > Forgot to mention that I'd like the output to be considerably shorter
>> > than the original. I have already tried the base64 module.
>>
>> Perhaps, you want to compress the number rather than changing base?
>>
>>
>> /Allan
>> --
>> Allan M. Wind email: allanwind at mediaone.net
>> P.O. Box 2022 finger: awind at digit-safe.dyndns.org (GPG/PGP)
>> Woburn, MA 01888-0022 icq: 44214251
>> USA
>>
>
>



