Linear time baseconversion

Chris Angelico rosuav at gmail.com
Tue Jun 30 12:10:15 EDT 2015


On Wed, Jul 1, 2015 at 1:45 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> On Tue, Jun 30, 2015 at 9:40 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
>> On Tue, Jun 30, 2015 at 3:07 AM, Christian Gollwitzer <auriocus at gmx.de> wrote:
>>> Am 30.06.15 um 10:52 schrieb jonas.thornvall at gmail.com:
>>>>
>>>> It still bug out on very big numbers if base outside integer scope.
>>>> I am very keen on suggestions regarding the logic to make it faster.
>>>
>>>
>>> Concerning the algorithmic complexity, it can't be faster than square time
>>> in the number of digits N. Baseconversion needs to do a sequence of division
>>> operations, where every operation gves you one digit in the new base. The
>>> number of digits in the new base is proportional to the number of digits in
>>> the old base (the ratio is log b1/log b2). Therefore it will be O(N^2).
>>
>> I don't think that's true. Here's a linear hexadecimal to binary function:
>>
>>>>> def hextobin(value):
>> ...     digits = {'0': '0000', '1': '0001', '2': '0010', '3': '0011',
>> ...         '4': '0100', '5': '0101', '6': '0110', '7': '0111',
>> ...         '8': '1000', '9': '1001', 'A': '1010', 'B': '1011',
>> ...         'C': '1100', 'D': '1101', 'E': '1110', 'F': '1111'}
>> ...     return ''.join(digits[d.upper()] for d in value)
>> ...
>>>>> hextobin('3f')
>> '00111111'
>>
>> I believe this approach can be extended to arbitrary bases with some
>> effort, although for converting arbitrary base b1 to b2, you would
>> need up to b2 different mappings if b1 and b2 are relatively prime.
>
> Actually, I think you need up to b1 * b2 mappings, as you're
> effectively building a state machine with b1 * b2 states. The mappings
> can be pre-computed, however, so actually running the state machine
> would then just be a linear algorithm.

Arbitrary base conversion has to be stateful. You can take a shortcut
like this when the bases are related (eg binary, octal, hexadecimal),
but otherwise, you need the division. Consider what happens when you
convert the binary digit "1" to decimal, and then follow it with
varying numbers of zeros:

1, 2, 4, 8, 16, 32, 64,... 32768, 65536, 131072,... 1048576,... 1073741824,...

You can certainly do some useful analyses on the last digits (they'll
never be anything other than 2, 4, 8, 6, except for the special case
of 1 itself), but there's a lot of variation in the intermediate
digits.

When there's a simple ratio between the bases, it's fairly
straight-forward to convert a few digits at a time. Converting base
256 into base 64, for instance, can be done by taking three digits and
yielding four. But within that, you would still need a complete table
of all sixteen million possibilities, if you want to do the lookup
table. And that only works when there is that kind of relationship.

ChrisA



More information about the Python-list mailing list