Is my implementation of happy number OK

Thu Apr 30 13:37:10 EDT 2015

On Thu, Apr 30, 2015 at 9:59 AM, Cecil Westerhof <Cecil at decebal.nl> wrote:
> I implemented happy_number function:
>     _happy_set      = { '1' }
>     _unhappy_set    = set()
>
>     def happy_number(n):
>         """
>         Check if a number is a happy number
>         https://en.wikipedia.org/wiki/Happy_number
>         """
>
>         def create_current(n):
>             current_array = sorted([value for value in str(n) if value != '0'])

A generator expression in place of the list comprehension would avoid
creating an intermediate list, which would save memory on extremely
large numbers. Not sure without testing how it would compare on small
numbers. For reference, since size here refers to number of digits,
1E8 would be considered tiny.

Also for large numbers, there might be a smarter data structure to use
than just sorting the digits, which is O(n log n). Simplest would
probably just be a 9-element tuple containing the count of each
non-zero digit, which would be only O(n) to build.

>             return (current_array, ''.join(current_array))

You don't seem to be actually using current_array for anything, so why
not just return the string?

>         global _happy_set
>         global _unhappy_set
>
>         current_run         = set()
>         current_array, \
>             current_string  = create_current(n)

As a stylistic rule, avoid line continuations using \. Prefer using
unclosed parentheses for line continuations, e.g. the above could be
written as:

    (current_array,
     current_string) = create_current(n)

But really, the above is short enough it could just be written on a single line.

Also, I feel like the prefix "current_" is used so much here that it
loses its meaning. All these variable names would be better without
it.

>         if current_string in _happy_set:
>             return True
>         if current_string in _unhappy_set:
>             return False

Instead of two sets, consider using a single _happy_dict with values
True for happy and False for unhappy. Then the above becomes:

    if current_string in _happy_dict:
        return _happy_dict[current_string]

>         while True:
>             current_run.add(current_string)
>             current_array, \
>                 current_string = create_current(sum([int(value) ** 2
>                                                      for value in current_string]))

Since there are only 9 values that get squared, you could precompute
the squares to avoid squaring them over and over again.

>             if current_string in current_run | _unhappy_set:

In case the sets get large it might be better to avoid creating the
union and just do:

    if current_string in current_run or current_string in _unhappy_set:

>                 _unhappy_set |= current_run
>                 return False
>             if current_string in _happy_set:
>                 _happy_set |= current_run
>                 return True