Why Python don't accept 03 as a number?

Mon Dec 10 21:46:25 EST 2018

On Tue, Dec 11, 2018 at 1:12 PM Avi Gross <avigross at verizon.net> wrote:
> I would then "generate" all possible combinations of digits 0-9 of length N. There are an amazing number of ways to do that ranging from taking a range(10**N) and converting it to a string then a list of numeral characters then tossing out any that have any duplicates, to creating a recursive function that passes along what has been used so far to the next call. Since many of these problems only need ONE solution, the generator would only iterate as many times as needed and no giant lists of numbers or strings need be in memory at any time.
>

First off, you can slash that number significantly, as there are only
10! (not 10**10) possibilities. Secondly, you can eliminate zero from
consideration from any letter that starts a sequence other than a
single digit, as demonstrated previously. So there are 9! * x, where x
is the number of letters that could potentially be zero. That cuts
your total possibilities from 10,000,000,000 down to, at worst,
3,628,800.

> For each tuple returned  by the generator, you can iterate on the set of unique letters and use string functions to substitute the digits, or perhaps do this all at once. You would do this to all the items in what I call the left list as well as all the items on the right list. These would not be "numeric" so using int() on each item  would work EVEN with leading zeroes. Seems safe enough.
>

Use str.replace() and int(), job done.

> Yes, this is brute force. Using range(1,000,000) (no commas in real use)

Use underscores instead:

>>> range(1_000_000)
range(0, 1000000)

> would be a million iterations when there are six unique characters in the puzzle and as many as 10 billion if all ten are in use. If you use nested loops like I showed a while ago (albeit to make arbitrary ones for many sizes could require multiple functions or use of an eval on one built by hand) you can cut down the number of iterations as the nested loops count down with each one doing one less than the one before it. Same goes for the recursive function call method as it passes along what numerals  have already been used. There may already be a permutation function that does this efficiently in C.
>

Yeah, check itertools :)

> The real cost that dominates here is not memory, albeit garbage collection may be busy as it generates lots of temporary small bits of data. It is how the number of iterations grows.

Correct. And that's why a pure brute-force solution needs some
refinement. Algorithmic improvements almost always trump mechanical
improvements.

> I have looked at a somewhat related issue of how you generate all possible SCRABBLE words (or BOGGLE or ...) given specific starting letters.
> One way to make all possible combinations is along the lines above with many needed changes as there can (in theory) be as many as 26 unique letters (in English) and you can have multiple copies of some letters. If you allow other languages, the problem grows to the point where brute force is not realistic.
> And, ideally, you winnow down your choices by checking each word against a large dictionary.

Hmm, IMO that's the wrong way around. Instead, *start* with the
dictionary, and winnow down the possibilities to the words you have. A
decent word list will probably have 100K-1M words in it, which is a
small enough corpus to go through them all.

> Anyone know of one that has a decent selection and can be freely imported? I mean one word per line.
>

What OS are you on? For my usage, I just read from
/usr/share/dict/words and it's usually there. On Debian Linux, that's
provided by the dictionaries-common package, or the wbritish or
wamerican Ditch-side-specific packages. In fact, using that file makes
your code independent of language (although you may need to concern
yourself with different alphabets if you want to support
Russian/Greek, and anything involving "letters" doesn't really work
with Chinese), so I would strongly recommend it.

On Windows, where that path doesn't exist, and there probably aren't
standard dictionaries, you could download one of them from
wordlist.aspell.net or wordlist.sourceforge.net - they're freely
available, but you would have to go fetch them.

> I apologize for the length. The main question was whether eval is particularly expensive.

Well, yes it IS expensive... but the cost of it is less significant
than the algorithm and consequent number of iterations/attempts. Using
eval() on three million possibilities is going to be WAY cheaper than
a more efficient calculation technique used on ten billion.

Write your code with two main priorities:

1) Get your algorithm right
2) Express your algorithm cleanly in code.

Don't worry about performance until you've done the above two steps,
*measured*, and found that it's taking too long.

ChrisA