How to print all expressions that match a regular expression

hzhuo1 at gmail.com hzhuo1 at gmail.com
Sat Feb 6 23:16:09 EST 2010


>
> So it seems we both misunderstood the problem.
>
> I didn't read the top level article until now, and reading it, I can't make
> sense of it.
>

Seems that you should read the whole thing before making a post, or
else you cannot know what we are talking about.
Steven doesn't misunderstand me. We are talking about what I need, and
he tries to help.



>
> > "Given the function hashlib.sha256, enumerate all the possible inputs
> > that give the hexadecimal result
> > 0a2591aaf3340ad92faecbc5908e74d04b51ee5d2deee78f089f1607570e2e91."
>
> I tried some "parrot" variants but no dice. :-(
>
> [snip]
>

This is a hash collision problem. Nobody has proved that SHA-256 is
collision free, even not in the random oracle model, because people
always suppose that a random oracle exists, and make hash function its
substitution. That means it may be broken someday. And any provable
security based on random oracle model is not secure.


> > I'm suggesting that, in general, there's no way to tell in advance which
> > regexes will be easy and which will be hard, and even when they are easy,
> > the enumeration will often be infinite.

It is hard to tell in advance. However, we can add some timing limit
or counting limit, to make it an algorithm, which can halt. For
example, whenever the program outputs more than 1000000 expressions
that match the input regex, we can halt because that exceeds our
limit. But surely this is not efficient because of the post-decision.



>
> Essentially, any regexp that includes '+' or '*' (directly or via e.g. notation
> that denotes "digit sequence") yields an infinite number of strings.

Infinity is really relative, not absolute. It is relative to the
computing speed. For example, the regex '^[0|1]{2048}$' is rather
simple and doesn't contain '+' or '$', but trying to output all
expressions that match it has a complexity of 2^2048. If we can do
that, then we can break RSA-2048.
We must face the reality .

Zhuo




More information about the Python-list mailing list