Using filepath method to identify an .html page

Ferrous Cranus nikos.gr33k at gmail.com
Wed Jan 23 02:25:45 EST 2013


On Tuesday, 22 January 2013 9:16:34 PM UTC+2, Peter Otten wrote:
> Ferrous Cranus wrote:
> 
> > On Tuesday, 22 January 2013 6:11:20 PM UTC+2, Chris Angelico wrote:
> 
> >> all of it. You are asking something that is fundamentally
> >> impossible[1]. There simply are not enough numbers to go around.
> 
> > Fundamentally impossible?
> > 
> > Well....
> > 
> > OK: How about this in Perl:
> > 
> > $ cat testMD5.pl
> > use strict;
> > 
> > foreach my $url(qw@ /index.html /about/time.html @){
> >         hashit($url);
> > }
> > 
> > sub hashit {
> >    my $url=shift;
> >    my @ltrs=split(//,$url);
> >    my $hash = 0;
> > 
> >    foreach my $ltr(@ltrs){
> >         $hash = ( $hash + ord($ltr)) %10000;
> >    }
> >    printf "%s: %0.4d\n",$url,$hash
> > 
> > }
> > 
> > which yields:
> > $ perl testMD5.pl
> > /index.html: 1066
> > /about/time.html: 1547
> 
> $ cat clashes.pl
> use strict;
> 
> foreach my $url(qw@
>     /public/fails.html
>     /large/cannot.html
>     /number/being.html
>     /hope/already.html
>     /being/really.html
>     /index/breath.html
>     /can/although.html
> @){
>         hashit($url);
> }
> 
> sub hashit {
>    my $url=shift;
>    my @ltrs=split(//,$url);
>    my $hash = 0;
> 
>    foreach my $ltr(@ltrs){
>         $hash = ( $hash + ord($ltr)) %10000;
>    }
>    printf "%s: %0.4d\n",$url,$hash
> 
> }
> $ perl clashes.pl
> /public/fails.html: 1743
> /large/cannot.html: 1743
> /number/being.html: 1743
> /hope/already.html: 1743
> /being/really.html: 1743
> /index/breath.html: 1743
> /can/although.html: 1743
> 
> Hm, I must be holding it wrong...

my @i = split(//,$url); # put each letter in its own bin
my $j=0;   # Initialize our
my $k=1;   # hashing increment values
my @m=();  # workspace
foreach my $n(@i){
       my $q=ord($n);           # ASCII code for the character
       $k = ($k + $j) % 10000;  # grow the position weight: 1, 2, 4, 8, ...
       $q *= $k;                # weight the character by its position (a plain
                                # addition here would leave the total order-independent)
       $j = $k;                 # store that.
       push @m,$q;              # save the weighted value
}

my $hashval=0;  # initialize our hash value
# Generate that
$hashval = ($hashval + $_) % 10000 foreach @m;


With that method, ABC.html and CBA.html now get different values, because each letter's value is adjusted by an amount that depends on its position and grows from left to right.
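
Since this is the Python list, here is a rough Python sketch of the two approaches side by side. It is only an illustration, not a drop-in replacement: the function names `additive_hash` and `weighted_hash` are made up for this example, and the weighted version assumes that each character code is multiplied by a weight that doubles at every position (1, 2, 4, ...), reduced modulo 10000 along the way.

```python
def additive_hash(url: str) -> int:
    """Sum of character codes modulo 10000, as in the quoted hashit()."""
    total = 0
    for ch in url:
        total = (total + ord(ch)) % 10000
    return total


def weighted_hash(url: str) -> int:
    """Position-weighted variant (assumption: weights 1, 2, 4, ... mod 10000)."""
    total = 0
    weight = 1
    for ch in url:
        # Multiplying by the weight makes the result depend on character order.
        total = (total + ord(ch) * weight) % 10000
        weight = (weight * 2) % 10000
    return total


if __name__ == "__main__":
    for url in ("ABC.html", "CBA.html"):
        print(url, additive_hash(url), weighted_hash(url))
```

The additive version gives the same number for any rearrangement of the same characters, while the weighted one separates ABC.html from CBA.html. Either way there are only 10000 possible values, so different paths can still end up with the same number.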


