making a typing speed tester

Wed Nov 14 14:05:40 EST 2007

On Nov 14, 11:56 am, tavspamno... at googlemail.com wrote:
> Referred here from the tutor list.
>
> > I'm trying to write a program to test someone's typing speed and show
> > them their mistakes. However I'm getting weird results when looking
> > for the differences in longer (than 100 chars) strings:
>
> > import difflib
>
> > # a tape measure string (just makes it easier to locate a given index)
> > a = ('1-3-5-7-9-12-15-18-21-24-27-30-33-36-39-42-45-48-51-54-57-60-63-66-69'
> >      '-72-75-78-81-84-87-90-93-96-99-103-107-111-115-119-123-127-131-135-139'
> >      '-143-147-151-155-159-163-167-171-175-179-183-187-191-195--200')
>
> > # now with a few mistakes
> > b = ('1-3-5-7-'
> >      'l-12-15-18-21-24-27-30-33-36-39o42-45-48-51-54-57-60-63-66-69-72-75-78'
> >      '-81-84-8k-90-93-96-9l-103-107-111-115-119-12b-1v7-131-135-139-143-147-'
> >      '151-m55-159-163-167-a71-175j179-183-187-191-195--200')
>
> > s = difflib.SequenceMatcher(None, a, b)
> > ms = s.get_matching_blocks()
>
> > print ms
>
> >>>> [(0, 0, 8), (200, 200, 0)]
>
> > Have I made a mistake or is this function designed to give up when the
> > input strings get too long? If so what could I use instead to compute
> > the mistakes in a typed text?
> ---------- Forwarded message ----------
> From: Evert Rol
>
> Hi Tom,
>
> Ok, I wasn't on the list last year, but I was a few days ago, so
> persistence pays off; partly, as I don't have a full answer.
>
> I got curious and looked at the source of difflib. There's a method
> __chain_b() which sets up the b2j variable, which contains the
> occurrences of characters in string b. So cutting b to 199
> characters, it looks like this:
>     b2j= 19 {'a': [168], 'b': [122], 'm': [152], 'k': [86], 'v':
> [125], '-': [1, 3, 5, 7, 9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 42,
> 45, 48, 51, 54, 57, 60, 63, 66, 69, 72, 75, 78, 81, 84, 87, 90, 93,
> 96, 99, 103, 107, 111, 115, 119, 123, 127, 131, 135, 139, 143, 147,
> 151, 155, 159, 163, 167, 171, 179, 183, 187, 191, 195, 196], 'l': [8,
> 98], 'o': [39], 'j': [175], '1': [0, 10, 13, 16, 20, 50, 80, 100,
> 104, 108, 109, 110, 112, 113, 116, 117, 120, 124, 128, 130, 132, 136,
> 140, 144, 148, 150, 156, 160, 164, 170, 172, 176, 180, 184, 188, 190,
> 192], '0': [29, 59, 89, 101, 105, 198], '3': [2, 28, 31, 32, 34, 37,
> 62, 92, 102, 129, 133, 137, 142, 162, 182], '2': [11, 19, 22, 25, 41,
> 71, 121, 197], '5': [4, 14, 44, 49, 52, 55, 74, 114, 134, 149, 153,
> 154, 157, 174, 194], '4': [23, 40, 43, 46, 53, 83, 141, 145], '7':
> [6, 26, 56, 70, 73, 76, 106, 126, 146, 166, 169, 173, 177, 186], '6':
> [35, 58, 61, 64, 65, 67, 95, 161, 165], '9': [38, 68, 88, 91, 94, 97,
> 118, 138, 158, 178, 189, 193], '8': [17, 47, 77, 79, 82, 85, 181,
> 185]}
>
> This little detour is because of how b2j is built. Here's a part from
> the comments of __chain_b():
>
>     # Before the tricks described here, __chain_b was by far the most
>     # time-consuming routine in the whole module!  If anyone sees
>     # Jim Roskind, thank him again for profile.py -- I never would
>     # have guessed that.
>
> And the part of the actual code reads:
>          b = self.b
>          n = len(b)
>          self.b2j = b2j = {}
>          populardict = {}
>          for i, elt in enumerate(b):
>              if elt in b2j:
>                  indices = b2j[elt]
>                  if n >= 200 and len(indices) * 100 > n:     # <--- !!
>                      populardict[elt] = 1
>                      del indices[:]
>                  else:
>                      indices.append(i)
>              else:
>                  b2j[elt] = [i]
>
> So you're right: there is a cutoff at the (somewhat arbitrary) limit of
> 200 characters. How exactly that works, I don't know (it needs more
> delving into the code), though it looks like an element also needs to
> have a lot of indices (len(indices) * 100 > n); I guess that in your
> strings this is caused by the dashes, '1's and '0's (that's why I
> printed the b2j dictionary).
> If you feel safe enough and are on a fast platform, you can probably
> raise that limit (or even make it an optional variable somewhere in
> the code, which I would think is generally better).
> I'm not sure who the author of the module is (it isn't listed in the
> file itself), but perhaps you can find out and email him/her, to see
> what can be altered.
>
> Hope that helps.
>
>    Evert
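
For what it's worth, here is a minimal sketch of the check Evert quotes, pulled out into a standalone function so its effect is easy to see. The name popular_elements is made up for illustration and is not part of difflib; the loop only copies the quoted bookkeeping and reports which elements would be flagged as popular.

```python
def popular_elements(b):
    """Return the elements the quoted loop would flag as 'popular'.

    Hypothetical helper, not part of difflib: it only mirrors the
    bookkeeping of the quoted loop.
    """
    n = len(b)
    b2j = {}
    popular = set()
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            # Same test as the line marked "<--- !!" in the quoted code.
            if n >= 200 and len(indices) * 100 > n:
                popular.add(elt)
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]
    return popular


short = 'ab' * 99 + 'a'   # 199 characters: below the cutoff
long_ = 'ab' * 100        # 200 characters: the cutoff applies

print(sorted(popular_elements(short)))  # -> []
print(sorted(popular_elements(long_)))  # -> ['a', 'b']
```

With exactly 200 characters, anything that occurs four or more times gets flagged, which fits Evert's guess about the dashes and the digits.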

I would use the time module to time the user. Then you should be
able to compare the original string with the string the user typed
using cmp, though that only tells you whether the two are equal, not
where they differ.

<code>
# untested
import time

start = time.time()
print 'some complicated long string'

# you should use a GUI toolkit's textbox rather than
# using a variable
user_string = raw_input('Please type the string above as quickly and '
                        'accurately as you can:\n\n')
end = time.time()
print 'amount of time to complete: %s seconds' % (end - start)

# do the comparison here
# which I am not sure how to do right now
</code>
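
To turn the elapsed time into a speed figure, one option is words per minute. The sketch below assumes the common convention of counting five characters as one word; the function name words_per_minute and that convention are my assumptions, not something from the posts above.

```python
def words_per_minute(typed, seconds, chars_per_word=5):
    """Gross typing speed, assuming chars_per_word characters per word."""
    if seconds <= 0:
        raise ValueError('elapsed time must be positive')
    words = len(typed) / float(chars_per_word)
    return words / (seconds / 60.0)


# 250 characters typed in 60 seconds is 50 "words" in one minute.
print(words_per_minute('x' * 250, 60.0))  # -> 50.0
```

This is a gross figure: it does not subtract anything for mistakes, so it would need to be combined with whatever comparison you settle on.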

See the following for ideas on comparing similar strings/iterables:

http://www.velocityreviews.com/forums/t345107-comparing-2-similar-strings.html
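
As a starting point for the comparison step, here is a rough sketch that uses difflib.SequenceMatcher.get_opcodes() to list where the typed text departs from the target. It assumes Python 2.7.1 or 3.2 and later, where SequenceMatcher accepts autojunk=False to switch off the popularity cutoff discussed in the quoted message; find_mistakes is a made-up helper name.

```python
import difflib


def find_mistakes(target, typed):
    """List (tag, target_span, expected, got) for each non-matching region.

    Hypothetical helper. autojunk=False needs Python 2.7.1+ or 3.2+.
    """
    matcher = difflib.SequenceMatcher(None, target, typed, autojunk=False)
    mistakes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            # tag is 'replace', 'delete' or 'insert'
            mistakes.append((tag, (i1, i2), target[i1:i2], typed[j1:j2]))
    return mistakes


target = 'the quick brown fox jumps over the lazy dog'
typed = 'the quick brwon fox jumps ovr the lazy dog'

for mistake in find_mistakes(target, typed):
    print(mistake)
```

Because nothing is dropped as popular, strings of 200 characters or more should be compared in full rather than stopping at the first mismatch, at the cost of some speed on long inputs.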

Mike