Tracking Users By IP Address
Michael Foord
fuzzyman at gmail.com
Fri Oct 8 10:23:39 EDT 2004
[thanks but snip.. ;-) ]
> One alternative: (pseudocode)
>
>     Receive request
>     If no-cookie-received:
>         Set Cookie: "NEWUSER"
>     else:
>         if cookie-received == "NEWUSER":
>             # We know they can send us cookies back
>             id = gen-id()
>             Set Cookie: id
>
Yep.. this I understand and will try.
Thanks
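
A rough Python sketch of the handshake quoted above, framework-agnostic. It is only an illustration under assumptions: the function name `next_cookie` is hypothetical, and `uuid4` stands in for the unspecified `gen-id()`.

```python
import uuid

NEWUSER = "NEWUSER"


def gen_id():
    # Assumption: any hard-to-guess unique string will do as a user id.
    return uuid.uuid4().hex


def next_cookie(received):
    """Return the cookie value to set on the response, or None to set nothing.

    `received` is the cookie value sent by the browser, or None if absent.
    """
    if received is None:
        # First contact: hand out a probe cookie.
        return NEWUSER
    if received == NEWUSER:
        # The probe came back, so this client returns cookies: issue a real id.
        return gen_id()
    # The client already carries an id; leave it alone.
    return None
```

A request is then logged with whatever cookie it arrived with, so only clients that have completed both steps show up under a unique id.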
> Then just log requests with the received cookie. Trackable users will have
> a unique id, whether their IP changes, they share a system, or they sit
> behind NATing firewalls, etc. This allows you to track unique users that
> are trackable using cookies. If you have a particularly large number of
> users accessing your site, you can tie in sampling (perhaps something like
> density biased sampling) in there as well, something like this:
>
>     new-cookie = None
>     If no-cookie-received:
>         new-cookie = "NEWUSER"
>     else:
>         if cookie-received == "NEWUSER":
>             # We know they can send us cookies back
>             id = gen-id()
>             new-cookie = id
>
>     if add-to-sample-set(request):
>         tag = "SAMPLE"
>         new-cookie = current-cookie or new-cookie
>     else:
>         tag = "NOSAMPLE"
>
>     if new-cookie:
>         Set Cookie: tag new-cookie
>
Sorry... :-( I don't get it.
What is add-to-sample-set(request) doing? Is it simply choosing a
proportion of our users to sample?
If this is only a 'do it if you have too many users' kind of thing, then
unfortunately that won't be a problem for me!!
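
The quoted post does not define add-to-sample-set, so the following is only one possible reading: pick a fixed proportion of users by hashing some per-user key. This is plain uniform hash-based sampling, not the density biased sampling mentioned above, and the `key` and `rate` parameters are assumptions made for the sketch.

```python
import hashlib


def add_to_sample_set(key, rate=0.1):
    """Return True if `key` falls in the sampled fraction `rate` (0.0 to 1.0).

    Assumption: `key` is some per-user string, e.g. the cookie id or the IP.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to an integer bucket in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Integer comparison avoids float rounding at the edges of the range.
    return bucket < int(rate * 2**64)
```

Because the decision depends only on the key, the same user always gets the same answer and nothing has to be stored per request.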
> (Or something like that IYSWIM - ie get the user population to indicate if
> they're being sampled - again, this allows your users to easily opt out,
As above... I don't get it, so I don't see how it achieves this?
> and also means the memory/etc required to determine whether to track the
> user or not isn't dependent on the number of requests your site gets -
> meaning that you can keep analysis costs for your site under control. If
> you've only got a small site this probably doesn't matter to you, but
> worth bearing in mind).
>
> The interesting thing about this from my perspective is that if you do
> take a cookie approach like this, it actually allows you to figure out how
> much error there actually is between IP and cookie - rather than just guess.
One last question. You didn't explicitly say this, but I was thinking
of doing it anyway. Are you suggesting storing USERID *and* IP
address, and comparing the results of analysing by IP with analysing by
cookie? Sounds worthwhile...
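
A minimal sketch of that comparison, assuming each logged request is reduced to an (ip, user_id) pair. The function name and the keys of the returned dict are hypothetical, chosen just for illustration.

```python
from collections import defaultdict


def compare_ip_and_cookie(records):
    """Summarise how counting by IP differs from counting by cookie id.

    `records` is an iterable of (ip, user_id) pairs taken from requests
    that arrived with a real id cookie (not "NEWUSER").
    """
    ids_per_ip = defaultdict(set)
    ips_per_id = defaultdict(set)
    for ip, user_id in records:
        ids_per_ip[ip].add(user_id)
        ips_per_id[user_id].add(ip)
    return {
        "unique_ips": len(ids_per_ip),
        "unique_ids": len(ips_per_id),
        # IPs that hide several users (NAT, shared machines, proxies).
        "shared_ips": sum(1 for ids in ids_per_ip.values() if len(ids) > 1),
        # Users seen from several IPs (dial-up, DHCP, roaming).
        "roaming_ids": sum(1 for ips in ips_per_id.values() if len(ips) > 1),
    }
```

The gap between unique_ips and unique_ids, together with the shared and roaming counts, gives a measured rather than guessed error for IP-only tracking.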
Thanks for your help - very interesting.
Regards,
Fuzzy
> The other nicety is it allows your users to opt-out very easily - since they
> can either switch off cookies, or you can send them a "NOSAMPLE" cookie.
>
> Also, at present comments in this thread revolve around "this isn't
> reliable because of x,y and z". If you take this sort of approach you
> can find out the margin of error and then decide whether you're happy
> with it or not. Also as you can see from above this doesn't really have
> to be a very complex operation (unless you're in a high volume scenario
> with lots of distinct users and need to add in the sampling aspect).
>
> Best Regards,
>
>
> Michael