Finding People's Names in Files

Chris Mellon arkanes at gmail.com
Thu Oct 11 15:43:12 EDT 2007


On 10/11/07, byte8bits at gmail.com <byte8bits at gmail.com> wrote:
> On Oct 11, 12:49 pm, Matimus <mccre... at gmail.com> wrote:
> > On Oct 11, 9:11 am, brad <byte8b... at gmail.com> wrote:
> >
> >
> >
> > > cokofree... at gmail.com wrote:
> > > > However...how can you know it is a name...
> >
> > > OK, I admitted in my first post that it was a crazy question, but if one
> > > could find an answer, one would be onto something. Maybe it's not a 100%
> > > answerable question, but I would guess that it is an 80% answerable
> > > question... I just don't know how... yet :)
> >
> > > Besides admitting that it's a crazy question, I should stop and explain
> > > how it would be useful to me at least. Is a credit card number itself
> > > valuable? I would think not. One can easily regex-match and Luhn-check
> > > credit card numbers located in files with a great degree of accuracy, but a
> > > number without a name is not very useful to me. So, if one could
> > > associate names to Luhn-checked numbers automatically, then one would be
> > > onto something. Or at least say, "hey, this file has Luhn-validated CCs
> > > *AND* it seems to have people's names in it as well." Now then, I'd have
> > > less to review or perhaps as much as I have now, but I could push the
> > > files with numbers and names to the top of the list so that they would
> > > be reviewed first.
> >
> > > Brad
> >
> > What the hell are you doing? Your post sounds to me like you have a
> > huge amount of stolen, or at the very least misappropriated, data. Now
> > you want to search it for credit card numbers and names so that you
> > can use them.
> >
> > I am not cool with this! This is a public forum about a programming
> > language. What makes you think that anybody in this forum will be cool
> > with that? Perhaps you aren't doing anything illegal, but it sure is
> > coming off that way. If you are doing something illegal I hope you get
> > caught.
> >
> > At the very least, you might want to clarify why you are looking for
> > such capability so that you don't get effectively black-listed (well,
> > by me at least).
> >
> > Matt
>
> Go have a beer and calm down a bit :) It's a legitimate purpose,
> although it could be (and probably is being) used by bad guys right now.
> My intent, as you can see from the links below, is to catch it before
> the bad guys do.
>
> http://filebox.vt.edu/users/rtilley/public/find_ccns/
> http://filebox.vt.edu/users/rtilley/public/find_ssns/
>
> Brad
>

In case you're doing this for PCI validation, be aware that just the
CC number is considered sensitive and you'd get some false negatives
if you filter on anything except that.

Random strings that pass CC checksums are really quite rare, so false
positives from that check alone are unlikely to be a problem. Unless I
had deployed this and seen a significant false-positive rate, I
wouldn't risk the false negatives, personally.
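
For reference, here is a minimal sketch of the kind of regex-plus-Luhn
scan discussed in this thread. It is not the code behind the links
quoted above. The names CANDIDATE, luhn_ok and find_card_numbers are
hypothetical, and the pattern (13 to 16 digits, optionally separated
by single spaces or dashes) is an assumption that may need adjusting
for the card formats you actually care about.

```python
import re

# Assumption: a candidate is 13-16 digits, optionally separated by
# single spaces or dashes, and not embedded in a longer digit run.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,15}(?!\d)")


def luhn_ok(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return Luhn-valid candidate numbers found in text, digits only."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_ok(digits):
            hits.append(digits)
    return hits
```

This flags a file on the number alone and does no name matching, in
line with the advice above about not filtering on anything else.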


