Reading a file into a data structure....

Troy S tdsimpson at gmail.com
Sat Oct 15 21:48:57 EDT 2011


Chris,
Thanks for the help.
I am using the Powerball numbers from this text file, downloaded from the site:
http://www.powerball.com/powerball/winnums-text.txt
The first row is the header/fieldnames and the file starts off like this:

Draw Date   WB1 WB2 WB3 WB4 WB5 PB  PP
10/12/2011  43  10  12  23  47  18  3
10/08/2011  35  03  37  27  45  31  5
10/05/2011  46  07  43  54  20  17  4
10/01/2011  27  43  12  23  01  31  3
09/28/2011  41  51  30  50  53  08  2
09/24/2011  27  12  03  04  44  26  5
09/21/2011  47  52  55  48  12  13  4

The digit test was there only to skip the first (header) row.
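
To make that concrete, here is a minimal sketch of the digit test run against the first few rows shown above. The in-memory `sample` stands in for the real file, and `rows` is just a name I picked for illustration.

```python
import io

# Sample rows copied from the top of the file shown above;
# io.StringIO stands in for the opened file.
sample = io.StringIO(
    "Draw Date   WB1 WB2 WB3 WB4 WB5 PB  PP\n"
    "10/12/2011  43  10  12  23  47  18  3\n"
    "10/08/2011  35  03  37  27  45  31  5\n"
)

rows = []
for line in sample:
    # The header starts with "D", so it fails the digit test and is skipped.
    # line[:1] (rather than line[0]) also copes with an empty string.
    if not line[:1].isdigit():
        continue
    rows.append(line.split())

print(len(rows))  # number of data rows kept
```

Any other line that does not begin with a digit (a blank line, for example) would be skipped the same way.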

I'm still dissecting your Python code to better understand the use of
collections, namedtuple, etc.
I have not found many examples or descriptions of these yet, and I
don't understand them very well.  Do you know of a reference that
breaks this stuff down better for me?
The couple of books that I have on Python do not go into the
collections module in much depth.
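
For reference, this is my current reading of the two pieces from the collections module, as a toy sketch. The ticket values are the first row of the file, the date is left as a plain string to keep things short, and `by_powerball` is my own name, so treat this as an assumption-laden illustration and not the code from your reply.

```python
from collections import namedtuple, defaultdict

# namedtuple builds a small tuple subclass whose fields have names.
Ticket = namedtuple("Ticket", ["white_balls", "powerball", "date"])
t = Ticket(frozenset([10, 12, 23, 43, 47]), 18, "10/12/2011")

print(t.powerball)  # access a field by name
print(t[1])         # the same field by position, like a plain tuple

# defaultdict(set) creates an empty set the first time a key is looked up,
# so there is no need to check whether the key exists before adding to it.
by_powerball = defaultdict(set)
by_powerball[t.powerball].add(t)

print(len(by_powerball[18]))  # one ticket stored under powerball 18
print(len(by_powerball[99]))  # missing key: an empty set is created on access
```

The ticket can go into a set because every field is hashable, which seems to be why the white balls are stored as a frozenset instead of a list.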

Thanks,

On Sat, Oct 15, 2011 at 12:47 AM, Chris Rebert <clp2 at rebertia.com> wrote:
> On Fri, Oct 14, 2011 at 7:59 PM, MrPink <tdsimpson at gmail.com> wrote:
>> This is what I have been able to accomplish:
>>
>> def isInt(s):
>>    try:
>>        i = int(s)
>>        return True
>>    except ValueError:
>>        return False
>>
>> f = open("powerball.txt", "r")
>> lines = f.readlines()
>> f.close()
>>
>> dDrawings = {}
>> for line in lines:
>>    if isInt(line[0]):
>>        t = line.split()
>>        d = t[0]
>>        month,day,year = t[0].split("/")
>>        i = int(year + month + day)
>>        wb = t[1:6]
>>        wb.sort()
>>        pb = t[6]
>>        r = {'d':d,'wb':wb,'pb':pb}
>>        dDrawings[i] = r
>>
>> The dictionary dDrawings contains records like this:
>> dDrawings[19971101]
>> {'pb': '20', 'd': '11/01/1997', 'wb': ['22', '25', '28', '33', '37']}
>>
>> I am now able to search for tickets in a date range.
>> keys = dDrawings.keys()
>> b = [key for key in keys if 20110909 <= key <= 20111212]
>>
>> How would I search for matching wb (White Balls) in the drawings?
>>
>> Is there a better way to organize the data so that it will be flexible
>> enough for different types of searches?
>> Search by date range, search by pb, search by wb matches, etc.
>>
>> I hope this all makes sense.
>
> from datetime import datetime
> from collections import namedtuple, defaultdict
> # for efficient searching by date: import bisect
>
> DATE_FORMAT = "%m/%d/%Y"
> Ticket = namedtuple('Ticket', "white_balls powerball date".split())
>
> powerball2ticket = defaultdict(set)
> whiteball2ticket = defaultdict(set)
> tickets_by_date = []
>
> with open("powerball.txt", "r") as f:
>    for line in f:
>        if not line[0].isdigit():
>            # what are these other lines anyway?
>            continue # skip such lines
>
>        fields = line.split()
>
>        date = datetime.strptime(fields[0], DATE_FORMAT).date()
>        white_balls = frozenset(int(num_str) for num_str in fields[1:6])
>        powerball = int(fields[6])
>        ticket = Ticket(white_balls, powerball, date)
>
>        powerball2ticket[powerball].add(ticket)
>        for ball in white_balls:
>            whiteball2ticket[ball].add(ticket)
>        tickets_by_date.append(ticket)
>
> tickets_by_date.sort(key=lambda ticket: ticket.date)
>
> print(powerball2ticket[7]) # all tickets with a 7 powerball
> print(whiteball2ticket[3]) # all tickets with a non-power 3 ball
>
>
> Cheers,
> Chris
> --
> http://rebertia.com
>
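
Following the bisect hint in the quoted code, a date-range lookup and a white-ball match might look like the sketch below. It assumes the list is already sorted by date, as in the quoted code. The helper names `tickets_between` and `white_matches`, the parallel `dates` list, and the numbers in `my_picks` are all hypothetical and do not appear in the original.

```python
import bisect
from collections import namedtuple
from datetime import date

Ticket = namedtuple("Ticket", "white_balls powerball date".split())

# Three drawings taken from the sample rows, already sorted by date.
tickets_by_date = [
    Ticket(frozenset([27, 43, 12, 23, 1]), 31, date(2011, 10, 1)),
    Ticket(frozenset([46, 7, 43, 54, 20]), 17, date(2011, 10, 5)),
    Ticket(frozenset([43, 10, 12, 23, 47]), 18, date(2011, 10, 12)),
]
# Parallel list of sort keys, so bisect can search on dates alone.
dates = [t.date for t in tickets_by_date]

def tickets_between(start, end):
    """Return the drawings with start <= date <= end (hypothetical helper)."""
    lo = bisect.bisect_left(dates, start)
    hi = bisect.bisect_right(dates, end)
    return tickets_by_date[lo:hi]

def white_matches(ticket, picks):
    """Return the white balls the drawing shares with picks (hypothetical helper)."""
    return ticket.white_balls & frozenset(picks)

in_range = tickets_between(date(2011, 10, 2), date(2011, 10, 12))
my_picks = [12, 23, 43, 50, 51]  # made-up numbers for illustration
for t in in_range:
    print(t.date, sorted(white_matches(t, my_picks)))
```

Because the white balls are frozensets, the match is a plain set intersection, and bisect finds the two ends of the range without scanning the whole list.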



-- 
Troy S


