[Tutor] Collating date data from a csv file

Peter Otten __peter__ at web.de
Thu May 9 03:53:47 EDT 2019


Cameron Simpson wrote:

> On 08May2019 21:04, Dave Hill <dave at the-hills.org.uk> wrote:
>>I have a csv file which details the results of equipment tests, I
>>carry out PAT testing as a volunteer at a heritage railway in N.
>>Wales. I want to extract how many items were tested on each test day.
>>So far I have generated a List of test dates, but I am now stalled at
>>how to efficiently count numbers tested on each date.
>>
>>Can I have a list of tuples, where one item is the date and the second
>>the count?
> 
> Not as such, because you can't modify a tuple (so you can't update the
> count part). But you could use a 2 element list.
> 
>>or is there a better construct?
> 
> Oh definitely. The easiest thing would be a defaultdict(int). Example:
> 
>   from collections import defaultdict
>   ...
>   by_date = defaultdict(int)
>   for row in csvdata:
>     timestamp = row[1]  # based on your example data
>     # get the date from the timestamp
>     date = ...
>     by_date[date] += 1
> 
> A defaultdict is a dict which magically creates missing elements when
> they are accessed, using a factory function you supply. Here we're using
> "int" as that factory, as int() returns zero.

While this is easily adaptable if you want to keep more data...

by_date = defaultdict(list)  # rows grouped by date
for row in csvdata:
    date = ...
    by_date[date].append(row)

... for the simple case there is also collections.Counter:

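import collections
import datetime

def get_date(row):
    # assumes row[1] holds a Unix timestamp (seconds since the epoch)
    return datetime.datetime.fromtimestamp(int(row[1])).date()

by_date = collections.Counter(map(get_date, csvdata))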

# (date, freq) pairs ordered by frequency:
print(by_date.most_common())
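
If you would rather see the counts in calendar order than by frequency,
something along these lines might do. This is only a sketch: the sample
rows, the column layout (item id, timestamp, result) and the choice of UTC
are my assumptions, not taken from your file, so adjust get_date() to
whatever your timestamps really look like.

```python
import collections
import csv
import datetime
import io

# Hypothetical sample data: item id, Unix timestamp (seconds), result.
SAMPLE = """\
1001,1557300000,PASS
1002,1557303600,PASS
1003,1557390000,FAIL
1004,1557393600,PASS
1005,1557397200,PASS
"""

def get_date(row):
    # Assumption: column 1 is seconds since the epoch. UTC is used so the
    # result does not depend on the local time zone of the machine.
    ts = int(row[1])
    return datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc).date()

rows = list(csv.reader(io.StringIO(SAMPLE)))
by_date = collections.Counter(map(get_date, rows))

# (date, count) pairs in chronological order; date objects sort naturally.
for day, count in sorted(by_date.items()):
    print(day, count)
```

With a real file you would replace io.StringIO(SAMPLE) with the open file
object and skip the header row if there is one.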

> 
> I presume you've got the timestamp => date conversion sorted?
> 
> Cheers,
> Cameron Simpson <cs at cskk.id.au>



