[Tutor] Collating date data from a csv file
Peter Otten
__peter__ at web.de
Thu May 9 03:53:47 EDT 2019
Cameron Simpson wrote:
> On 08May2019 21:04, Dave Hill <dave at the-hills.org.uk> wrote:
>>I have a csv file which details the results of equipment tests, I
>>carry out PAT testing as a volunteer at a heritage railway in N.
>>Wales. I want to extract how many items were tested on each test day.
>>So far I have generated a List of test dates, but I am now stalled at
>>how to efficiently count numbers tested on each date.
>>
>>Can I have a list of tuples, where one item is the date and the second
>>the count?
>
> Not as such, because you can't modify a tuple (so you can't update the
> count part). But you could use a 2 element list.
>
>>or is there a better construct?
>
> Oh definitely. The easiest thing would be a defaultdict(int). Example:
>
> from collections import defaultdict
> ...
> by_date = defaultdict(int)
> for row in csvdata:
>     timestamp = row[1]  # based on your example data
>     # get the date from the timestamp
>     date = ...
>     by_date[date] += 1
>
> A defaultdict is a dict which magically creates missing elements when
> they are accessed, using a factory function you supply. Here we're using
> "int" as that factory, as int() returns zero.
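To make the factory behaviour concrete, here is a minimal sketch. The date strings are made up for illustration and are not taken from the original csv file:

```python
from collections import defaultdict

# Hypothetical sample dates; the real ones would come from the csv rows.
dates = ["2019-05-01", "2019-05-01", "2019-05-08"]

by_date = defaultdict(int)
for date in dates:
    # A missing key is first created with int() == 0, then incremented.
    by_date[date] += 1

print(dict(by_date))
```

With a plain dict the first += for each date would raise a KeyError instead.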
While this is easily adaptable if you want to keep more data...
by_date = defaultdict(list)  # rows grouped by date
for row in csvdata:
    date = ...
    by_date[date].append(row)
... for the simple case there is also collections.Counter:
import collections
import datetime

def get_date(row):
    return datetime.datetime.fromtimestamp(int(row[1])).date()

by_date = collections.Counter(map(get_date, csvdata))
# (date, freq) pairs ordered by frequency:
print(by_date.most_common())
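
If you would rather see the counts in date order than by frequency, something along these lines should work. This is only a sketch: the three sample rows are invented, and it assumes, as the snippet above does, that column 1 holds a Unix timestamp in seconds.

```python
import collections
import csv
import datetime
import io

# Invented sample standing in for the real file; column 1 is assumed to
# hold a Unix timestamp in seconds.
sample = io.StringIO(
    "kettle,1557302400,pass\n"
    "drill,1557302460,pass\n"
    "lamp,1557907200,fail\n"
)

def get_date(row):
    # fromtimestamp() converts using the local time zone
    return datetime.datetime.fromtimestamp(int(row[1])).date()

by_date = collections.Counter(map(get_date, csv.reader(sample)))

# (date, count) pairs in chronological order
for date, count in sorted(by_date.items()):
    print(date, count)
```

For the real data you would pass csv.reader() an open file object instead of the StringIO stand-in.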
>
> I presume you've got the timestamp => date conversion sorted?
>
> Cheers,
> Cameron Simpson <cs at cskk.id.au>