[TriPython] TriPython October 2020 Online Meeting: Merging Traffic Ahead
Calloway, Chris
cbc at unc.edu
Mon Oct 26 14:28:19 EDT 2020
https://www.meetup.com/tripython/events/274045807/
Thursday, October 29, 2020
6:00 PM to 8:00 PM EDT
This will be our second-ever online bimonthly TriPython meeting. It will consist of a featured presentation by our very own Mark Hutchinson. His presentation 'heapq is neither Heap nor Stack' will focus on a recent intriguing data matching problem that had to be solved using as little CPU and memory as possible. This will be followed by impromptu lightning talks. The meeting will be conducted via Zoom. Please RSVP for this event on meetup.com to view the Zoom link.
Title(s):
Merging Traffic Ahead
Merge is not Zip
Hey, merge(), your assumptions annoy me
heapq is neither Heap nor Stack
Abstract:
I faced a data matching problem in which the program had to consume as little CPU and memory as possible at run time. In other words, it had to be inconspicuous and unobtrusive. The data was in two rather large files, so in-memory processing was out.
I looked at the heapq.merge() function, but by default it only works for data that is already sorted in ascending order. This presentation will follow my journey toward using heapq.merge() in ways that I didn't think possible. I was also able to expand the flexibility of the list.sort() method and the sorted() function.
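For anyone who has not used heapq.merge() before, here is a minimal sketch of the function itself. It is not taken from Mark's talk and is not necessarily the approach he used; the data is made up for illustration. It assumes Python 3.5 or later, where merge() accepts the key and reverse arguments.

```python
import heapq

# Two inputs that are already sorted in descending order (made-up data).
left = [9, 7, 4, 1]
right = [8, 7, 3]

# By default merge() assumes ascending inputs. reverse=True tells it the
# inputs are sorted from largest to smallest instead.
merged_desc = list(heapq.merge(left, right, reverse=True))

# key= makes merge() compare records on one field, much as sorted() does.
# Each input must already be sorted by that same key.
rows_a = [("ann", 3), ("bob", 5)]
rows_b = [("cat", 4), ("dan", 6)]
merged_by_score = list(heapq.merge(rows_a, rows_b, key=lambda row: row[1]))

print(merged_desc)
print(merged_by_score)
```

Because merge() returns an iterator and reads its inputs lazily, it only needs to hold one item per input in memory at a time, which is what makes it attractive for large files.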
Featured Libraries:
heapq
itertools
pandas
pandas.DataFrame.sort_values
pandas.DataFrame.drop_duplicates
pandas.DataFrame.to_csv
Featured Data Structures:
Classes
Lists
Performance Topics:
Space vs. time trade-off
I/O vs. memory trade-off
Measuring the memory of Python data structures
Chunking and buffering
Computer Science Topics:
Search (look-up)
Internal vs. External sorts
Multi-column sorting
Merge Sort
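Several of the topics above (chunking, external sorts, merge sort) fit together in a generic external merge sort. The following is only a rough sketch of that textbook technique, not the code from the presentation. The names external_sort and chunk_size are hypothetical, and it assumes each input line contains no embedded newline.

```python
import heapq
import itertools
import os
import tempfile


def external_sort(lines, chunk_size=1000):
    """Yield lines in ascending order, sorting at most chunk_size at a time."""
    chunk_paths = []
    iterator = iter(lines)
    try:
        # Phase 1: sort fixed-size chunks in memory and spill each to disk.
        while True:
            chunk = sorted(itertools.islice(iterator, chunk_size))
            if not chunk:
                break
            fd, path = tempfile.mkstemp(suffix=".chunk", text=True)
            with os.fdopen(fd, "w") as handle:
                handle.writelines(line + "\n" for line in chunk)
            chunk_paths.append(path)

        # Phase 2: lazily merge the sorted chunk files with heapq.merge().
        handles = [open(path) for path in chunk_paths]
        try:
            streams = [
                (line.rstrip("\n") for line in handle) for handle in handles
            ]
            yield from heapq.merge(*streams)
        finally:
            for handle in handles:
                handle.close()
    finally:
        # Remove the temporary chunk files once merging is done.
        for path in chunk_paths:
            os.remove(path)


result = list(external_sort(["d", "a", "c", "b", "e"], chunk_size=2))
print(result)
```

A smaller chunk_size lowers peak memory but creates more temporary files and more disk I/O, which is the I/O vs. memory trade-off listed above.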
--
Sincerely,
Chris Calloway
Applications Analyst
University of North Carolina
Renaissance Computing Institute
(919) 599-3530