which library has map reduce and how to use it for this case

Michael Selik michael.selik at gmail.com
Thu Jun 9 18:44:47 EDT 2016


I like using Yelp's mrjob module (https://github.com/Yelp/mrjob) to run
Python on Hadoop.

On Thu, Jun 9, 2016 at 2:56 AM Ho Yeung Lee <davidbenny2000 at gmail.com>
wrote:

> [... a bunch of code ...]


If you want to describe a map-reduce problem, start with the data. What
does a record of your input data look like?

Then think about your mapper. What key-value pairs will you extract from
each line of data?

Then think about your reducer. For a single key and its associated values,
what will you calculate?
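
As a rough illustration of those three questions, here is a minimal
pure-Python sketch, not mrjob itself. The "user,amount" record format, the
sample data, and the names mapper, reducer and map_reduce are all assumptions
made up for this example; your real records and calculation will differ.

```python
from collections import defaultdict

# Hypothetical input: one "user,amount" record per line (assumed format).
records = [
    "alice,3",
    "bob,5",
    "alice,4",
]


def mapper(line):
    """Extract (key, value) pairs from one input record."""
    user, amount = line.split(",")
    yield user, int(amount)


def reducer(key, values):
    """Calculate one result for a single key and all of its values."""
    return key, sum(values)


def map_reduce(lines):
    # Shuffle step: collect every mapped value under its key.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    # Reduce each key independently of the others.
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


totals = map_reduce(records)
print(totals)  # {'alice': 7, 'bob': 5}
```

In mrjob, the same two functions would roughly become the mapper and reducer
methods of an MRJob subclass, and the framework would handle the grouping step
for you.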


