Map, filter, and reduce are three functions that promote a functional programming approach. We'll go over these, along with lambda, in this Python programming tutorial.
If you're unable to install Python, you can also use the browser-embedded Python REPL below. Go ahead and type right inside that code window. For convenience, I recommend keeping two windows of this page open side by side, so you can run the code in one window while scrolling through the article in the other.
The map function accepts a function as its first parameter and a list as its second, applying that function to each element of the list and producing the new values. Note that in Python 3, map returns a lazy iterator rather than a list, so you wrap the result in list() when you want a list back.
>>> l = [4, 5, 1, 8]
>>> def double(x):
...     return x * 2
...
>>> list(map(double, l))
[8, 10, 2, 16]
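As a quick sketch (not part of the original example), the same doubling can also be written as a list comprehension, which many consider more Pythonic for simple transformations. The variable names here are illustrative:

```python
l = [4, 5, 1, 8]

def double(x):
    return x * 2

# map(...) is lazy in Python 3, so list() materializes the results
doubled_with_map = list(map(double, l))

# The equivalent list comprehension
doubled_with_comp = [double(x) for x in l]

print(doubled_with_map)   # [8, 10, 2, 16]
print(doubled_with_comp)  # [8, 10, 2, 16]
```

Both forms produce the same list; map simply packages the "apply a function to every element" pattern as a reusable function.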
filter also accepts a function and a list. The function you pass in is expected to return True or False. As filter iterates through each element of the list, it passes the value into the function; if the function returns True, the item is kept. The end result is a sequence of all the values that passed the test (in Python 3, a lazy iterator, so wrap it in list() to get a list).
>>> l = [2, 6, 5, 8, 7]
>>> def is_even(x):
...     return (x % 2) == 0
...
>>> list(filter(is_even, l))
[2, 6, 8]
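As with map, the same filtering can be expressed as a comprehension with an if clause. This minimal sketch (variable names are illustrative) shows the two forms side by side:

```python
l = [2, 6, 5, 8, 7]

def is_even(x):
    return (x % 2) == 0

# filter(...) is lazy in Python 3; list() materializes the kept values
evens_with_filter = list(filter(is_even, l))

# The equivalent comprehension: keep x only when the predicate holds
evens_with_comp = [x for x in l if is_even(x)]

print(evens_with_filter)  # [2, 6, 8]
print(evens_with_comp)    # [2, 6, 8]
```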
reduce follows the same paradigm as the preceding functions. It accepts a function and a list, but returns a single value accumulated from the list's elements according to the logic in the function you pass in. In Python 3, reduce lives in the functools module, so it must be imported first.
>>> from functools import reduce
>>> l = [1, 2, 4, 9, 5]
>>> def add(a, b):
...     return a + b
...
>>> def sub(a, b):
...     return a - b
...
>>> reduce(add, l)
21
>>> reduce(sub, l)
-19
>>> # You can also provide an initial value
>>> reduce(add, l, 4)
25
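To make the accumulation explicit, this small sketch (assuming Python 3's functools.reduce) expands the reduce call by hand. reduce folds the list left to right, feeding each running total back in as the first argument:

```python
from functools import reduce

def add(a, b):
    return a + b

l = [1, 2, 4, 9, 5]

# reduce folds left to right, so the call below is equivalent to
# add(add(add(add(1, 2), 4), 9), 5)
total = reduce(add, l)
expanded = add(add(add(add(1, 2), 4), 9), 5)

print(total)     # 21
print(expanded)  # 21
```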
lambda is the Python construct that supports the creation of anonymous functions at runtime.
# defining a function normally
>>> def sqr(x):
...     return x * x
...
>>> print(sqr(3))
9

# defining a lambda function
>>> l = lambda x: x * x
>>> print(l(3))
9

# lambda in a list comprehension
>>> print([(lambda x: x*x)(i) for i in range(10)])
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# a cleaner solution
>>> print([i*i for i in range(10)])
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
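Putting the pieces together, this illustrative sketch (not from the original article) chains filter, map, and reduce with lambdas to compute the sum of squares of the even numbers in a list:

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# keep the evens, square them, then sum the squares
evens = filter(lambda x: x % 2 == 0, numbers)    # 2, 4, 6
squares = map(lambda x: x * x, evens)            # 4, 16, 36
total = reduce(lambda a, b: a + b, squares, 0)   # 4 + 16 + 36

print(total)  # 56
```

Because map and filter are lazy in Python 3, nothing is computed until reduce pulls values through the whole pipeline.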
So you can see that all of these functions are driving towards the same basic idea. Major efficiency gains come from distributing tasks across many chunks of data rather than processing everything at once. The MapReduce model also follows this principle, making giant strides in the way large data sets are processed and generated. There's a map phase and a reduce phase. The map phase is responsible for accepting a set of inputs and generating a variable number of new outputs, determined by a provided function and the programmer's intent. The reducer's job is to process the data from the mapper into something usable. Somewhere along this process, there's also a shuffle and sort phase that makes the data easier to manage, but we won't dive into that given the scope of this article.
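The phases described above can be sketched in plain Python with the classic word-count example. This is a toy illustration, not how a real MapReduce framework is implemented; the input documents and variable names are hypothetical:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: one "document" per string
documents = ["the cat sat", "the cat ran", "a dog sat"]

# Map phase: each input produces a variable number of (key, value) pairs
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle/sort phase: group the pairs by key so each word's
# values end up next to each other
mapped.sort(key=itemgetter(0))

# Reduce phase: collapse each group of pairs into a single count
counts = {word: sum(v for _, v in group)
          for word, group in groupby(mapped, key=itemgetter(0))}

print(counts)  # e.g. {'a': 1, 'cat': 2, 'dog': 1, 'ran': 1, 'sat': 2, 'the': 2}
```

In a real cluster, the map and reduce steps run in parallel on different machines, and the shuffle/sort moves data between them; the logic, though, mirrors this single-process sketch.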
So now it's easy to see how MapReduce was influenced by the map and reduce functions. MapReduce's main contributions lie in distribution, parallelism, redundancy, fault tolerance, and, most importantly, scalability.
- Fluent Python: Clear, Concise, and Effective Programming
- Python Cookbook: Recipes for Mastering Python 3
- Introducing Python: Modern Computing in Simple Packages
- Effective Python: 59 Specific Ways to Write Better Python (Effective Software Development Series)
- Head First Python: A Brain-Friendly Guide