Generators were first introduced in Python 2.2. Sometimes referred to as "weightless threads", they can replace threads or processes for certain workloads. Creation, entry, and return are virtually free compared with those alternatives, which encourages an asynchronous approach to handling background events. However, generators are single-threaded and generally don't perform well with CPU-intensive or blocking operations. We'll be going over the fundamentals in this article. If, after reading this, you're interested in learning more, I highly recommend David Beazley and Brian K. Jones' Python Cookbook, Third Edition.
After going through the prerequisites, you should be familiar with iterators. Generators are a special kind of iterator: they produce values on demand rather than storing them all in memory. Take the following Python 3 example:
```python
>>> gen = (x*2 for x in range(5))
>>> for i in gen:
...     print(i)
...
0
2
4
6
8
```
Unlike a list comprehension, no list is built and returned. The generator just steps through each iteration and returns each value, one by one. Also notice that you're required to use parentheses instead of square brackets. Because evaluation is lazy, you never incur any computation beyond the values you actually consume, which often saves both time and memory compared with building the full result up front.
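To see that savings concretely, here's a minimal sketch. The `squares` generator is a hypothetical stand-in for an expensive computation; even though it could yield a million values, only the ones we pull out are ever computed:

```python
# a hypothetical "expensive" computation: lazily squaring numbers
def squares(n):
    for i in range(n):
        yield i * i

gen = squares(1_000_000)

# only the values we actually consume are ever computed
first_three = [next(gen) for _ in range(3)]
print(first_three)  # [0, 1, 4]
```

The remaining 999,997 squares are never calculated, and no million-element list ever exists in memory.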
`yield` is really all you need to define a generator. As long as `yield` exists somewhere inside your function, calling it will return a generator.
```python
>>> def generator():
...     yield
...
>>> gen = generator()
>>> print(gen)
<generator object generator at 0x101107fa0>
```
`yield` is a keyword, just like `return`. A function containing `yield` will return a generator object when called, and each step of the iteration picks up where the last one left off. So whereas a `return` statement permanently hands control back to the caller at the end of a function, `yield` does so temporarily, as it goes. The benefit is that you no longer have to keep track of state between calls or return large in-memory values at the end of a function call. Instead, you get a value back at each step of the iteration, either by looping or by calling `next()` on the generator. This is best illustrated with an example:
### Example using Python List Comprehensions
```python
# normal function which returns a list
>>> def get_even(start):
...     l = []
...     for i in range(start):
...         if i % 2 == 0:
...             l.append(i)
...     return l
...
>>> print(get_even(10))
[0, 2, 4, 6, 8]

# list comprehension which returns a list
>>> print([i for i in range(0, 9) if i % 2 == 0])
[0, 2, 4, 6, 8]
```

### Example using Python Generators

```python
>>> def is_even(num):
...     return num % 2 == 0
...

# generator which yields each value
>>> def get_even(cap):
...     for i in range(cap):
...         if is_even(i):
...             yield i
...

# create the generator
>>> gen = get_even(10)

# call the generator with a loop...
>>> for i in gen:
...     print(i)
...
0
2
4
6
8

# or explicitly one by one...
# (re-create the generator first, since the loop exhausted it)
>>> gen = get_even(10)

# using Python 2.7
>>> print(gen.next())
0
>>> print(gen.next())
2

# using Python 3.x
>>> gen = get_even(10)
>>> print(next(gen))
0
>>> print(next(gen))
2
```
So in the above example we're just printing each value the generator yields, but of course you can pass those values into other functions. Generators also give you more flexibility over what the yielded values become: rather than writing a separate function for every container type you want returned, the caller decides what to build from the stream of values. The next example shows the duplication that generators let you avoid.
```python
# normal function which returns a list
>>> def get_even_list(start):
...     l = []
...     for i in range(start):
...         if i % 2 == 0:
...             l.append(i)
...     return l
...
>>> print(get_even_list(10))
[0, 2, 4, 6, 8]

# normal function which returns a tuple
>>> def get_even_tuple(start):
...     l = ()
...     for i in range(start):
...         if i % 2 == 0:
...             l = l + (i,)
...     return l
...
>>> print(get_even_tuple(10))
(0, 2, 4, 6, 8)

# note that we need two different functions, unless we cast
>>> print(tuple(get_even_list(10)))  # cast list to tuple
(0, 2, 4, 6, 8)
```
Because tuples are immutable, we can't append to them like we do with lists. We have to create a new tuple by concatenating the previous tuple with a single-element tuple, as in the example above:

```python
l = l + (i,)
```
### A More Elegant Solution Using Python Generators
```python
>>> def is_even(num):
...     return num % 2 == 0
...

# generator which yields each value
# no need to indicate the return value's type...
# ...in the generator, not even once
>>> def get_even(cap):
...     for i in range(cap):
...         if is_even(i):
...             yield i
...

# call the generator with islice
>>> from itertools import islice
>>> print(tuple(islice(get_even(100), 100)))
(0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98)
```
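`islice()` is especially handy because a generator doesn't need an upper bound at all. Here's a minimal sketch, using a hypothetical `evens` generator built on `itertools.count()`, that takes a finite slice of an infinite stream:

```python
from itertools import count, islice

# hypothetical infinite generator of even numbers
def evens():
    for i in count():       # count() never ends
        if i % 2 == 0:
            yield i

# islice lets us consume just the first five values
print(list(islice(evens(), 5)))  # [0, 2, 4, 6, 8]
```

Trying the same thing with a list-building function would loop forever; the generator version only does as much work as the slice demands.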
By calling `send()` on a generator, you're able to pass values in as you iterate:
```python
# declare the generator with an infinite loop...
# ...that prints sent values
>>> def generator():
...     while True:
...         received = yield
...         print("Next:", received)
...
>>> gen = generator()  # instantiate the generator
>>> print(gen)
<generator object generator at 0x7f12fd300f50>

# advance into the first yield
>>> next(gen)

# send values as you go
>>> gen.send('a')
Next: a
>>> gen.send('b')
Next: b
```
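A classic practical use of `send()` is a coroutine that accumulates state between calls. Here's a minimal sketch of a hypothetical `running_average` generator that yields the average of every value sent so far:

```python
def running_average():
    total = 0.0
    n = 0
    average = None
    while True:
        value = yield average  # hand back the current average, wait for the next value
        total += value
        n += 1
        average = total / n

avg = running_average()
next(avg)            # prime the coroutine up to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0
```

Notice that `send()` both delivers a value into the generator and returns the next yielded value, so each call gives you the updated average.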
You can also terminate generators by calling `close()` on them, or raise exceptions inside them with `throw()`:
```python
# Close
>>> def generator():
...     try:
...         yield
...     except GeneratorExit:
...         print("Terminating")
...
>>> gen = generator()
>>> next(gen)
>>> gen.close()
Terminating

# Throw
>>> gen = generator()
>>> next(gen)
>>> gen.throw(RuntimeError, "Something went wrong")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: Something went wrong
```
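Because `close()` raises `GeneratorExit` at the paused `yield`, it pairs naturally with `try`/`finally` for releasing resources. Here's a minimal sketch; the `reader` generator and the `log` list are hypothetical names used to make the cleanup visible:

```python
log = []

def reader(log):
    # hypothetical resource that must be released when the generator stops
    log.append("open")
    try:
        while True:
            yield "line"
    finally:
        log.append("close")   # runs on close(), exhaustion, or garbage collection

gen = reader(log)
next(gen)     # generator "opens" its resource and yields
gen.close()   # GeneratorExit triggers the finally block
print(log)    # ['open', 'close']
```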
Generators may also chain operations using the `yield from <iterable>` syntax. Think of this as shorthand for `for i in iterable: yield i`:
```python
# chaining
>>> def generator(n):
...     yield from range(n)
...     yield from range(n)
...
>>> print(list(generator(3)))
[0, 1, 2, 0, 1, 2]
```
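Chaining also works recursively, since a generator can delegate to another instance of itself. As a sketch of where this shines, here's a hypothetical `flatten` generator for arbitrarily nested lists:

```python
def flatten(items):
    # recursively flatten nested lists by delegating to subgenerators
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to a subgenerator
        else:
            yield item

print(list(flatten([1, [2, [3, 4]], 5])))  # [1, 2, 3, 4, 5]
```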
`yield from` also has another benefit over a plain for loop: it allows subgenerators to receive sent values and thrown exceptions directly from the calling scope, and to return a final value to the outer generator. This next example revisits some of what we learned about `send()`:
```python
# generator 1
>>> def counter():
...     ctr = 0
...     while True:
...         received = yield
...         if received is None:
...             return ctr
...         ctr += received
...

# generator 2
>>> def store_totals(totals):
...     while True:
...         ctr = yield from counter()
...         totals.append(ctr)
...
>>> totals = []  # the list we'll pass to the generator
>>> total = store_totals(totals)
>>> next(total)  # advance to the first yield
>>> for i in range(5):
...     total.send(i)  # send the values to be totaled up...
...
>>> total.send(None)  # ...and make sure to stop the first count
>>> for i in range(3):
...     total.send(i)  # start back up again...
...
>>> total.send(None)  # ...and finish the second count
>>> print(totals)
[10, 3]
```
As you've learned in the previous sections, all it takes to turn a function into a generator is the `yield` keyword. So we have two generators in this example, with the second delegating to the first.
That concludes this article. If you'd like to learn some of the more advanced topics, I strongly recommend watching David Beazley's talk Generators: The Final Frontier.
- Fluent Python: Clear, Concise, and Effective Programming
- Python Cookbook: Recipes for Mastering Python 3
- Introducing Python: Modern Computing in Simple Packages
- Effective Python: 59 Specific Ways to Write Better Python (Effective Software Development Series)
- Head First Python: A Brain-Friendly Guide