List Comprehensions and Generator Expressions

In Python, list comprehensions and generator expressions share a similar syntax but produce different objects: a list comprehension builds a list, while a generator expression builds a generator object.

A generator object is an iterator that retains its state between calls. In a for loop it behaves like any other iterable object (a list, dictionary, string, etc.). Unlike those containers, however, a generator itself supports the __next__() method, so it is an iterator in its own right and produces its values one at a time.
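
The difference between an iterable container and an iterator can be checked directly. The following is a minimal sketch; the variable names are made up for illustration:

```python
numbers = [1, 2, 3]           # a list: iterable, but not itself an iterator
gen = (n for n in numbers)    # a generator object: an iterator

list_has_next = hasattr(numbers, "__next__")   # lists have no __next__()
gen_has_next = hasattr(gen, "__next__")        # generators do

# An iterator returns itself from iter(); a list returns a separate iterator.
gen_is_own_iterator = iter(gen) is gen

first = next(gen)    # next() calls gen.__next__()
second = next(gen)   # the generator remembers where it stopped

print(list_has_next, gen_has_next, gen_is_own_iterator)  # False True True
print(first, second)                                     # 1 2
```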

First, we look at list comprehensions to get used to the syntax.

List comprehensions

In Python, list comprehensions let you create and populate a list in a single expression.

The construction requires an iterable object or an iterator, from which the new list will be built. It also needs an expression that is applied to each item taken from that iterable before the result is added to the new list.

>>> a = [1, 2, 3]
>>> b = [i+10 for i in a]
>>> a
[1, 2, 3]
>>> b
[11, 12, 13]

In the example above, the list comprehension is the expression [i+10 for i in a]. Here a is the iterable object, which in this case is another list. Each item is extracted from it by the for clause. The expression before for describes what is done to the item before it is added to the new list.
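
To make the order of evaluation concrete, the comprehension above can be compared with an explicit loop. This is only a sketch; b_loop is a name introduced here for illustration:

```python
a = [1, 2, 3]

# The comprehension from the example above.
b = [i + 10 for i in a]

# Roughly the same steps written out by hand.
b_loop = []
for i in a:                # the "for i in a" part
    b_loop.append(i + 10)  # the expression before "for"

print(b)            # [11, 12, 13]
print(b == b_loop)  # True
```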

Note that a list comprehension creates a new list and does not modify the existing one. To update the variable, assign the new list to it:

>>> a = [1, 2, 3]
>>> a = [i+10 for i in a]
>>> a
[11, 12, 13]

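List comprehensions are often described as "syntactic sugar" in Python. In other words, you can do without them. The loop below produces the same values, although it modifies the existing list in place instead of creating a new one:

>>> a = [1, 2, 3]
>>> for index, value in enumerate(a):
...     a[index] = value + 10
...
>>> a
[11, 12, 13]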

If the program holds several references to the same list, list comprehensions should be used carefully:

>>> ls0 = [1,2,3]
>>> ls1 = ls0
>>> ls1.append(4)
>>> ls0
[1, 2, 3, 4]
>>> ls1 = [i+1 for i in ls1]
>>> ls1
[2, 3, 4, 5]
>>> ls0
[1, 2, 3, 4]

Here one might expect that a change made to the list through one variable will be visible through the other. That holds for in-place operations such as append(). However, a list comprehension creates a new list, so after the assignment ls1 and ls0 refer to different lists.
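
If both names should keep referring to the same list, one possible approach (a sketch, not the only option) is to assign the result of the comprehension to a full slice, which replaces the contents of the existing list in place:

```python
ls0 = [1, 2, 3, 4]
ls1 = ls0                       # both names refer to one list object

# Slice assignment replaces the contents without creating a new binding.
ls1[:] = [i + 1 for i in ls1]

print(ls0)         # [2, 3, 4, 5]
print(ls1 is ls0)  # True
```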

The iterable in the for clause does not have to be a list. In the example below, the lines of a file are placed in a list. The with statement ensures that the file is closed afterwards.

>>> with open('text.txt') as f:
...     lines = [line.strip() for line in f]
...
>>> lines
['one', 'two', 'three']

You can add a condition to the list comprehension:

>>> from random import randint
>>> nums = [randint(10, 20) for i in range(10)]
>>> nums
[18, 17, 11, 11, 15, 18, 11, 20, 10, 19]
>>> nums = [i for i in nums if i%2 == 0]
>>> nums
[18, 18, 20, 10]
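
A condition written after the for clause filters items out. It should not be confused with a conditional expression placed before for, which keeps every item and only changes the value. A short sketch with fixed numbers instead of random ones (the names evens and labels are illustrative):

```python
nums = [18, 17, 11, 20, 10]

# "if" after the for clause: odd numbers are dropped.
evens = [n for n in nums if n % 2 == 0]

# Conditional expression before "for": nothing is dropped,
# each item is mapped to one of two values.
labels = ["even" if n % 2 == 0 else "odd" for n in nums]

print(evens)   # [18, 20, 10]
print(labels)  # ['even', 'odd', 'odd', 'even', 'even']
```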

Nested loops are allowed:

>>> a = "12"
>>> b = "3"
>>> c = "456"
>>> comb = [i+j+k for i in a for j in b for k in c]
>>> comb
['134', '135', '136', '234', '235', '236']
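
The for clauses are read from left to right, with the leftmost one as the outermost loop. As a sketch, the comprehension above corresponds to the following nested loops (comb_loop is a name introduced for comparison):

```python
a = "12"
b = "3"
c = "456"

comb = [i + j + k for i in a for j in b for k in c]

comb_loop = []
for i in a:              # first for clause: outermost loop
    for j in b:          # second for clause
        for k in c:      # last for clause: innermost loop
            comb_loop.append(i + j + k)

print(comb == comb_loop)  # True
```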

Dictionary and set comprehensions

If you replace the square brackets with curly braces, you get a dictionary instead of a list:

>>> a = {i:i**2 for i in range(11,15)}
>>> a
{11: 121, 12: 144, 13: 169, 14: 196}

The expression before for must follow dictionary syntax, that is, consist of a key and a value separated by a colon. If it is a single expression instead, a set is created:

>>> a = {i for i in range(11,15)}
>>> a
{11, 12, 13, 14}
>>> b = {1, 2, 3}
>>> b
{1, 2, 3}
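
The resulting types can be checked with type(). The sketch below also shows two related details: a set keeps only unique values, and empty braces create a dictionary, not a set. The word list is made up for illustration.

```python
words = ["one", "two", "three"]

lengths_by_word = {w: len(w) for w in words}   # key: value -> dict
unique_lengths = {len(w) for w in words}       # single expression -> set

print(type(lengths_by_word).__name__)  # dict
print(type(unique_lengths).__name__)   # set
print(unique_lengths == {3, 5})        # True: duplicates are collapsed

empty = {}                             # empty braces give a dict
print(type(empty).__name__)            # dict
```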

Generator expressions

Expressions that create generator objects look like list, dictionary, and set comprehensions, with one difference: they are written in parentheses:

>>> a = (i for i in range(2, 8))
>>> a
<generator object <genexpr> at 0x7efc88787910>
>>> for i in a:
...     print(i)
...
2
3
4
5
6
7

Iterating over the generator a second time produces nothing, because the generator object has already produced all of its values using the “formula” embedded in it. Therefore, generators are typically used when you need to walk through the data only once.
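
The one-pass behaviour is easy to observe by consuming the same generator twice. A minimal sketch (first_pass and second_pass are illustrative names):

```python
gen = (i for i in range(2, 8))

first_pass = list(gen)    # consumes every value
second_pass = list(gen)   # the generator is already exhausted

print(first_pass)   # [2, 3, 4, 5, 6, 7]
print(second_pass)  # []
```

To iterate again, create a new generator object or store the values in a list.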

In addition, generators save memory: they do not store all of the values at once, but only the state needed to compute the next item (roughly, the current position, the limit, and the formula).
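
A rough way to see the difference is sys.getsizeof(). Note that it reports only the size of the container object itself, not of the items it refers to, and the exact numbers depend on the Python version and platform, so the sketch below only compares the two sizes:

```python
import sys

n = 100_000

as_list = [i for i in range(n)]   # all n items exist at once
as_gen = (i for i in range(n))    # items are produced on demand

list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)

# The generator's size does not grow with n; the list's does.
print(gen_size < list_size)  # True
```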

A generator expression is shorthand for a generator function such as the following:

>>> def func(start, finish):
...     while start < finish:
...         yield start * 0.33
...         start += 1
...
>>> a = func(1, 4)
>>> a
<generator object func at 0x7efc88787a50>
>>> for i in a:
...     print(i)
...
0.33
0.66
0.99

A function containing yield returns a generator object when called, rather than executing its body immediately. Each call to the __next__() method runs the body until the next yield and then pauses. In a for loop, these calls are made automatically. The values of local variables are preserved between calls.
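
What the for loop does behind the scenes can be reproduced by calling the built-in next() by hand. The sketch below reuses func from the example above with a shorter range; when the body finishes, StopIteration is raised, which is the signal a for loop uses to stop:

```python
def func(start, finish):
    while start < finish:
        yield start * 0.33
        start += 1

gen = func(1, 3)

first = next(gen)    # runs the body up to the first yield
second = next(gen)   # resumes after the yield; start is now 2

try:
    next(gen)        # the while loop ends, so there is nothing left to yield
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)  # 0.33 0.66 True
```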

If the function does not need to be reused, a generator expression is simpler:

>>> b = (i*0.33 for i in range(1,4))
>>> b
<generator object <genexpr> at 0x7efc88787960>
>>> for i in b:
...     print(i)
...
0.33
0.66
0.99
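
Because a generator expression is consumed in a single pass, it is often handed straight to a function that iterates over its argument once. A small sketch using the built-in sum(); when the generator expression is the only argument, the extra pair of parentheses may be omitted:

```python
# Sum of squares without building an intermediate list.
total = sum(i * i for i in range(1, 4))

# The same values collected into a list, for comparison.
squares = [i * i for i in range(1, 4)]

print(total)                  # 14
print(total == sum(squares))  # True
```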