Effective Python (Item 29): Avoid Repeated Work in Comprehensions by Using Assignment Expressions

Item 29: Avoid Repeated Work in Comprehensions by Using Assignment Expressions

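When writing comprehensions for list, dict, set, and their variants, you often need the same computed value in more than one place. Suppose we are writing a program to manage orders for a fastener company. When a customer places an order, we have to check whether the current stock can fulfill it. That means verifying that each requested item meets the minimum quantity for shipping: items ship in batches of 8, and at least one full batch must be available.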

stocks = {
    'nails': 125,
    'screws': 35,
    'wingnuts': 8,
    'washers': 24,
}

order = ['screws', 'wingnuts', 'clips']

def get_batches(count, size):
    # Number of full batches of the given size available in stock
    return count // size

result = {}
for name in order:
    count = stocks.get(name, 0)
    batches = get_batches(count, 8)
    if batches:
        result[name] = batches

print(result)

>>>
{'screws': 4, 'wingnuts': 1}

This logic is simpler when written as a dictionary comprehension (see Item 27).

found = {name: get_batches(stocks.get(name, 0), 8)
         for name in order
         if get_batches(stocks.get(name, 0), 8)}
print(found)

>>>
{'screws': 4, 'wingnuts': 1}

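This version is shorter than the loop, but it writes get_batches(stocks.get(name, 0), 8) twice. The repetition makes the code harder to read, and the program has no real need to compute the same result twice. It also invites bugs if the two expressions are not kept in sync. For example, suppose we decide that a batch should hold 4 items instead of 8. We then have to change the second argument of get_batches from 8 to 4 in both places. If we update only one of them, the code misbehaves: here the condition still uses 8, so any item with at least 4 but fewer than 8 units in stock is dropped.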

has_bug = {name: get_batches(stocks.get(name, 0), 4)
           for name in order
           if get_batches(stocks.get(name, 0), 8)}

print('Expected:', found)
print('Found:', has_bug)

>>>
Expected: {'screws': 4, 'wingnuts': 1}
Found: {'screws': 8, 'wingnuts': 2}

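A simple way to solve this problem is to use the := operator (the walrus operator), introduced in Python 3.8, to form an assignment expression inside the comprehension (see Item 10).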

found = {name: batches for name in order
         if (batches := get_batches(stocks.get(name, 0), 8))}

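The assignment expression batches := get_batches(...) looks up the item in the stocks dictionary, computes how many batches are available, and stores that number in the batches variable. The value expression of the comprehension can then refer to batches instead of calling get_batches again, because the result is already stored there. With this approach, get and get_batches are each called only once per item in order, which removes the redundant second pair of calls for every item that passes the condition.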
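The saving can be observed by counting calls. The following is a minimal sketch, assuming the same stocks and order data as above; the call_log list is a hypothetical helper added only for this illustration and is not part of the original example.

```python
stocks = {'nails': 125, 'screws': 35, 'wingnuts': 8, 'washers': 24}
order = ['screws', 'wingnuts', 'clips']

call_log = []  # Hypothetical helper: records every get_batches call

def get_batches(count, size):
    call_log.append((count, size))
    return count // size

# Repeated form: the call appears in the condition and in the value.
repeated = {name: get_batches(stocks.get(name, 0), 8)
            for name in order
            if get_batches(stocks.get(name, 0), 8)}
calls_repeated = len(call_log)

call_log.clear()

# Walrus form: the call appears only in the condition.
single = {name: batches for name in order
          if (batches := get_batches(stocks.get(name, 0), 8))}
calls_single = len(call_log)

print(repeated == single)
print(calls_repeated, calls_single)
```

Both forms build the same dictionary. The repeated form calls get_batches twice for each item that passes the condition and once for 'clips', which fails it, while the walrus form calls it exactly once per item in order.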

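An assignment expression may also appear in the value expression of a comprehension. However, if another part of the comprehension refers to a variable defined there, the program can fail at runtime. In the example below, Python evaluates for name, count in stocks.items() if tenth > 0 before it evaluates the value expression, so tenth has not yet been defined when the condition runs.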

result = {name: (tenth := count // 10)
          for name, count in stocks.items() if tenth > 0}

>>>
Traceback ...
NameError: name 'tenth' is not defined

To fix this, move the assignment expression into the if condition, and then refer to the already-defined tenth variable in the value expression.

result = {name: tenth for name, count in stocks.items()
          if (tenth := count // 10) > 0}
print(result)

>>>
{'nails': 12, 'screws': 3, 'washers': 2}
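The order of evaluation is easier to see when the comprehension is unrolled into a loop. The following is a rough sketch, assuming the same stocks data as above; the expanded name is introduced only for this illustration, and the loop is only approximately equivalent, because a real comprehension keeps name and count in its own scope.

```python
stocks = {'nails': 125, 'screws': 35, 'wingnuts': 8, 'washers': 24}

# Comprehension form: the condition defines tenth, the value reuses it.
result = {name: tenth for name, count in stocks.items()
          if (tenth := count // 10) > 0}

# Roughly equivalent loop, spelled out to show what runs first.
expanded = {}
for name, count in stocks.items():
    if (tenth := count // 10) > 0:  # The condition is evaluated first
        expanded[name] = tenth      # The value expression runs afterward

print(expanded == result)
```

Because the condition always runs before the value expression, a name bound in the condition is available to the value, but not the other way around.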

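If a comprehension has no condition and uses the := operator in its value expression, the variable on the left side of the operator leaks into the scope that contains the comprehension (see Item 21).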

half = [(last := count // 2) for count in stocks.values()]
print(f'Last item of {half} is {last}')

>>>
Last item of [62, 17, 4, 12] is 12

This is similar to how the loop variable of an ordinary for loop leaks.

for count in stocks.values():  # Leaks loop variable
    pass
print(f'Last item of {list(stocks.values())} is {count}')

>>>
Last item of [125, 35, 8, 24] is 24

However, the loop variable of the for clause inside a comprehension does not leak in the same way.

half = [count // 2 for count in stocks.values()]
print(half)
print(count)  # Raises because the loop variable did not leak

>>>
[62, 17, 4, 12]
Traceback ...
NameError: name 'count' is not defined. Did you mean: 'round'?

It is better not to leak loop variables, so assignment expressions are best used only in the condition of a comprehension.

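Assignment expressions are not limited to comprehensions; they also work in generator expressions (see Item 32). The code below creates an iterator instead of a dict instance. The iterator yields pairs whose first element is the item name and whose second element is the number of batches of that item in stock.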

found = ((name, batches) for name in order
         if (batches := get_batches(stocks.get(name, 0), 8)))
print(next(found))
print(next(found))

>>>
('screws', 4)
('wingnuts', 1)
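Because the generator expression yields (name, batches) pairs lazily, it can also be handed to a consumer such as dict() instead of being stepped through with next. The following is a small sketch, assuming the same stocks, order, and get_batches definitions as above; the as_dict and remaining names are introduced only for this illustration.

```python
stocks = {'nails': 125, 'screws': 35, 'wingnuts': 8, 'washers': 24}
order = ['screws', 'wingnuts', 'clips']

def get_batches(count, size):
    return count // size

found = ((name, batches) for name in order
         if (batches := get_batches(stocks.get(name, 0), 8)))

# No pair is computed until something consumes the iterator.
as_dict = dict(found)    # dict() drains the iterator pair by pair
remaining = list(found)  # The iterator is now exhausted

print(as_dict)
print(remaining)
```

Once the iterator has been consumed it produces nothing further, so code that needs the pairs more than once should keep the resulting dictionary rather than the generator.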