**(Status: this proposal will not go forward, but helped
spark a discussion on a better idea--generator expressions.)**

List displays (aka list comprehensions) have proven to be a useful and popular part of Python. However, there are limitations on what they can do, and they sometimes lead to inefficiencies: it is tempting to create a list of intermediate results, even when the whole list is not really needed. Consider finding the maximum termperature over the last day, where you have a function, temp, that takes an hour as argument and fetches the temperature. With list comprehensions this is:

max([temp(hour) for hour in range(24)])This builds up the list of temperatures, only to discard them. Not so bad in this case, but troublesome in cases when the list is multiple megabytes. In this note I propose a new syntax called

[Max: temp(hour) for hour in range(24)]I use "Max" rather than "max" because I want the accumulator to be different from the builtin function (they could be made the same, but I prefer not to for reasons we will soon see). Here are some more examples:

[Sum: x*x for x in numbers] [Product: Prob_spam(word) for word in email_msg] [Min: temp(hour) for hour in range(24)] [Mean: f(x) for x in data] [Median: f(x) for x in data] [Mode: f(x) for x in data] [SortBy: abs(x) for x in (-2, -4, 3, 1)]

"[" expression ":" listmaker "]"where

[Sum: x*x for x in numbers]we want that to be the equivalent of

total = 0 for x in it: total = (x*x + total) return totalThat seems simple enough: we initialize a value, update it in the loop, and then return it at the end. But it is not always quite that easy. For example, in

[Mean: f(x) for x in data]we want

(total, n) = (0, 0) for x in it: (total, n) = (total+x, n+1) return total / nIn this case we need some additional computation at the end, and the value we are updating is more complex (a two-element tuple rather than a single number). (And we have assumed true division.) We need to make some choices: let us say that an accumulator (such as

[acc: exp for x in it]is equivalent to

accumulation(acc, lambda x: exp, it)where we have defined

def accumulation(acc, f, it): a = acc() for x in it: a.add(f(x)) return a.result()We can now define Mean as

class Mean: def __init__(self): self.total, self.n = 0, 0 def add(self, exp): self.total, self.n = self.total+exp, self.n+1 def result(self): return self.total / self.n

[Argmax: votes(c) for c in candidates](In math "argmax" (or "arg max") gives you the element of a set with the maximal value on some function (whereas "max" gives you the maximal function value itself).) Here we are asking for the element c with the maximal value of votes(c). We can program this by adding the three characters ", x" to the definition of accumulation:

def accumulation(acc, f, it): a = acc() for x in it: a.add(f(x), x) return a.result()

[Some: temp(hour) > 30 for hour in range(24)]We want this to return a true value if there is some hour with temp(hour) > 30, and False if there is not. Obviously it can stop iterating when it finds the first element that satisfies the condition. We will handle this by having the accumulator's add method return a true value if it wants to stop the loop. After changing

def accumulation(acc, f, it): a = acc() for x in it:we can defineifa.add(f(x), x):breakreturn a.result()

class Some: def __init__(self): self.value = False def add(self, exp, _): if exp: self.value = True return StopIteration def result(self): return self.valueNote I returned

[Top(10): humor(joke) for joke in jokes]Up to now, accumulators were passed no arguments when they were initialized, implicitly, by the function

def accumulation(acc, f, it):Now we can defineif isinstance(acc, (types.ClassType, types.TypeType)): a = acc() else: a = accfor x in it: if a.add(f(x), x): break return a.result()

from heapq import * class Top: def __init__(self, k): self.k = k self.heap = [] def add(self, exp, _): heap = self.heap if len(heap) < self.k: heappush(heap, exp) elif exp < heap.biggest(): heapreplace(heap, exp) def result(self): return self.heapNote that this can be much more efficient than alternatives such as:

jokes.sort(lambda a, b: humor(a) < humor(b)) jokes[0:10]

a = Top(10) for j in new_jokes: print j a.add(float(raw_input("Rating?")), j) return [a: humor(j) for j in machine_ratable_jokes]Here we need a

Another way to look at it is that there are some common algorithms in programming, such as finding the best element(s) or value(s) from a candidate set, that appear again and again, and this allows us to abstract out these algorithms, rather than re-implement them every time.

One drawback of my proposal is that the `[ ... ]` syntax
suggests that a list is returned, when that need not be the case.
My defense is that it is iterating over a list (or at least, an iterable).
But Guido pronounced, quite reasonably, that this was not a sufficient defense.

There have been past discussions of set displays and dict displays, using syntax something like:

words = { w.lower() for w in text.split() } squares = { i:i*i for i in range(N) }I'm not going to take a stand on whether these are good ideas, but I note you can do this with accumulation displays easily:

words = [Set: w.lower() for w in text.split()] squares = [Dict: i, i*i for i in range(N)](or of course you can do it without any changes to the language):

words = set([w.lower() for w in text.split()]) squares = dict([i, i*i for i in range(N)])Discussion of my proposal led to the ressurection of PEP 289 for generator comprehensions, slightly updated and now called

exp for x in itwould be roughly equivalent to

def gen(): for x in it: yield exp return gen()You get short-circuit evaluation just by ceasing to call the generator. In fact, all the accumulators that I wrote could be re-written as functions that take an generator/iterator/sequence as input (although if you want both the loop variable and the computed value, you need to pass them both as a tuple). Compare:

Accumulation Display | Generator Expression plus Function |
---|---|

[Sum: x*x for x in numbers] [Product: Prob_spam(word) for word in email_msg] [Min: temp(hour) for hour in range(24)] [Median: f(x) for x in data] [SortBy: abs(x) for x in (-2, -4, 3, 1)] [Top(10): humor(joke) for joke in jokes] [Argmax: votes(c) for c in candidates] |
sum(x*x for x in numbers) product(Prob_spam(word) for word in email_msg) min(temp(hour) for hour in range(24)) median(f(x) for x in data) sortdecorated((abs(x),x) for x in (-2, -4, 3, 1)) top(10, (humor(joke) for joke in jokes)) argmax((c,votes(c)) for c in candidates) |

With the accumulation display approach, you need a change to the
language, and a new set of accumulators. With the generator expression
approach you need a change to the language of similar complexity, and
a combination of old funcxtions (sum, min) and new functions (product,
top, argmax). The new functions are somewhat easier to write than
the accumulators; for example a 6-line `some` function instead
of an 8-line `Some` accumulator:

def some(items): "If some element in items is true, return it; otherwise return False" for item in items: if item: return item return False

Philip J. Eby and Raymond Hettinger proposed the generator expression idea. I'm happy that my proposal came at the right time to help spur a discussion that appears to be on the way to a happy ending--I think the generator expression idea is more generally useful than my proposal.

Other points were raised in the discussion on python-dev.
Ian Bicking made a suggestion that I interpreted to mean that `accumulation` should be implemented as something like:

def accumulation(acc, f, it):The advantage is thata = acc.__accum__()for x in it: if a.add(f(x), x): break return a.result()

- A library of accumulators. I'm thinking there should be
an
`accum`module, which users can choose to import or import * from. I have a candidate accum module. - A demonstration of the use of these accumulators.
I have one called testaccum
which rewrites a limited class of accumulation displays into
calls to
`accumulation`and prints the results. It produces this output:temp = [70, 70, 71, 74, 76, 76, 72, 76, 77, 77, 77, 78, 78, 79, 79, 79, 78, 80, 82, 83, 83, 81, 84, 83] data = temp votes = {'Gray': 45, 'Arnie': 48, 'Peter': 3, 'Cruz': 32, 'Tom': 13} candidates = ['Gray', 'Arnie', 'Peter', 'Cruz', 'Tom'] [Max: temp[hour] for hour in range(24)] ==> 84 [Min: temp[hour] for hour in range(24)] ==> 70 [Sum: x*x for x in data] ==> 144999 [Mean: f(x) for x in data] ==> 155.25 [Median: f(x) for x in data] ==> 156.0 [Mode: f(x) for x in data] ==> 166 [Argmax: votes[c] for c in candidates] ==> Arnie [Argmin: votes[c] for c in candidates] ==> Peter [Some: temp[hour] > 75 for hour in range(24)] ==> True [Every: temp[hour] > 75 for hour in range(24)] ==> False [Top(10): temp[hour] for hour in range(24)] ==> [84, 83, 83, 83, 82, 81, 80, 79, 79, 79] [Join(', '): votes[c] for c in candidates] ==> 45, 48, 3, 32, 13 [SortBy: abs(x) for x in (-2,-4, 3, 1)] ==> [1, -2, 3, -4] [SortBy(reverse=True): abs(x) for x in (-2, -4, 3, 1)] ==> [-4, 3, -2, 1]

- Changing the grammar would have been rather easy. Where the
grammar
now has:
listmaker ::= expression ( list_for | ( "," expression )* [","] )

we could change it to:listmaker ::= expression [ ":" expression ] ( list_for | ( "," expression )* [","] )

- Changing the code generator would have been simple for the simple examples shown here, but would require some thought for complex
*listmakers*with multiple`for`and`if`clauses. Basically, we want[acc: exp for x in it1 if c1 for y in it2 if c2 ...]

to meanfrom itertools import ifilter accumulation(acc, lambda (x, y, ...): exp, icross_product(ifilter(c1, it1), ifilter(c2, it2), ...))

where`icross_product`takes the cross product of all the argument iterables, in left-to-right order. That is, it generates tuples in the order(c1[0], c2[0], ... cn[0]) (c1[0], c2[0], ... cn[1]) (c1[0], c2[0], ... cn[2]) ... (c1[0], c2[0], ... cn[-1]) (c1[0], c2[0], ... cn_1[1], cn[0]) (c1[0], c2[0], ... cn_1[1], cn[1]) ... (c1[-1], c2[-1], ... cn[-1])

Perhaps it should be a standard part of`itertools`.

This document is in the public domain and may be used for any purpose.