| 
									
										
										
										
											2003-02-01 00:10:11 +00:00
\section{\module{itertools} ---
         Functions creating iterators for efficient looping}

\declaremodule{standard}{itertools}
\modulesynopsis{Functions creating iterators for efficient looping.}
\moduleauthor{Raymond Hettinger}{python@rcn.com}
\sectionauthor{Raymond Hettinger}{python@rcn.com}
\versionadded{2.3}


This module implements a number of iterator building blocks inspired
by constructs from the Haskell and SML programming languages.  Each
has been recast in a form suitable for Python.
					
						
The module standardizes a core set of fast, memory-efficient tools
that are useful by themselves or in combination.  Standardization helps
avoid the readability and reliability problems which arise when many
different individuals create their own slightly varying implementations,
each with their own quirks and naming conventions.

The tools are designed to combine readily with one another.  This makes
it easy to construct more specialized tools succinctly and efficiently
in pure Python.

For instance, SML provides a tabulation tool, \code{tabulate(f)},
which produces a sequence \code{f(0), f(1), ...}.  This toolbox
provides \function{imap()} and \function{count()}, which can be combined
to form \code{imap(f, count())} and produce an equivalent result.
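The tabulation idiom can be sketched as follows.  This is only an
illustration: the \code{square} function is a hypothetical example, and
the \code{try}/\code{except} fallback is an assumption for interpreters
in which \function{imap()} is not available and the built-in
\function{map()} is already lazy.  Because the stream is infinite,
\function{islice()} is used to truncate it.

```python
from itertools import count, islice

try:
    from itertools import imap      # present where this module provides imap()
except ImportError:
    imap = map                      # assumption: built-in map() is lazy here

def square(n):
    # Hypothetical function to tabulate.
    return n * n

# imap(square, count()) is the infinite stream square(0), square(1), ...
# islice() takes the first five values so the evaluation terminates.
first_five = list(islice(imap(square, count()), 5))
```

Without the \function{islice()} call, \code{list()} would never return,
since \function{count()} never stops.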
					
						
Likewise, the functional tools are designed to work well with the
high-speed functions provided by the \refmodule{operator} module.
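As a hedged illustration of pairing the two modules, the sketch below
multiplies two columns element by element with \code{operator.mul}.
The data values and variable names are made up for the example, and the
\function{imap()} fallback is an assumption for interpreters that only
offer a lazy built-in \function{map()}.

```python
import operator

try:
    from itertools import imap      # present where this module provides imap()
except ImportError:
    imap = map                      # assumption: built-in map() is lazy here

prices = [2.5, 4.0, 1.25]           # illustrative data
quantities = [4, 1, 8]

# operator.mul stands in for a hand-written "lambda x, y: x * y".
line_totals = list(imap(operator.mul, prices, quantities))
order_total = sum(line_totals)
```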
					
						
The module author welcomes suggestions for other basic building blocks
to be added to future versions of the module.

Whether cast in pure Python form or compiled code, tools that use iterators
are more memory efficient (and faster) than their list-based counterparts.
Adopting the principles of just-in-time manufacturing, they create
data when and where needed instead of consuming memory with the
computer equivalent of ``inventory''.

The performance advantage of iterators becomes more acute as the number
of elements increases -- at some point, lists grow large enough to
severely impact memory cache performance and start running slowly.
					
						
\begin{seealso}
  \seetext{The Standard ML Basis Library,
           \citetitle[http://www.standardml.org/Basis/]
           {The Standard ML Basis Library}.}

  \seetext{Haskell, A Purely Functional Language,
           \citetitle[http://www.haskell.org/definition/]
           {Definition of Haskell and the Standard Libraries}.}
\end{seealso}


\subsection{Itertool functions \label{itertools-functions}}

The following module functions all construct and return iterators.
Some provide streams of infinite length, so they should only be accessed
by functions or loops that truncate the stream.
					
						
\begin{funcdesc}{chain}{*iterables}
  Make an iterator that returns elements from the first iterable until
  it is exhausted, then proceeds to the next iterable, until all of the
  iterables are exhausted.  Used for treating consecutive sequences as
  a single sequence.  Equivalent to:

  \begin{verbatim}
     def chain(*iterables):
         for it in iterables:
             for element in it:
                 yield element
  \end{verbatim}
\end{funcdesc}
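A short usage sketch for \function{chain()} follows; the three input
containers are illustrative only.  It shows that different kinds of
iterables can be mixed and are consumed strictly in order.

```python
from itertools import chain

header = ['name', 'qty']            # a list
row = ('apple', 3)                  # a tuple
footer = iter(['end'])              # an iterator

# chain() walks each argument to exhaustion before moving to the next.
combined = list(chain(header, row, footer))
```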
					
						
\begin{funcdesc}{count}{\optional{n}}
  Make an iterator that returns consecutive integers starting with \var{n}.
  If not specified, \var{n} defaults to zero.
  Does not currently support Python long integers.  Often used as an
  argument to \function{imap()} to generate consecutive data points.
  Also used with \function{izip()} to add sequence numbers.  Equivalent to:

  \begin{verbatim}
     def count(n=0):
         while True:
             yield n
             n += 1
  \end{verbatim}

  Note, \function{count()} does not check for overflow and will return
  negative numbers after exceeding \code{sys.maxint}.  This behavior
  may change in the future.
\end{funcdesc}
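The two uses of \function{count()} mentioned above can be sketched as
follows.  The \code{names} list is illustrative, and the
\function{izip()} fallback is an assumption for interpreters where only
a lazy built-in \function{zip()} exists.

```python
from itertools import count, islice

try:
    from itertools import izip      # present where this module provides izip()
except ImportError:
    izip = zip                      # assumption: built-in zip() is lazy here

names = ['alpha', 'beta', 'gamma']

# Add sequence numbers starting at 1; the finite list ends the pairing.
numbered = list(izip(count(1), names))

# On its own, count() must be truncated explicitly.
window = list(islice(count(10), 3))
```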
					
						
\begin{funcdesc}{cycle}{iterable}
  Make an iterator returning elements from the iterable and saving a
  copy of each.  When the iterable is exhausted, return elements from
  the saved copy.  Repeats indefinitely.  Equivalent to:

  \begin{verbatim}
     def cycle(iterable):
         saved = []
         for element in iterable:
             yield element
             saved.append(element)
         while saved:
             for element in saved:
                 yield element
  \end{verbatim}

  Note, this member of the toolkit may require significant
  auxiliary storage (depending on the length of the iterable).
\end{funcdesc}
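One possible use of \function{cycle()} is round-robin assignment,
sketched below with made-up task and worker names.  The
\function{izip()} fallback is an assumption for interpreters where only
a lazy built-in \function{zip()} exists.

```python
from itertools import cycle

try:
    from itertools import izip      # present where this module provides izip()
except ImportError:
    izip = zip                      # assumption: built-in zip() is lazy here

workers = ['red', 'green', 'blue']              # illustrative names
tasks = ['t1', 't2', 't3', 't4', 't5']

# cycle(workers) never ends; the finite task list bounds the pairing.
assigned = list(izip(tasks, cycle(workers)))
```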
					
						
\begin{funcdesc}{dropwhile}{predicate, iterable}
  Make an iterator that drops elements from the iterable as long as
  the predicate is true; afterwards, returns every element.  Note,
  the iterator does not produce \emph{any} output until the predicate
  first becomes false, so it may have a lengthy start-up time.  Equivalent to:

  \begin{verbatim}
     def dropwhile(predicate, iterable):
         iterable = iter(iterable)
         for x in iterable:
             if not predicate(x):
                 yield x
                 break
         for x in iterable:
             yield x
  \end{verbatim}
\end{funcdesc}
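The sketch below uses \function{dropwhile()} to skip a leading block of
comment lines.  The sample lines and the \code{is_comment} helper are
hypothetical.  Note that the predicate is no longer consulted once it
has returned false, so a later comment line passes through.

```python
from itertools import dropwhile

lines = ['# comment', '# another', 'data 1', '# not dropped', 'data 2']

def is_comment(line):
    # Hypothetical predicate: true for lines starting with '#'.
    return line.startswith('#')

# Only the leading run of comment lines is discarded.
body = list(dropwhile(is_comment, lines))
```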
					
						
\begin{funcdesc}{groupby}{iterable\optional{, key}}
  Make an iterator that returns consecutive keys and groups from the
  \var{iterable}.  The \var{key} is a function computing a key value for each
  element.  If \var{key} is not specified or is \code{None}, it defaults to an
  identity function and returns the element unchanged.  Generally, the
  iterable needs to already be sorted on the same key function.

  The returned group is itself an iterator that shares the underlying
  iterable with \function{groupby()}.  Because the source is shared, when
  the \function{groupby()} object is advanced, the previous group is no
  longer visible.  So, if that data is needed later, it should be stored
  as a list:

  \begin{verbatim}
    groups = []
    uniquekeys = []
    for k, g in groupby(data, keyfunc):
        groups.append(list(g))      # Store group iterator as a list
        uniquekeys.append(k)
  \end{verbatim}

  \function{groupby()} is equivalent to:

  \begin{verbatim}
    class groupby(object):
        def __init__(self, iterable, key=None):
            if key is None:
                key = lambda x: x
            self.keyfunc = key
            self.it = iter(iterable)
            self.tgtkey = self.currkey = self.currvalue = xrange(0)
        def __iter__(self):
            return self
        def next(self):
            while self.currkey == self.tgtkey:
                self.currvalue = self.it.next() # Exit on StopIteration
                self.currkey = self.keyfunc(self.currvalue)
            self.tgtkey = self.currkey
            return (self.currkey, self._grouper(self.tgtkey))
        def _grouper(self, tgtkey):
            while self.currkey == tgtkey:
                yield self.currvalue
                self.currvalue = self.it.next() # Exit on StopIteration
                self.currkey = self.keyfunc(self.currvalue)
  \end{verbatim}
  \versionadded{2.4}
\end{funcdesc}
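To illustrate why the input generally needs to be sorted on the key
function, the following sketch groups a small word list by first
letter, once as given and once after sorting.  The word list and the
\code{first_letter} helper are made up for the example.

```python
from itertools import groupby

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apricot']

def first_letter(word):
    # Hypothetical key function.
    return word[0]

# Unsorted input: 'apricot' starts a second, separate 'a' group,
# because groupby() only compares consecutive elements.
unsorted_keys = [k for k, g in groupby(words, first_letter)]

# Sorting on the same key first brings equal keys together.
ordered = sorted(words, key=first_letter)
grouped = [(k, list(g)) for k, g in groupby(ordered, first_letter)]
```

Each group is converted with \code{list(g)} inside the loop, before
\function{groupby()} advances to the next key.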
					
						
\begin{funcdesc}{ifilter}{predicate, iterable}
  Make an iterator that filters elements from iterable returning only
  those for which the predicate is \code{True}.
  If \var{predicate} is \code{None}, return the items that are true.
  Equivalent to:

  \begin{verbatim}
     def ifilter(predicate, iterable):
         if predicate is None:
             predicate = bool
         for x in iterable:
             if predicate(x):
                 yield x
  \end{verbatim}
\end{funcdesc}

\begin{funcdesc}{ifilterfalse}{predicate, iterable}
  Make an iterator that filters elements from iterable returning only
  those for which the predicate is \code{False}.
  If \var{predicate} is \code{None}, return the items that are false.
  Equivalent to:

  \begin{verbatim}
     def ifilterfalse(predicate, iterable):
         if predicate is None:
             predicate = bool
         for x in iterable:
             if not predicate(x):
                 yield x
  \end{verbatim}
\end{funcdesc}
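The pair \function{ifilter()} and \function{ifilterfalse()} can be used
together to split data by a predicate, as in the sketch below.  The
\code{is_even} helper and the sample values are illustrative; the
import fallback is an assumption for interpreters that provide the same
behavior under the names \function{filter()} and
\function{filterfalse()}.

```python
try:
    from itertools import ifilter, ifilterfalse
except ImportError:
    # Assumption: newer interpreters expose lazy equivalents under these names.
    from itertools import filterfalse as ifilterfalse
    ifilter = filter

def is_even(n):
    # Hypothetical predicate.
    return n % 2 == 0

values = [0, 1, 2, 3, 4, 5]

evens = list(ifilter(is_even, values))
odds = list(ifilterfalse(is_even, values))

# With None as the predicate, items are kept according to their own truth value.
truthy = list(ifilter(None, values))
```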
					
						
\begin{funcdesc}{imap}{function, *iterables}
  Make an iterator that computes the function using arguments from
  each of the iterables.  If \var{function} is set to \code{None}, then
  \function{imap()} returns the arguments as a tuple.  Like
  \function{map()} but stops when the shortest iterable is exhausted
  instead of filling in \code{None} for shorter iterables.  The reason
  for the difference is that infinite iterator arguments are typically
  an error for \function{map()} (because the output is fully evaluated)
  but represent a common and useful way of supplying arguments to
  \function{imap()}.
  Equivalent to:

  \begin{verbatim}
     def imap(function, *iterables):
         iterables = map(iter, iterables)
         while True:
             args = [i.next() for i in iterables]
             if function is None:
                 yield tuple(args)
             else:
                 yield function(*args)
  \end{verbatim}
\end{funcdesc}
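The stop-at-shortest behavior is what makes an infinite argument
practical, as the following sketch suggests.  The \code{label} helper
is hypothetical, and the fallback is an assumption for interpreters in
which the built-in \function{map()} is already lazy and stops at the
shortest input.

```python
from itertools import count

try:
    from itertools import imap      # present where this module provides imap()
except ImportError:
    imap = map                      # assumption: built-in map() is lazy here

def label(index, word):
    # Hypothetical two-argument function.
    return '%d:%s' % (index, word)

# count() is infinite; the three-element list ends the iteration.
labels = list(imap(label, count(), ['a', 'b', 'c']))

# Several finite iterables supply one argument each.
powers = list(imap(pow, [2, 3, 10], [5, 2, 3]))
```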
					
						
\begin{funcdesc}{islice}{iterable, \optional{start,} stop \optional{, step}}
  Make an iterator that returns selected elements from the iterable.
  If \var{start} is non-zero, then elements from the iterable are skipped
  until \var{start} is reached.  Afterward, elements are returned consecutively
  unless \var{step} is set higher than one, which results in items being
  skipped.  If \var{stop} is \code{None}, then iteration continues until
  the iterator is exhausted, if at all; otherwise, it stops at the specified
  position.  Unlike regular slicing,
  \function{islice()} does not support negative values for \var{start},
  \var{stop}, or \var{step}.  Can be used to extract related fields
  from data where the internal structure has been flattened (for
  example, a multi-line report may list a name field on every
  third line).  Equivalent to:

  \begin{verbatim}
     def islice(iterable, *args):
         s = slice(*args)
         next, stop, step = s.start or 0, s.stop, s.step or 1
         for cnt, element in enumerate(iterable):
             if cnt < next:
                 continue
             if stop is not None and cnt >= stop:
                 break
             yield element
             next += step
  \end{verbatim}
\end{funcdesc}
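The flattened-report case can be sketched like this.  The report
contents are invented for the example; each record is assumed to occupy
exactly three consecutive lines.

```python
from itertools import islice

# Illustrative flattened report: name, street, e-mail repeated per record.
report = ['Alice', '42 Elm St', 'alice@example.com',
          'Bob', '7 Oak Ave', 'bob@example.com']

# start=0, stop=None, step=3 selects every third line from the first.
names = list(islice(report, 0, None, 3))

# Shifting start by one selects the second field of each record.
streets = list(islice(report, 1, None, 3))

# With a single argument, that argument is taken as stop.
first_two = list(islice(report, 2))
```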
					
						
\begin{funcdesc}{izip}{*iterables}
  Make an iterator that aggregates elements from each of the iterables.
  Like \function{zip()} except that it returns an iterator instead of
  a list.  Used for lock-step iteration over several iterables at a
  time.  Equivalent to:

  \begin{verbatim}
     def izip(*iterables):
         iterables = map(iter, iterables)
         while iterables:
             result = [i.next() for i in iterables]
             yield tuple(result)
  \end{verbatim}

  \versionchanged[When no iterables are specified, returns a zero length
                  iterator instead of raising a TypeError exception]{2.4}
\end{funcdesc}
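A brief sketch of lock-step iteration with \function{izip()} follows.
The key and value lists are illustrative, and the fallback is an
assumption for interpreters where only a lazy built-in
\function{zip()} exists.  As in the equivalent code above, iteration
ends as soon as any input is exhausted.

```python
try:
    from itertools import izip      # present where this module provides izip()
except ImportError:
    izip = zip                      # assumption: built-in zip() is lazy here

keys = ['host', 'port', 'debug']
values = ['localhost', 8080, False, 'extra']    # one surplus element

# The surplus element is never paired because keys runs out first.
pairs = list(izip(keys, values))
config = dict(pairs)
```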
					
						
\begin{funcdesc}{repeat}{object\optional{, times}}
  Make an iterator that returns \var{object} over and over again.
  Runs indefinitely unless the \var{times} argument is specified.
  Used as an argument to \function{imap()} for invariant parameters
  to the called function.  Also used with \function{izip()} to create
  an invariant part of a tuple record.  Equivalent to:

  \begin{verbatim}
     def repeat(object, times=None):
         if times is None:
             while True:
                 yield object
         else:
             for i in xrange(times):
                 yield object
  \end{verbatim}
\end{funcdesc}
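The invariant-parameter use can be sketched as below, where
\code{repeat(2)} supplies a constant exponent to \function{pow()}.  The
input values are illustrative, and the \function{imap()} fallback is an
assumption for interpreters in which the built-in \function{map()} is
already lazy.

```python
from itertools import repeat

try:
    from itertools import imap      # present where this module provides imap()
except ImportError:
    imap = map                      # assumption: built-in map() is lazy here

# repeat(2) is infinite; the finite list of bases ends the iteration.
squares = list(imap(pow, [1, 2, 3, 4], repeat(2)))

# With times given, repeat() is finite on its own.
dashes = list(repeat('-', 3))
```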
					
						
\begin{funcdesc}{starmap}{function, iterable}
  Make an iterator that computes the function using argument tuples
  obtained from the iterable.  Used instead of \function{imap()} when
  argument parameters are already grouped in tuples from a single iterable
  (the data has been ``pre-zipped'').  The difference between
  \function{imap()} and \function{starmap()} parallels the distinction
  between \code{function(a,b)} and \code{function(*c)}.
  Equivalent to:

  \begin{verbatim}
     def starmap(function, iterable):
         iterable = iter(iterable)
         while True:
             yield function(*iterable.next())
  \end{verbatim}
\end{funcdesc}
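For ``pre-zipped'' data, a minimal sketch looks like this; the list of
pairs is illustrative.

```python
from itertools import starmap

# Each tuple already holds the full argument list for one call.
pairs = [(2, 5), (3, 2), (10, 3)]

# Evaluates pow(2, 5), pow(3, 2), pow(10, 3) in turn.
powers = list(starmap(pow, pairs))
```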
					
						
\begin{funcdesc}{takewhile}{predicate, iterable}
  Make an iterator that returns elements from the iterable as long as
  the predicate is true.  Equivalent to:

  \begin{verbatim}
     def takewhile(predicate, iterable):
         for x in iterable:
             if predicate(x):
                 yield x
             else:
                 break
  \end{verbatim}
\end{funcdesc}
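Because \function{takewhile()} stops at the first failing element, it
is one way to truncate an infinite stream.  The sketch below is
illustrative only; the \code{below_fifty} helper and the sample data
are made up.

```python
from itertools import count, takewhile

def below_fifty(n):
    # Hypothetical predicate.
    return n < 50

# Squares of 0, 1, 2, ... taken until the first one that is 50 or more.
small_squares = list(takewhile(below_fifty, (n * n for n in count())))

# Elements after the first failure are not returned, even if they would pass.
prefix = list(takewhile(below_fifty, [10, 20, 99, 30]))
```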
					
						
\begin{funcdesc}{tee}{iterable\optional{, n=2}}
  Return \var{n} independent iterators from a single iterable.
  The case where \code{n==2} is equivalent to:

  \begin{verbatim}
     def tee(iterable):
         # data and cnt are created once per tee() call and are
         # shared by both generators returned below
         def gen(next, data={}, cnt=[0]):
             for i in count():
                 if i == cnt[0]:
                     item = data[i] = next()
                     cnt[0] += 1
                 else:
                     item = data.pop(i)
                 yield item
         it = iter(iterable)
         return (gen(it.next), gen(it.next))
  \end{verbatim}

  Once \function{tee()} has made a split, the original \var{iterable}
  should not be used anywhere else; otherwise, the \var{iterable} could be
  advanced without the tee objects being informed.

  This member of the toolkit may require significant auxiliary
  storage, depending on how much temporary data needs to be stored.
  In general, if one iterator is going to use most or all of the data before
  the other iterator, it is faster to use \function{list()} instead of
  \function{tee()}.
  \versionadded{2.4}
\end{funcdesc}
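
As a minimal sketch of the \function{tee()} behavior described above, the
following shows two iterators consuming the same data independently, and what
can go wrong when the original iterator is advanced directly after the split.
The variable names are illustrative only, and the built-in \function{next()}
used here is an assumption about the Python version in use (on 2.4, spell it
\code{a.next()}).

```python
from itertools import tee

source = iter([1, 2, 3, 4])
a, b = tee(source)               # from here on, use a and b, not source

first_two = [next(a), next(a)]   # a runs ahead; tee buffers 1 and 2 for b
all_of_b = list(b)               # b still sees every item from the start
rest_of_a = list(a)              # a resumes where it left off

# Advancing the original iterator behind tee's back loses an item:
# neither c nor d ever sees 'a'.
letters = iter('abcd')
c, d = tee(letters)
skipped = next(letters)
seen_by_c = list(c)
seen_by_d = list(d)
```

The buffer held for \code{b} grows with the distance between the two
iterators, which is the auxiliary storage cost noted above.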
					
						
\subsection{Examples \label{itertools-example}}

The following examples show common uses for each tool and
demonstrate ways they can be combined.

\begin{verbatim}

>>> amounts = [120.15, 764.05, 823.14]
>>> for checknum, amount in izip(count(1200), amounts):
...     print 'Check %d is for $%.2f' % (checknum, amount)
...
Check 1200 is for $120.15
Check 1201 is for $764.05
Check 1202 is for $823.14

>>> import operator
>>> for cube in imap(operator.pow, xrange(1,5), repeat(3)):
...     print cube
...
1
8
27
64

>>> reportlines = ['EuroPython', 'Roster', '', 'alex', '', 'laura',
...                '', 'martin', '', 'walter', '', 'mark']
>>> for name in islice(reportlines, 3, None, 2):
...     print name.title()
...
Alex
Laura
Martin
Walter
Mark

# Show a dictionary sorted and grouped by value
>>> from operator import itemgetter
>>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
>>> di = sorted(d.iteritems(), key=itemgetter(1))
>>> for k, g in groupby(di, key=itemgetter(1)):
...     print k, map(itemgetter(0), g)
...
1 ['a', 'c', 'e']
2 ['b', 'd', 'f']
3 ['g']

# Find runs of consecutive numbers using groupby.  The key to the solution
# is differencing with a range so that consecutive numbers all appear in
# the same group.
>>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
>>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
...     print map(operator.itemgetter(1), g)
...
[1]
[4, 5, 6]
[10]
[15, 16, 17, 18]
[22]
[25, 26, 27, 28]

\end{verbatim}
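
The differencing idea in the last example can also be sketched without a
tuple-unpacking \keyword{lambda}, which not every Python version accepts.
The helper name \code{runs} is hypothetical and is not part of the module;
the sketch assumes the input is already sorted in increasing order.

```python
from itertools import groupby

def runs(numbers):
    "Split increasing integers into lists of consecutive values (hypothetical helper)."
    result = []
    # index - value stays constant for as long as the values are consecutive,
    # so groupby() starts a new group exactly where a run breaks.
    for key, group in groupby(enumerate(numbers), lambda pair: pair[0] - pair[1]):
        result.append([value for index, value in group])
    return result

data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
consecutive = runs(data)
```

For the data above, the keys are -1, -3, -6, -10, -13 and -15, one per run.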
					
						

\subsection{Recipes \label{itertools-recipes}}

This section shows recipes for creating an extended toolset using the
existing itertools as building blocks.

The extended tools offer the same high performance as the underlying
toolset.  The superior memory performance is kept by processing elements one
at a time rather than bringing the whole iterable into memory all at once.
Code volume is kept small by linking the tools together in a functional style,
which helps eliminate temporary variables.  High speed is retained by
preferring ``vectorized'' building blocks over the use of for-loops and
generators, which incur interpreter overhead.
					
						

\begin{verbatim}
from itertools import *
import operator

def take(n, seq):
    "Returns the first n items of seq as a list"
    return list(islice(seq, n))

def enumerate(iterable):
    "Returns (0, item0), (1, item1), ..."
    return izip(count(), iterable)

def tabulate(function):
    "Return function(0), function(1), ..."
    return imap(function, count())

def iteritems(mapping):
    "Returns the (key, value) pairs of the mapping"
    return izip(mapping.iterkeys(), mapping.itervalues())

def nth(iterable, n):
    "Returns a list holding the nth item, or an empty list if there is none"
    return list(islice(iterable, n, n+1))

def all(seq, pred=bool):
    "Returns True if pred(x) is True for every element in the iterable"
    for elem in ifilterfalse(pred, seq):
        return False
    return True

def any(seq, pred=bool):
    "Returns True if pred(x) is True for at least one element in the iterable"
    for elem in ifilter(pred, seq):
        return True
    return False

def no(seq, pred=bool):
    "Returns True if pred(x) is False for every element in the iterable"
    for elem in ifilter(pred, seq):
        return False
    return True

def quantify(seq, pred=bool):
    "Count how many times the predicate is True in the sequence"
    return sum(imap(pred, seq))

def padnone(seq):
    """Returns the sequence elements and then returns None indefinitely.

    Useful for emulating the behavior of the built-in map() function.
    """
    return chain(seq, repeat(None))

def ncycles(seq, n):
    "Returns the sequence elements n times"
    return chain(*repeat(seq, n))

def dotproduct(vec1, vec2):
    "Returns the sum of the elementwise products of vec1 and vec2"
    return sum(imap(operator.mul, vec1, vec2))

def flatten(listOfLists):
    "Flattens one level of nesting into a single list"
    return list(chain(*listOfLists))

def repeatfunc(func, times=None, *args):
    """Repeat calls to func with specified arguments.

    Example:  repeatfunc(random.random)
    """
    if times is None:
        return starmap(func, repeat(args))
    else:
        return starmap(func, repeat(args, times))

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    try:
        b.next()
    except StopIteration:
        pass
    return izip(a, b)

\end{verbatim}
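
To illustrate how the recipes combine, the sketch below chains
\code{tabulate}, \code{take} and \code{pairwise}.  It repeats the three
recipes so that it is self-contained.  Two details are assumptions made only
so the sketch runs across Python versions: the fallback that binds
\code{imap} and \code{izip} to the built-in \function{map()} and
\function{zip()} when the itertools names are unavailable, and the
\keyword{for}/\keyword{break} idiom used in \code{pairwise} in place of
\code{b.next()}.

```python
from itertools import count, islice, tee
try:
    from itertools import imap, izip    # names used throughout this section
except ImportError:
    imap, izip = map, zip               # assumption: these built-ins are lazy

def take(n, seq):
    return list(islice(seq, n))

def tabulate(function):
    "Return function(0), function(1), ..."
    return imap(function, count())

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    for first in b:     # advance b by one item, if there is one
        break
    return izip(a, b)

# tabulate() is infinite; take() pulls only the first five results.
squares = take(5, tabulate(lambda x: x * x))

# Differences between neighbouring squares, one pair at a time.
gaps = [y - x for x, y in pairwise(squares)]
```

Only \code{take} builds a list; the other two recipes stay lazy, so the
infinite iterator from \code{tabulate} is never fully evaluated.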