2010-03-09

Anatoliy Larin | Software developer

List clustering

Today I will tell you about just one line of code:

zip(*[a.__iter__()]*3)

If you already know how it works, you can skip straight to the comments. :)

This statement may look short and simple, but it is actually a very interesting and sometimes useful Python construction. In job interviews we ask candidates to explain how it works. If a person handles it well, that is a good sign.

Here is an example of what it does:

>>> a = [1,2,3,4,5,6,7,8,10]
>>> zip(*[a.__iter__()]*3)
[(1, 2, 3), (4, 5, 6), (7, 8, 10)]

Let's look at it piece by piece.

__iter__()

Returns an iterator object for a list (or a tuple, or any other iterable object).

This method is used by the interpreter when it encounters a list in, for example, a for loop.

The iterator object itself has a next() method, which returns the list items one by one.
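
A minimal sketch of what a for loop does behind the scenes. It is written for Python 3, where the built-ins iter() and next() are the usual way to call __iter__() and the iterator's next method; the variable names are only illustrative.

```python
a = [1, 2, 3]

# iter(a) calls a.__iter__() and returns a fresh iterator over the list.
it = iter(a)

# Each next() call advances the iterator by one item.
first = next(it)
second = next(it)
third = next(it)

# Once the items run out, next() raises StopIteration;
# a for loop relies on this exception to know when to stop.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```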

[]*3

Multiplying a list creates a new list in which the items of the source list are repeated the given number of times:

>>> [1,2,3] * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]

The important thing here: names and list items in Python are references to objects, and multiplying a list copies the references, not the objects. This means that [a.__iter__()]*3 gives us a list of three references to the same iterator object.

A little example (note that the address of the listiterator object is the same in all three entries):

>>> iters = [a.__iter__()]*3
>>> print iters
[<listiterator object at 0x85db0>, <listiterator object at 0x85db0>, <listiterator object at 0x85db0>]
>>> (iters[0].next(), iters[1].next(), iters[2].next())
(1, 2, 3)
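
To see why the sharing matters, the following sketch (Python 3 syntax, with illustrative names shared and separate) compares the multiplied list with a list of three independently created iterators. With separate iterators, each one starts from the beginning of the list.

```python
a = [1, 2, 3, 4, 5, 6]

# List multiplication copies the reference: one iterator, three entries.
shared = [iter(a)] * 3
same_object = shared[0] is shared[1] is shared[2]
from_shared = tuple(next(it) for it in shared)

# A list comprehension calls iter(a) three times: three separate iterators.
separate = [iter(a) for _ in range(3)]
from_separate = tuple(next(it) for it in separate)

print(same_object)     # True
print(from_shared)     # (1, 2, 3)
print(from_separate)   # (1, 1, 1)
```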

*

Takes an iterable object and passes its members to a function as positional (non-keyword) arguments.

This is one of my favourite Python features. The following lines are equivalent:

my_func(*[1,2,3])
my_func(1,2,3)
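
A small runnable sketch of the same equivalence. Here my_func is a hypothetical three-argument function defined only for this illustration; the last call shows that the star accepts any iterable, including an iterator.

```python
def my_func(x, y, z):
    # Hypothetical function: just returns its arguments so we can inspect them.
    return (x, y, z)

args = [1, 2, 3]

unpacked = my_func(*args)     # the list items become x, y, z
explicit = my_func(1, 2, 3)

# Any iterable works, including an iterator.
from_iterator = my_func(*iter("abc"))

print(unpacked == explicit)   # True
print(from_iterator)          # ('a', 'b', 'c')
```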

zip

zip returns a list of tuples, where the i-th tuple consists of the i-th member of each argument. For example:

>>> zip((1,2), (3,4), (4,5))
[(1, 3, 4), (2, 4, 5)]
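
One detail worth knowing: zip stops as soon as its shortest argument runs out. The sketch below is written for Python 3, where zip returns a lazy iterator rather than a list, so the results are wrapped in list() to display them.

```python
pairs = list(zip((1, 2), (3, 4), (4, 5)))

# zip stops as soon as the shortest argument is exhausted,
# so the third item of the first argument is left out.
truncated = list(zip([1, 2, 3], "ab"))

print(pairs)      # [(1, 3, 4), (2, 4, 5)]
print(truncated)  # [(1, 'a'), (2, 'b')]
```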

The official Python documentation states:

The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using zip(*[iter(s)]*n).
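
The idiom can be wrapped in a small helper. The sketch below is written for Python 3; the function names cluster and cluster_padded are made up for this example. Note that plain zip silently drops trailing items that do not fill a complete group, while itertools.zip_longest pads the last group instead.

```python
from itertools import zip_longest

def cluster(seq, n):
    """Group seq into n-tuples; leftover items that do not fill a tuple are dropped."""
    return list(zip(*[iter(seq)] * n))

def cluster_padded(seq, n, fill=None):
    """Same grouping, but pad the last tuple with fill instead of dropping items."""
    return list(zip_longest(*[iter(seq)] * n, fillvalue=fill))

data = [1, 2, 3, 4, 5, 6, 7, 8]
print(cluster(data, 3))         # [(1, 2, 3), (4, 5, 6)]
print(cluster_padded(data, 3))  # [(1, 2, 3), (4, 5, 6), (7, 8, None)]
```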

You may have recognized the construction we are trying to understand here. So this isn't some Perl-style one-liner, but an officially documented Python idiom.

Use the power of Python,
and may the Force be with you.

Comments
  • Vadim Fint comments...
    Yes and no. The resulting one-liner is quite unreadable. This one is a little bit faster and more readable, but produces lists, not tuples: [x[n:n+3] for n in xrange(0,len(x)-1,3)] In any case, the better approach is to write nice small generators. Of course, they will be a lot slower than zipping one iterator several times. But if you need speed, write the stuff in C. Oh, yeah, maybe you want to squeeze your brain a little... Sure, this is a perfect task for that :) Anyone can write complex one-liners in Python (generators, iterators, etc.). They are quite fast, yes. But... wait a minute... we already have Erlang for them! Huge one-liners :).
    2010-04-28
  • Anatoliy Larin
    Thank you for the response. Yep, your variant is more readable, but not as fun :) And I think that Erlang-like one-liners are not so bad in Python. Python is many-sided, and we can use the right ability in the right place. http://divmod.org/trac/wiki/PotatoProgramming
    2010-04-29

© 2005 - 2010 e-Legion Ltd.