How to think like a Pythonista

This is an archive of part of a thread on comp.lang.python, kept here so I can post links to it when it seems appropriate (it dealt with a question that comes up relatively often).

If you want, you can read the thread on Google instead.

In the recent past (this is being written in April 2002, for reference), a searcher after enlightenment posted the following query to comp.lang.python:

Hello,

one thing I like very much about Python is that statements
work like you would expect them to work. Take for example
the use of dict.values() for dictionaries: If you store the
result of dict.values(), and change the dictionary after-
wards, the previously stored result remains untouched.

>>> dict = {'a':1,'b':2}
>>> list = dict.values()
>>> list
[1, 2]
>>> dict['a']=3
>>> list
[1, 2]
>>> dict
{'a': 3, 'b': 2}

However, if a dictionary has lists as value entries, I get
a counterintuitive behavior (which, just recently, broke my
code): If you change the dict, the list you previously
created via dict.values() gets automagically updated. A nice
feature, but nothing I would have expected!

>>> dict = {'a':[1],'b':[2]}
>>> list = dict.values()
>>> list
[[1], [2]]
>>> dict['a'].append(3)
>>> dict
{'a': [1, 3], 'b': [2]}
>>> list
[[1, 3], [2]]

Looks like that in the first case a copy is returned while
in the latter case list references are returned. Ok, but
according to Python's philosophy I shouldn't mind if I work
with lists in the dictionary or anything else. If the
behavior depends on the knowledge of the type of values I
put into a dictionary, I find that somehow counterintuitive.

Who is wrong here: my intuition or Python (2.2)? If it's
my intuition, how can I train my thinking about Python's
execution model, so that my intuition get's better ;-)

Almost needless to say, it was the poster's intuition that was at fault, but he is (was) far from unique in having this sort of misconception.

Luckily for him, two more Pythonically experienced posters -- myself and Alex Martelli -- were in a particularly pedagogical mood that day, and wrote lengthy articles explaining in rather different ways where he was going wrong.

I ranted about thinking in terms of "names, objects and bindings", something I don't do enough, and drew some ascii art diagrams explaining what was going on under the covers of the interactive sessions the OP was confused about:

> one thing I like very much about Python is that statements
> work like you would expect them to work.

Well, Python works very much as I expect it, but it's not clear if
this says more about me or Python <wink>.

At the end of your email, you say:

> Who is wrong here: my intuition or Python (2.2)? If it's
> my intuition, how can I train my thinking about Python's
> execution model, so that my intuition get's better ;-)

It's you :) As I can't read my email at the moment[1], I have no
better way of wasting my time to hand than drawing you some ascii art.

First, some terminology.  Actually, the very first thing is some
anti-terminology; I find the word "variable" to be particularly
unhelpful in a Python context.  I prefer "names", "bindings" and
"objects".

Names look like this:

    ,-----.
    | foo |
    `-----'

Names live in namespaces, but that's not really important for the
matter at hand as the only namespace in play is the one associated
with the read-eval-print loop of the interpreter.  In fact names are
only minor players in the current drama; bindings and objects are the
real stars.

Bindings look like this:

    ------------>

Bindings' left ends can be attached to names, or other "places" such
as attributes of objects and entries in lists or dictionaries.  Their
right hand ends are always attached to objects[2].

Objects look like this:

    +-------+
    | "bar" |
    +-------+

This is meant to be the string "bar".  Other types of object will be
drawn differently, but I hope you'll work out what I'm up to.

> Take for example the use of dict.values() for dictionaries: If you
> store the result of dict.values(), and change the dictionary after-
> wards, the previously stored result remains untouched.
> 
> >>> dict = {'a':1,'b':2}

After this statement, it would seem appropriate to draw this picture:

    ,------.       +-------+
    | dict |------>|+-----+|     +---+
    `------'       || "a" |+---->| 1 |
                   |+-----+|     +---+
                   |+-----+|     +---+
                   || "b" |+---->| 2 |
                   |+-----+|     +---+
                   +-------+

> >>> list = dict.values()

Now this:

    ,------.       +-------+
    | dict |------>|+-----+|             +---+
    `------'       || "a" |+------------>| 1 |
                   |+-----+|             +---+
                   |+-----+|              /\
                   || "b" |+-----.    ,---'
                   |+-----+|     |    |
                   +-------+     `----+----.
                                      |    |
    ,------.       +-----+            |    \/
    | list |------>| [0]-+------------'   +---+
    `------'       | [1]-+--------------->| 2 |
                   +-----+                +---+

> >>> list
> [1, 2]

Which is, of course, no surprise.

> >>> dict['a']=3

Now this:


    ,------.       +-------+
    | dict |------>|+-----+|             +---+
    `------'       || "a" |+-.           | 1 |
                   |+-----+| |           +---+
                   |+-----+| |            /\
                   || "b" |+-+---.    ,---'
                   |+-----+| |   |    |
                   +-------+ |   `----+----.
                             |        |    |
    ,------.       +-----+   |        |    \/
    | list |------>| [0]-+---+--------'   +---+
    `------'       | [1]-+---+----------->| 2 |
                   +-----+   |            +---+
                             |            +---+
                             `----------->| 3 |
                                          +---+


> >>> list
> [1, 2]
> >>> dict
> {'a': 3, 'b': 2}

These should also come as no surprise; just chase the arrows
(bindings) above.

> However, if a dictionary has lists as value entries, I get
> a counterintuitive behavior (which, just recently, broke my
> code): If you change the dict, the list you previously
> created via dict.values() gets automagically updated. A nice
> feature, but nothing I would have expected!

That's because you're not thinking in terms of Names, Objects and
Bindings.

> >>> dict = {'a':[1],'b':[2]}

    ,------.       +-------+
    | dict |------>|+-----+|     +-----+   +---+
    `------'       || "a" |+---->| [0]-+-->| 1 |
                   |+-----+|     +-----+   +---+
                   |+-----+|     +-----+   +---+
                   || "b" |+---->| [0]-+-->| 2 |
                   |+-----+|     +-----+   +---+
                   +-------+

> >>> list = dict.values()

    ,------.       +-------+
    | dict |------>|+-----+|             +-----+   +---+
    `------'       || "a" |+------------>| [0]-+-->| 1 |
                   |+-----+|             +-----+   +---+
                   |+-----+|               /\
                   || "b" |+-----.    ,----'
                   |+-----+|     |    |
                   +-------+     `----+-----.
                                      |     |
    ,------.       +-----+            |     \/
    | list |------>| [0]-+------------'   +-----+   +---+
    `------'       | [1]-+--------------->| [0]-+-->| 2 |
                   +-----+                +-----+   +---+

> >>> list
> [[1], [2]]

Again, no surprises here.

> >>> dict['a'].append(3)

                                                    +---+
    ,------.       +-------+                     ,->| 1 |
    | dict |------>|+-----+|             +-----+ |  +---+
    `------'       || "a" |+------------>| [0]-+-'
                   |+-----+|             | [1]-+-.
                   |+-----+|             +-----+ |  +---+
                   || "b" |+-----.         /\    `->| 3 |
                   |+-----+|     |    ,----'        +---+
                   +-------+     |    |
                                 `----+-----.
    ,------.       +-----+            |     \/
    | list |------>| [0]-+------------'   +-----+   +---+
    `------'       | [1]-+--------------->| [0]-+-->| 2 |
                   +-----+                +-----+   +---+

> >>> dict
> {'a': [1, 3], 'b': [2]}
> >>> list
> [[1, 3], [2]]

And now these should not be surprising either.
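
The two sessions above can be replayed as a short script. This is only a
sketch for a current Python 3 interpreter, not part of the original
exchange; the names d1, vals1, d2 and vals2 are illustrative, and the
"is" operator is used to check that two bindings end at the same object.

```python
# Python 3 sketch. list(...) is needed because in Python 3 dict.values()
# returns a view rather than the plain list that Python 2.2 returned.

# Case 1: rebind a dictionary entry.
d1 = {'a': 1, 'b': 2}
vals1 = list(d1.values())
d1['a'] = 3                  # the 'a' slot now points at the object 3
print(vals1)                 # [1, 2]: the list's own bindings did not move
print(d1)                    # {'a': 3, 'b': 2}

# Case 2: mutate an object that both containers point at.
d2 = {'a': [1], 'b': [2]}
vals2 = list(d2.values())
print(vals2[0] is d2['a'])   # True: one list object, two bindings to it
d2['a'].append(3)            # changes that object; no binding is moved
print(vals2)                 # [[1, 3], [2]]
print(vals2[0] is d2['a'])   # True
```

In the first case an arrow is moved; in the second the object at the end
of a shared arrow is changed, which is why both containers show it.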

> Looks like that in the first case a copy is returned while
> in the latter case list references are returned. Ok, but
> according to Python's philosophy I shouldn't mind if I work
> with lists in the dictionary or anything else. If the
> behavior depends on the knowledge of the type of values I
> put into a dictionary, I find that somehow counterintuitive.

If you haven't realised from the above pictures where your
misconceptions come from, I'm not sure more prose would help.

Cheers,
M.
[1] Does anyone know where the starship's gone?
[2] Anyone mentioning UnboundLocalError at this point will be shot.

-- 
  A.D. 1517: Martin Luther nails his 95 Theses to the church door and
             is promptly moderated down to (-1, Flamebait).
        -- http://slashdot.org/comments.pl?sid=01/02/09/1815221&cid=52
                                        (although I've seen it before)

My access to email returned not long after posting, so my time wasting became more normal again. I might redraw the diagrams in dia or something someday (although probably only if my email goes down again...).

Alex took a different, wordier strategy, explaining that Python doesn't copy when it doesn't have to, recounting a very nice anecdote about a statue in Bologna and suggesting that the OP read some Borges, Calvino, Wittgenstein or Korzybski:

> Hello,
> 
> one thing I like very much about Python is that statements
> work like you would expect them to work. Take for example
> the use of dict.values() for dictionaries: If you store the
> result of dict.values(), and change the dictionary after-
> wards, the previously stored result remains untouched.

The .values() method of a dictionary is defined to return
a new list of the values.  That's more or less inevitable, 
since a dictionary doesn't _have_ a list of its values
normally, so it must build it on the fly when you ask for it.
It's not a copy -- it's a new list object.

However, Python does NOT copy except in situations where
a copy is specifically defined to happen.  The .values()
method is only in a vague sense such a situation, as mentioned:
it gives a new object, rather than a copy of any existing one.

In general, whenever possible, Python returns references 
to the same objects it already had around, rather than 
copying; if you DO want a copy you ask for it -- see module 
copy if you want to do so in a general way.  Of course,
building new objects is a different case.

If this is counterintuitive, so be it -- there is really
no alternative in the general case without imposing huge
overhead, making copies of everything "just in case".
MUCH better to get copies only on explicit request (and
new objects, when there's no existing object that could
either be copied or referred-to).

Of course there are in-between cases -- such as slices.

The standard sequences give you a new object when you 
ask for a slice; this only matters for lists (for immutable
objects you shouldn't care if you get copies or what).
A list is not able to "share a part of itself", so when
asked for a slice it gives out a copy, a new list (for
generality, of course, it then also does when asked for
a slice-of-everything, thelist[:] -- so in that limit case
the new object can be seen as a copy of the existing one).
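
The slice behaviour described above can be checked directly. A minimal
sketch, with illustrative names (outer, sliced) that do not come from
the thread:

```python
outer = [[1], [2]]
sliced = outer[:]            # new list object, same elements

print(sliced is outer)       # False: a distinct list
print(sliced[0] is outer[0]) # True: the items themselves are shared

outer.append([3])            # changes only the original list's bindings
print(sliced)                # [[1], [2]]

outer[0].append(99)          # mutates an item both lists refer to
print(sliced)                # [[1, 99], [2]]
```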

The justly popular Numeric package, on the other hand,
defines an array type which IS able to share some or all 
data among several array objects -- so a slice of a Numeric 
array does share data with the array it's sliced from.  It's
a new object, mind you:

>>> import Numeric
>>> a=Numeric.array(range(6))
>>> b=a[:]
>>> id(a)
136052568
>>> id(b)
136052728
>>>

but the two distinct objects a and b do share data:

>>> a
array([0, 1, 2, 3, 4, 5])
>>> b
array([0, 1, 2, 3, 4, 5])
>>> a[3]=23
>>> b
array([ 0,  1,  2, 23,  4,  5])
>>>


Each behavior has excellent pragmatics behind it -- lists 
are _way_ simpler by not having to worry about data sharing, 
arrays have different use cases by far -- but it's hard to 
be unsurprising when two somewhat similar objects differ 
in such details.
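
Numeric itself has long been retired (its successor NumPy keeps the same
data-sharing slices). For readers without either package, the standard
library's memoryview shows the same kind of behaviour: slicing gives a
new object that shares the underlying data. This is a stand-in chosen
for illustration, not the Numeric API, and the names data, a and b are
assumptions of the sketch.

```python
data = bytearray(range(6))
a = memoryview(data)
b = a[:]                     # a new memoryview object over the same buffer

print(a is b)                # False: two distinct objects
a[3] = 23                    # write through one of them
print(list(b))               # [0, 1, 2, 23, 4, 5]: visible through the other
```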

But all of the copies which do "happen", e.g. by the
limit case of list slicing or whatever else (with ONE 
exception of which more later) are always SHALLOW copies.

NEVER does Python embark on the HUGE task of _deep_ copying 
unless you very specifically ask it to -- specifically with 
function deepcopy of module copy.  DEEP copying is a serious 
matter -- function deepcopy has to watch out for cycles, 
reproduce any identity of references, potentially follow 
references to any depth, recursively -- it has to reproduce
faithfully a graph of objects referencing each other
with unbounded complexity.  It works, but of course it
can never be as fast as the mundane business of shallow
copying (which in turn is never as fast as just handing
out one more reference to an existing object, whenever
the latter course of action is feasible).
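
To make the shallow/deep distinction concrete, here is a small sketch
using the copy module mentioned above. The names inner, original,
shallow, deep, loop and loop_copy are illustrative only.

```python
import copy

inner = [1]
original = {'a': inner, 'b': inner}   # two bindings to one list object

shallow = copy.copy(original)         # new dict, same value objects
deep = copy.deepcopy(original)        # new dict, new value objects

print(shallow['a'] is inner)          # True: the value is shared
print(deep['a'] is inner)             # False: the value was copied too
print(deep['a'] is deep['b'])         # True: identity of references is kept

# deepcopy also copes with a structure that refers to itself.
loop = []
loop.append(loop)
loop_copy = copy.deepcopy(loop)
print(loop_copy[0] is loop_copy)      # True: the cycle is reproduced
print(loop_copy is loop)              # False: but in a new object
```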


So, that's what has apparently snagged you here:

> However, if a dictionary has lists as value entries, I get
> a counterintuitive behavior (which, just recently, broke my
> code): If you change the dict, the list you previously
> created via dict.values() gets automagically updated. A nice
> feature, but nothing I would have expected!

Not really -- if you change _objects to which the dict refers_
(rather than changing the dict in se), then other references
to just-the-same-objects remain references to just the same
objects -- if the objects mutate, you see the mutated objects
from whatever references to them you may be using.


> >>> dict = {'a':[1],'b':[2]}
> >>> list = dict.values()
> >>> list
> [[1], [2]]

Don't use the names of built-in types as variables: you WILL
be burned one day if you do this.  dict, list, str, tuple, file,
int, long, float, unicode... do NOT use these identifiers for 
your own purposes, tempting though they may be (an "attractive
nuisance", to be sure).  If you don't get into the habit of
avoiding them, one day you'll be trying to (e.g.) build a
list with x=list('ciao') and get puzzling errors... because
you have rebound identifier 'list' to refer to a certain list
object rather than to the list type itself.

Use alist, somedict, myfile, whatever... nothing to do with
your problem here, just some other simple advice!-)
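
A sketch of the kind of "puzzling error" meant here, confined to a
function so that the built-in name is only shadowed locally. The
function name shadowed and the variable message are hypothetical.

```python
def shadowed():
    list = [1, 2]            # rebinds the name 'list' inside this function
    return list('ciao')      # now calls a list object, not the list type

try:
    shadowed()
except TypeError as exc:
    message = str(exc)       # e.g. "'list' object is not callable"

print(message)
print(list('ciao'))          # outside the function the built-in is intact
```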


> >>> dict['a'].append(3)

This does not "change the dictionary" -- the dictionary object
still contains exactly the same references, to objects with
the same id's (two string objects, the keys, and two list
objects, the values).  You're changing (mutating) one of those
objects, but that's quite another issue.  You could be
mutating said list object through any reference to it
whatsoever, e.g.:

>>> alist=list('ciao')
>>> adict={'a':alist}
>>> adict
{'a': ['c', 'i', 'a', 'o']}
>>> alist.pop()
'o'
>>> adict
{'a': ['c', 'i', 'a']}
>>>

If you wanted dictionary adict to refer to a COPY (a "snapshot",
if you will) of the contents of alist, you could have said so:

>>> import copy
>>> alist=list('ciao')
>>> adict={'a':copy.copy(alist)}
>>> adict
{'a': ['c', 'i', 'a', 'o']}
>>> alist.pop()
'o'
>>> adict
{'a': ['c', 'i', 'a', 'o']}
>>>

and then the dictionary object's string-representation would
be isolated from whatever changes are made to the list to which
name alist refers.  The string representation delegates part of its
job to the objects to which the dictionary object refers, so,
if you want to isolate it, you do need copies -- maybe deep
ones, in fact (<shudder>... well no, not really, but...:-).


> >>> dict
> {'a': [1, 3], 'b': [2]}
> >>> list
> [[1, 3], [2]]
> 
> Looks like that in the first case a copy is returned while
> in the latter case list references are returned. Ok, but

Nope.  ALWAYS references.  .values() doesn't return a reference
to an existing object NOR a copy of an existing object, because
there's no "existing object" in this case -- so it always
returns a NEW object, suitably built as per its specs.
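
One detail differs in Python 3, as opposed to the Python 2.2 under
discussion: there .values() returns a view object tied to the
dictionary rather than a freshly built list, so a snapshot has to be
asked for with list(). A minimal sketch, with illustrative names d,
view and snapshot:

```python
d = {'a': 1, 'b': 2}
view = d.values()            # Python 3: a live view onto the dictionary
snapshot = list(view)        # a new list object, as in Python 2.2

d['a'] = 3                   # rebind the 'a' entry
print(list(view))            # [3, 2]: the view follows the dictionary
print(snapshot)              # [1, 2]: the list keeps its own bindings
```

The rule itself is unchanged: everything handed out is a reference, and
a copy happens only where one is asked for.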

> according to Python's philosophy I shouldn't mind if I work
> with lists in the dictionary or anything else. If the
> behavior depends on the knowledge of the type of values I
> put into a dictionary, I find that somehow counterintuitive.

There is no such dependence.  Just a huge difference
between changing an object, and changing (mutating) some
OTHER object to which the first refers.

In Bologna over 100 years ago we had a statue of a local hero
depicted pointing forwards with his finger -- presumably to
the future, but given where exactly it was placed, the locals
soon identified it as "the statue that points to Hotel
Belfiore".  Then one day some enterprising developer bought
the hotel's building and restructured it -- in particular,
where the hotel used to be was now a restaurant, Da Carlo.

So, "the statue that points to Hotel Belfiore" had suddenly
become "the statue that points to Da Carlo"...!  Amazing
isn't it?  Considering that marble isn't very fluid and the
statue had not been moved or disturbed in any way...?

This is a real anecdote, by the way (except that I'm not
sure of the names of the hotel and restaurant involved --
I could be wrong on those), but I think it can still help
here.  The dictionary, or statue, has not changed at all,
even though the objects it refers/points to may have been
mutated beyond recognition, and the name people know it
by (the dictionary's string-representation) may therefore
change.  That name or representation was and is referring
to a non-intrinsic, non-persistent, "happenstance"
characteristic of the statue, or dictionary...


> Who is wrong here: my intuition or Python (2.2)? If it's
> my intuition, how can I train my thinking about Python's
> execution model, so that my intuition get's better ;-)

Your intuition, which led you astray here (Python does just
what it should do), can be trained in several ways.  The
works of J. L. Borges and I. Calvino, if you like fiction
that's reasonably sophisticated but still quite pleasant,
are good bets.  If you like non-fiction written by
engineers fighting hard to dispel some of the errors of
philosophers, Wittgenstein and Korzybski are excellent.

I'm not kidding, but I realize that many Pythonistas don't
really care for either genre.  In which case, this group
and its archives, essays by GvR and /F, and the Python
sources, may also prove interesting reading.


Alex

The essay by /F that Alex was referring to was probably this one (and even if it wasn't, you should still read it). It talks about some of the same issues in a terser style.

And just to prove that there is some point in all this, the OP went away a satisfied customer:

Dear Michael, dear Alex,

you are excellent teachers!!!

Michael, you helped me really getting the point with your
drawings. Thanks a lot for your art work!

Alex, the anecdote about the statue pointing to Hotel Belfiore
made my wrong intuition so obvious! I like that and will never
ever forget it anymore! Thanks for your answer!

I think, today I learnt a lot on my way becoming a real Pythoniac!

I hope you found these answers useful too.


Well, he was lucky to catch me in such a mood. Alex writes articles like this all the time.

Please send comments, praise, abuse, etc. to mwh@python.net.

If you want to translate this document, by all means feel free. It would be nice if you sent me a link to your translation. So far I know about