One aspect of Python programming that trips up those coming from languages like C or Java is how arguments are passed to functions in Python. At a more fundamental level, the confusion arises from a misunderstanding about Python object-centric data model and its treatment of assignment. When asked whether Python function calling model is "call-by-value" or "call-by-reference", the correct answer is: neither. Indeed, to try to shoe-horn those terms into a conversation about Python's model is misguided. "call-by-object," or "call-by-object-reference" is a more accurate way of describing it. But what does "call-by-object" even mean?
In Python, (almost) everything is an object. What we commonly refer to as "variables" in Python are more properly called names. Likewise, "assignment" is really the binding of a name to an object. Each binding has a scope that defines its visibility, usually the block in which the name originates.
That's a lot of terminology all at once, but those basic terms form the cornerstone of Python's execution model. Compared to, say, C++, the differences are subtle yet important. A concrete example will highlight these differences. Think about what happens when the following C++ code is executed:
1 2 3
In the above, the variable
some_guy refers to a location in memory, and the value 'Fred' is inserted in that location (indeed, we can take the address of
some_guy to determine the portion of memory to which it refers). Later, the contents of the memory location referred to by
some_guy are changed to 'George'. The previous value no longer exists; it was overwritten. This likely matches your intuitive understanding (even if you don't program in C++).
Let's now look at a similar block of Python code:
1 2 3
Binding Names to Objects
On line 1, we create a binding between a name,
some_guy, and a string object containing 'Fred'. In the context of program execution, the environment is altered; a binding of the name
some_guy' to a string object is created in the scope of the block where the statement occurred. When we later say
some_guy = 'George', the string object containing 'Fred' is unaffected. We've just changed the binding of the name
some_guy. We haven't, however, changed either the 'Fred' or 'George' string objects. As far as we're concerned, they may live on indefinitely.
With only a single name binding, this may seem overly pedantic, but it becomes more important when bindings are shared and function calls are involved. Let's say we have the following bit of Python code:
1 2 3 4 5 6 7 8 9 10
So what get's printed in the final line? Well, to start, the binding of
some_guy to the string object containing 'Fred' is added to the block's namespace. The name
first_names is bound to an empty list object. On line 4, a method is called on the list object
first_names is bound to, appending the object
some_guy is bound to. At this point, there are still only two objects that exist: the string object and the list object.
first_names both refer to the same object (Indeed,
print(some_guy is first_names) shows this).
Let's continue to break things down. On line 6, a new name is bound:
another_list_of_names. Assignment between names does not create a new object. Rather, both names are simply bound to the same object. As a result, the string object and list object are still the only objects that have been created by the interpreter. On line 7, a member function is called on the object
another_list_of_names is bound to and it is mutated to contain a reference to a new object: 'George'. So to answer our original question, the output of the code is
Bill ['Fred', 'George'] ['Fred', 'George']
This brings us to an important point: there are actually two kinds of objects in Python. A mutable object exhibits time-varying behavior. Changes to a mutable object are visible through all names bound to it. Python's lists are an example of mutable objects. An immutable object does not exhibit time-varying behavior. The value of immutable objects can not be modified after they are created. They can be used to compute the values of new objects, which is how a function like string.join works. When you think about it, this dichotomy is necessary because, again, everything is an object in Python. If integers were not immutable I could change the meaning of the number '2' throughout my program.
It would be incorrect to say that "mutable objects can change and immutable ones can't", however. Consider the following:
1 2 3 4 5
Tuples in Python are immutable. We can't change the tuple object
name_tuple is bound to. But immutable containers may contain references to mutable objects like lists. Therefore, even though
name_tuple is immutable, it "changes" when 'Igor' is appended to
first_names on the last line. It's a subtlety that can sometimes (though very infrequently) prove useful.
By now, you should almost be able to intuit how function calls work in Python. If I call
foo(bar), I'm merely creating a binding within the scope of
foo to the object the argument
bar is bound to when the function is called. If
bar refers to a mutable object and
foo changes its value, then these changes will be visible outside of the scope of the function.
1 2 3 4 5 6 7 8 9
On the other hand, if
bar refers to an immutable object, the most that
foo can do is create a name
bar in its local namespace and bind it to some other object.
1 2 3 4 5 6 7 8 9
Hopefully by now it is clear why Python is neither "call-by-reference" nor "call-by-value". In Python a variable is not an alias for a location in memory. Rather, it is simply a binding to a Python object. While the notion of "everything is an object" is undoubtedly a cause of confusion for those new to the language, it allows for powerful and flexible language constructs, which I'll discuss in my next post.Posted on by Jeff Knupp