1. Introduction to references and tools
The treatment of variables in Python is very different from that of C. Every object in Python has a special property: its identity. This property is also commonly referred to as a “reference”.
To illustrate reference-related issues more clearly, we first introduce three tools: the built-in function id(), the operator is, and the getrefcount() function in the sys module.
1.1 Built-in function id()
id(object)
Return the “identity” of an object. This is an integer which is guaranteed to be unique and constant for this object during its lifetime. Two objects with non-overlapping lifetimes may have the same [id()](https://docs.python.org/3.7/library/functions.html#id) value.
CPython implementation detail: This is the address of the object in memory.
– quoted from “Python 3.7.4 Documentation - Built-in Functions - id()”
In other words, in CPython an object’s id can be treated as its virtual memory address. Other implementations only guarantee that the value is unique and constant during the object’s lifetime.
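The two guarantees quoted above can be checked with a minimal sketch. The variable names here are made up for illustration, and the actual id values will differ from run to run.

```python
# Two distinct list objects that are alive at the same time.
first = [1, 2]
second = [1, 2]

# Record the identity, change the object's contents, then look again.
id_before = id(first)
first.append(3)
id_after = id(first)

print(id_before == id_after)    # True: identity is constant during the lifetime
print(id(first) != id(second))  # True: live objects never share an identity
```

Only comparisons are printed, because the raw integers returned by id() carry no meaning beyond equality.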
1.2 The operator is
Operations | Meaning |
---|---|
is | object identity |
i.e., the is operator compares the identity of objects. – quoted from “Python 3.7.4 Documentation - Built-in Types”
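As a short illustration of the difference between comparing identity and comparing value, the following sketch uses three hypothetical names, a, b and c. It is meant as a demonstration, not as a complete description of the operator.

```python
a = [1, 2, 3]
b = a            # a second name for the same object
c = [1, 2, 3]    # an equal but separate object

print(a == c)    # True: the values are equal
print(a is c)    # False: they are different objects
print(a is b)    # True: both names refer to one object

# For live objects, `is` agrees with comparing the id() values.
print((a is b) == (id(a) == id(b)))  # True
print((a is c) == (id(a) == id(c)))  # True
```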
1.3 The sys module function getrefcount()
sys.getrefcount(object)
Return the reference count of the object. The count returned is generally one higher than you might expect, because it includes the (temporary) reference as an argument to getrefcount().
– quoted from [Python 3.7.4 Documentation - sys Module - System-Related Parameters and Functions](https://docs.python.org/3.7/library/sys.html#sys.getrefcount)
The “reference count” here is defined in the Python documentation as the number of references to an object. In CPython, once the reference count drops to zero, the memory occupied by the object is freed. This is one of the mechanisms Python uses for automatic memory management.
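The following sketch, which assumes CPython, shows the count rising and falling as references are created and released. The names target and holders are hypothetical. Only differences between counts are compared, since the absolute numbers depend on the interpreter version.

```python
import sys

target = object()                    # a fresh object owned by this script
baseline = sys.getrefcount(target)

holders = [target, target]           # two extra references stored in a list
with_holders = sys.getrefcount(target)

del holders                          # the list is destroyed, releasing both
after_release = sys.getrefcount(target)

print(with_holders - baseline)       # 2 on CPython
print(after_release - baseline)      # 0 on CPython
```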
2. Example questions
In C, a variable represents a fixed piece of memory, and the value assigned to it is the data that exists at that address; in Python, however, a variable is no longer a fixed address, but rather a label attached to each object in Python. Understanding this is important for understanding many of Python’s features.
2.1 Assigning values to the same variable
For example, consider the following C code.
int c_variable = 10000;
printf("original address: %p\n", (void *)&c_variable); // original address: 0060FEFC
c_variable = 12345;
printf("second address: %p\n", (void *)&c_variable); // second address: 0060FEFC
To anyone with experience in C programming, the above result is obvious: the address of the variable c_variable does not change just because the value assigned to it has changed. For the C compiler, the variable c_variable is simply an identifier that helps it distinguish individual memory addresses and is directly bound to a specific memory address, as shown in the figure.
But Python is different. Consider the following code.
python_variable = 10000
id(python_variable)
# 1823863879824
python_variable = 12345
id(python_variable)
# 1823863880176
What is even more surprising is that even when the same constant is assigned to a variable, the resulting id can be different (at least in an interactive session, where each statement is compiled separately).
python_variable = 10000
id(python_variable)
# 1823863880304
python_variable = 10000
id(python_variable)
# 1823863879408
If python_variable refers to a list instead:
python_variable = [1,2]
id(python_variable)
# 2161457994952
python_variable = [1,2]
id(python_variable)
# 2161458037448
The id values obtained are also different.
As mentioned earlier, a Python variable is a label that is moved to wherever it is needed. Each time a newly created object is assigned to a variable, that object has a new identity, and the variable is simply re-attached to it. The label has no fixed home of its own.
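The “label” behaviour can be sketched as follows. The names label and other_label are made up for illustration: moving one label to a new object leaves the old object, and any other label attached to it, untouched.

```python
label = 10000
other_label = label        # both names are attached to the same object
old_identity = id(label)

label = 12345              # move the name `label` to a different object

print(other_label)                      # 10000: the old object is untouched
print(id(other_label) == old_identity)  # True
print(label is other_label)             # False
```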
Note, however, that the statement “even if the same constant is assigned to a variable, the id obtained may be different” does not hold for all constants. Taking the constant 1 as an example, we get the following result.
littleConst = 1 # an integer object with a small value
id(littleConst)
# 140734357607232
littleConst = 1
id(littleConst)
# 140734357607232
id(1)
# 140734357607232
As you can see, the id corresponding to the constant 1 is always the same, so the id of the variable littleConst has not changed either. This is because Python maintains a pool of small integer objects in memory: for any value within a certain range, no new object is created, and the existing object in the pool is used instead. Using the following code on my machine, I found that the upper bound of this pool is 256. (In CPython the pool covers the integers from -5 to 256; the loop below only probes non-negative values.)
for constant in range(300):
    if constant is not range(300)[constant]:
        print("The maximum value of the constant pool is:", (constant - 1))
        break
# The maximum value of the constant pool is: 256
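Another way to probe the boundary is to build the integers at run time, so that the compiler cannot share a literal between two uses. The sketch below assumes CPython. The helper same_object is hypothetical, and the behaviour is an implementation detail rather than a language guarantee.

```python
import sys

def same_object(text):
    """Parse `text` twice at run time and report whether both results are one object."""
    return int(text) is int(text)

small_shared = same_object("256")   # inside the small-integer pool
large_shared = same_object("257")   # just outside the pool

if sys.implementation.name == "cpython":
    print(small_shared)   # True on CPython: both results are the pooled object
    print(large_shared)   # False on CPython: two separate objects were created
```

Because of this, integer values should be compared with ==, not with is.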
Accordingly, performing arithmetic (addition, subtraction, multiplication or division) on a number and assigning the result back to the original variable changes the object that the variable refers to.
change_ref = 10000
id(change_ref)
# 2161457772304
change_ref = change_ref + 1
change_ref
# 10001
id(change_ref)
# 2161457772880
Comparing the output on lines 3 and 8 of the code block, you can see that adding to a numeric variable and assigning the result back changes the variable’s reference. This is easy to understand: according to Python operator precedence, change_ref = change_ref + 1 is actually change_ref = (change_ref + 1). Adding 1 to the value of change_ref produces a new object, which is then assigned to change_ref, so the reference held by change_ref changes. The same is true for lists.
list_change_ref = [1,2]
id(list_change_ref)
# 2161458326920
list_change_ref = list_change_ref + [4]
list_change_ref
# [1, 2, 4]
id(list_change_ref)
# 2161458342792
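A second name makes the effect easier to see. In this sketch (the names original and alias are made up for illustration), the + operator builds a new list and the assignment only moves one label, so the old list survives unchanged under the other label.

```python
original = [1, 2]
alias = original              # second name for the same list

original = original + [4]     # builds a new list, then rebinds `original`

print(original)               # [1, 2, 4]
print(alias)                  # [1, 2]: the old list was never modified
print(original is alias)      # False
```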
2.2 The case of no change
Unlike numeric values, operations on list objects in Python exhibit another characteristic. Consider the following code.
list_nonchange = [1, 2, 3]
id(list_nonchange)
# 2161458355400
list_nonchange[2] = 5
list_nonchange
# [1, 2, 5]
id(list_nonchange)
# 2161458355400
list_nonchange.append(3)
list_nonchange
# [1, 2, 5, 3]
id(list_nonchange)
# 2161458355400
Observe lines 3, 8, and 13 of the code block: the output is the same each time. That is, for a list, it is possible to change the referenced value without changing the reference, by operating directly on the object itself.
Further, in the case of two variables referring to the same list at the same time, a direct operation on one of the variables itself will also affect the value of the other variable.
list_example = [1, 2, 3]
list_same_ref = list_example
id(list_example)
# 1823864610120
id(list_same_ref)
# 1823864610120
Obviously the id values of list_example and list_same_ref are the same at this point. Now change the value of the list referenced by list_example.
list_example[2] = 5
list_same_ref
# [1, 2, 5]
You can see that the value of the list referenced by list_same_ref has also changed. Now look at the corresponding id values.
id(list_example)
# 1823864610120
id(list_same_ref)
# 1823864610120
The id of neither variable has changed. Now call the append() method.
list_example.append(3)
list_example
# [1, 2, 5, 3]
list_same_ref
# [1, 2, 5, 3]
id(list_example)
# 1823864610120
id(list_same_ref)
# 1823864610120
Now delete an element.
del list_example[3]
list_example
# [1, 2, 5]
list_same_ref
# [1, 2, 5]
id(list_example)
# 1823864610120
id(list_same_ref)
# 1823864610120
In all of the above operations on the list, the references held by the variables did not change.
That is, these operations do not create new objects; they directly change the value of the original object.
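One practical consequence is that a second name is not a copy. The sketch below contrasts an alias with an independent list built by list(). The names shared, alias and snapshot are hypothetical, and list() makes only a shallow copy.

```python
shared = [1, 2, 3]
alias = shared            # same object, no copy is made
snapshot = list(shared)   # a new list holding the same elements

shared[2] = 5             # in-place item assignment
shared.append(3)          # in-place append

print(alias)              # [1, 2, 5, 3]: sees every in-place change
print(snapshot)           # [1, 2, 3]: unaffected, it is a separate object
print(alias is shared)    # True
print(snapshot is shared) # False
```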
2.3 A special place
This subsection example is inspired by [About references in Python]
There is also a special difference between numeric data and lists. Consider the following code.
num = 10000
id(num)
# 2161457772336
num += 1
id(num)
# 2161457774512
With the preceding groundwork, this result seems natural. When the variable num is incremented by 1, a new value is computed and then assigned, so the reference changes.
But lists behave differently. See the following code.
li = [1, 2, 3]
id(li)
# 2161458469960
li += [4]
id(li)
# 2161458469960
li
# [1, 2, 3, 4]
Note line 4 (li += [4]). Why is the result different from before, when the operation is clearly still “add and assign”? Checking the value of the variable li, we find that the value has indeed changed, but the reference has not.
This is because the addition operators are overloaded in Python, and the underlying implementations differ completely between list objects and numeric objects. With plain addition (+), a list operation still creates a new list object; with the augmented assignment +=, lists implement the in-place method __iadd__, which extends the existing list instead of creating a new one. This is something to be very careful about.
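The difference comes from which special method the type provides. The sketch below checks for the in-place method __iadd__ and then shows the effect of += through a second name. The tuple case is added here only as a contrasting immutable sequence, and all variable names are illustrative.

```python
print(hasattr(list, "__iadd__"))  # True: lists define in-place addition
print(hasattr(int, "__iadd__"))   # False: ints fall back to __add__

li = [1, 2, 3]
alias = li
li += [4]                 # calls list.__iadd__, which extends in place
print(alias)              # [1, 2, 3, 4]
print(li is alias)        # True

tup = (1, 2, 3)
tup_alias = tup
tup += (4,)               # tuples have no __iadd__, so a new tuple is built
print(tup_alias)          # (1, 2, 3)
print(tup is tup_alias)   # False
```

When a type has no __iadd__, x += y behaves like x = x + y, which is why numbers and tuples get a new object.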
3. Explanation of Principle
As we mentioned earlier [Python Variables and Data Types], the six standard data types in Python are divided into two main categories: mutable data and immutable data. Lists, dictionaries, and sets are “mutable objects”, while numbers, strings, and tuples are “immutable objects”. The difference between numeric data (i.e., numbers) and lists demonstrated above is the result of these two categories.
Since numbers are immutable objects, no operation can change the value stored in an existing number object. So in Python, each new value that appears generally requires a new object to be allocated (with the exception of values in the constant pool).
const_ref = 10000
const_ref == 10000
# True
const_ref is 10000
# False
id(const_ref)
# 2161457773424
id(10000)
# 2161457773136
from sys import getrefcount
getrefcount(const_ref)
# 2
getrefcount(10000)
# 3
The first 9 lines of code are easy to understand: even equal values may have different identities. What matters is whether they are the same object.
Line 12 shows that, apart from the temporary reference held by the getrefcount() function, the object referenced by const_ref has only one reference: the variable const_ref itself. Once const_ref is deleted or rebound, the object’s reference count drops to zero, and only then is the memory occupied by the object actually freed.
As a mutable object, the value of the list can be changed without creating a new object, so it is possible to operate directly on the list object itself for the purpose of “changing the value of a variable without changing the reference”.
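The split between mutable and immutable types can be observed directly: item assignment works on a list but raises TypeError on a tuple or a string. The helper try_item_assignment below is a hypothetical name used only for this sketch.

```python
def try_item_assignment(container):
    """Try to overwrite the first element; report whether Python allowed it."""
    try:
        container[0] = 99
    except TypeError:
        return False
    return True

print(try_item_assignment([1, 2, 3]))   # True: lists are mutable
print(try_item_assignment((1, 2, 3)))   # False: tuples are immutable
print(try_item_assignment("abc"))       # False: strings are immutable
```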
4. Summary
For mutable objects such as lists, dictionaries, and sets, you can change the value of a variable without changing its reference by operating on the object itself. Immutable objects such as numbers, strings, and tuples cannot be changed in place, so to change the value of the corresponding variable you must create a new object and then assign the new object to the variable.
Through such exploration, the profound meaning of “everything is an object” can also be understood more vividly.
5. References
Python 3.7.4 Documentation - Built-in Functions - id()
Python 3.7.4 Documentation - Built-in Types
Python 3.7.4 Documentation - sys module - system related parameters and functions