Copying Objects¶
With both lists and dictionaries,
using an assignment statement like list_b = list_a
does NOT make a separate
copy of the data. Instead, the statement just creates a new label (list_b
)
that points to the same memory location.
To make an independent clone of a list or dictionary, we use the .copy()
method. Lists and dictionaries are both object types, and the class for each
one contains a set of code for making the copy.
What if an object does not have a .copy()
method? We can still make an
independent clone, but the process is a little more involved.
Shallow Copy¶
Let’s begin by reviewing how a simple assignment statement operates on an object.
Example
Line 11 creates a new object of type Student
, and line 12 creates a new
label for the same object. student_1
and student_2
point to the
same memory location.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | class Student:
def __init__(self, name, id, scores):
self.name = name
self.id = id
self.scores = scores
def average(self):
return round(sum(self.scores)/len(self.scores), 1)
def main():
student_1 = Student("Maria", 1234, [88, 95, 93])
student_2 = student_1
student_2.id = 7890 # Reassign the id value using student_2.
# Print out the property values for student_1 and student_2.
print("student_1 =", vars(student_1))
print("student_2 =", vars(student_2))
student_3 = student_1.copy()
main()
|
Console Output
student_1 = {'name': 'Maria', 'id': 7890, 'scores': [88, 95, 93]}
student_2 = {'name': 'Maria', 'id': 7890, 'scores': [88, 95, 93]}
File "main.py", line 20, in main
student_3 = student_1.copy()
AttributeError: 'Student' object has no attribute 'copy'
The change we make to the id
property on line 14 shows up when print
both student_1
and student_2
. They are the same object, so changes
made with one label show up when we use the other.
Note that line 20 throws an error. While list and dictionary objects both
have defined .copy()
methods, our Student
object type does not.
Fortunately, we do not need to add a method to our Student
class to copy
the objects. Python comes with a module, called copy
, that contains the
functions we need.
Try It!
In the editor below, note that we import one function from the copy
module. Not surprisingly, it’s called copy
.
The general syntax is:
from copy import copy
new_object = copy(old_object)
Run the program as-is first to verify that
student_1
andstudent_2
are the same object. Note the output from the threeprint
statements.Replace line 14 with the statement
student_2 = copy(student_1)
.Rerun the program and note the change in the output. With the updated code,
student_1
andstudent_2
now represent different objects.Note that
student_1 is student_2
evaluates toFalse
even when all of their property values match.
While convenient, the copy()
function is not the full story. It produces
what programmers call a shallow copy of an object. We will see what this
means in the next section.
Deep Copy¶
A shallow copy of an object creates a new, independent object. However, some of the data in the new and old objects might still be linked. We can see this in the following example.
Example
Once again, we create the student_1
object from the Student
class
and clone it with student_2 = copy(student_1)
.
Let’s see what happens when we try changing one value in the list assigned
to the scores
property.
11 12 13 14 15 16 17 18 19 20 21 22 23 24 | def main():
student_1 = Student("Maria", 1234, [88, 75, 93])
student_2 = copy(student_1) # Make a shallow copy of student_1.
# Reassign the id value in student_2:
student_2.id = 7890
# Reassign the first value in the student_2 scores list:
student_2.scores[0] = 100
print("student_1 =", vars(student_1))
print("student_2 =", vars(student_2))
main()
|
Console Output
student_1 = {'name': 'Maria', 'id': 1234, 'scores': [100, 75, 93]}
student_2 = {'name': 'Maria', 'id': 7890, 'scores': [100, 75, 93]}
Hmmm. The output shows us that changing the id
value for student_2
does
NOT change the id
for student_1
. However, changing the first value in the
scores
list for student_2
affects BOTH objects. Even though
student_1
and student_2
are different objects, they are not quite
independent of each other yet.
The reason for this involves how the scores
property relates to the list.
When line 12 calls the class and __init__
runs, the value assigned to
scores
isn’t [88, 75, 93]
. Instead scores
is assigned a reference
to a memory location. The actual list is stored at that memory location. The
value assigned to scores
just points to it.
When we make the shallow copy in line 13, student_2
assigns the same
memory reference to scores
. Even though the two objects are different, both
scores
properties point to the same data in memory. This is why changing
[88, 75, 93]
to [100, 75, 93]
for student_2
also affects
student_1
.
Think of each Student
object as having two layers inside of it. The first
layer includes references to memory locations. The second layer is the actual
data stored. A shallow copy only goes one layer deep. It duplicates the memory
references, but it does not create new sets of the original data.
To make a full, independent clone of an object, we must make a deep copy. A deep copy takes the original data and creates clones of that data in new memory locations. The cloned object uses these new locations as its property values.
The syntax for making a deep copy is very similar to using copy()
:
from copy import deepcopy
new_object = deepcopy(old_object)
Try It!
In the editor above:
Replace line 1 with
from copy import copy, deepcopy
.Replace line 14 with
student_2 = deepcopy(student_1)
.Add
student_2.scores[0] = 100
before the finalprint
statements.Rerun the program to verify that changing the
scores
values forstudent_2
no longer affectsscores
forstudent_1
.