Lesson 4 - Object References, Cloning, and Garbage Collector in Python

In the previous lesson, RollingDie in Python - Constructors and random numbers, we created our first regular object in Python, a rolling die. Now that we're starting to work with objects, it's important to know exactly what is going on inside the program; otherwise, we'd end up with undesired results. That is what today's Python OOP tutorial is all about.

An application, or more precisely its thread, is given memory by the operating system in the form of a stack. Access to this memory is very fast, but the application can't control its size; the resources are assigned by the operating system.

Let's create a new console application and add a simple class that will represent a "user". For clarity, we'll omit comments and won't bother with private members:

class User:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return str(self.name)

The class has two simple public attributes, a constructor, and an overridden __str__() method, so users can be printed easily. Let's create an instance of this class in our program:

u = User("James Brown", 28)

The variable u is an object reference. Let's see what this situation looks like in memory:

Stack and heap in computer memory

Both the stack and the heap are located in RAM. They differ in how they're accessed and in their size. The heap is almost unlimited, but it's more complicated to access and therefore slower. The stack, on the other hand, is fast but limited in size.

An object therefore involves two pieces of memory, one on the stack and one on the heap. The stack holds only what we call a reference, a link to the place on the heap where the actual object can be found.

There are several reasons why things are done this way:

  1. The stack size is limited.
  2. When we want to use the same object multiple times, e.g. to pass it as a parameter into several methods, we don't need to copy it. We only pass a small value, the reference to the object, instead of copying the whole heavy-weight object. We're going to demonstrate this today.
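
The second point can be sketched in a few lines. This is only an illustration: the celebrate_birthday() function and the id_seen_inside variable are hypothetical names that don't appear elsewhere in the lesson. The sketch suggests that a function receives a reference to the caller's object rather than a copy of it:

```python
class User:

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __str__(self):
        return str(self.name)


def celebrate_birthday(user):
    # hypothetical helper: 'user' refers to the caller's object, no copy is made
    user.age += 1
    return id(user)


u = User("James Brown", 28)
id_seen_inside = celebrate_birthday(u)

print(u.age)                      # 29, the change is visible outside the function
print(id_seen_inside == id(u))    # True, the function worked with the very same object
```

Because only the reference travels into the function, the cost of the call doesn't depend on how large the object is.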

Now let's create two variables that refer to User objects:

u = User("James Brown", 28)
v = User("Jack White", 32)

Here's what this would look like in memory:

References in computer memory in Python

Now let's assign the v variable to the u variable. With objects, only the reference is copied; no new object is created, so both variables end up referring to the same object. Our code now looks like this:

u = User("James Brown", 28)
v = User("Jack White", 32)
u = v

Memory-wise, it would look like so:

References in computer memory in Python

Now, let's verify the reference mechanism so we can confirm that it truly works this way :) We'll print both variables, along with the identifiers returned by id(), before and after the re-assignment. Since we'll be printing several times, I'll keep the snippet short. Let's modify the code:

# variable declaration
u = User("James Brown", 28)
v = User("Jack White", 32)
print("u: {0}\nv: {1}".format(u, v))
print("u: {0}\nv: {1}\n".format(id(u), id(v)))
# assignment
u = v
print("u: {0}\nv: {1}".format(u, v))
print("u: {0}\nv: {1}\n".format(id(u), id(v)))
input()

The program output:

Console application
u: James Brown
v: Jack White
u: 4737008
v: 4737104

u: Jack White
v: Jack White
u: 4737104
v: 4737104

Let's change the name of the user v. Based on what we know, the change should be reflected in the variable u as well. We'll add the following code to our program:

# change
v.name = "John Doe"
print("u: {0}\nv: {1}".format(u, v))
print("u: {0}\nv: {1}\n".format(id(u), id(v)))

We've changed the object in the variable v. Now let's print u and v once more:

Console application
u: James Brown
v: Jack White
u: 6309872
v: 6309968

u: Jack White
v: Jack White
u: 6309968
v: 6309968

u: John Doe
v: John Doe
u: 6309968
v: 6309968

The user u changes along with v because both variables point to the same object. Let's get back to James Brown:

References in computer memory in Python

Now what will happen to him, you ask? He'll be "eaten" by what we call the Garbage collector.

Garbage collector

Garbage collector and dynamic memory management

We can allocate memory statically in our programs, meaning that we declare how much memory we'll need in the source code. However, we don't have to specify the amount in advance and can request memory as the program runs instead. In that case, we're dealing with dynamic memory management.

In the past, particularly in the era of C, Pascal, and C++, direct memory pointers were used for what we call references in Python. It worked like this: we'd ask the operating system for a piece of memory of a certain size. It would reserve the memory for us and give us its address. We would then create a pointer to this place and work with the memory through it. The problem was that no one checked what we put into this memory; the pointer just pointed to the beginning of the reserved block. When we put something larger there, it would simply be stored anyway and overwrite the data beyond the block's limits, which might belong to another program or even to the operating system (in which case the OS would probably kill or stop our application). More often, we would overwrite our own program's data, and the program would start to behave chaotically. Imagine that you add a user to an array and it ends up changing the user's environment color, which has nothing to do with it. You would spend hours checking the code for mistakes, only to find out that a buffer overflow in the user's creation had spilled into the color values in memory.

The other problem was that when we stopped using an object, we had to free its memory manually; if we didn't, the memory would remain occupied. If we did this in a method that was called repeatedly and forgot to free the memory, the application would gradually use up more and more memory (a memory leak) and start to freeze. Eventually, it could even bring down the entire operating system. An error like this is very hard to pinpoint. Why does the program stop working after a few hours? Where in thousands of lines of code should we look for the mistake? We have no clue. We can't follow anything, so we'd end up having to go through the entire program line by line or examine the computer's memory, which is in binary. A similar problem occurs when we free memory somewhere and then use the same pointer again, forgetting that it has already been freed. It would point to a place where something new might already be stored, and we would corrupt that data. This would lead to unpredictable behavior in our application, and it could even lead to this:

Blue Screen Of Death – BSOD in Windows

A colleague of mine once said: "The human brain can't even deal with its own memory, so how could we rely on it for program memory management?" Of course, he was right. Except for a small group of geniuses, people became tired of solving constant and needless errors. For the price of a slight performance decrease, managed languages with what we call a garbage collector were developed; Python is one of them. C++ is still used, of course, but mainly for specific programs, e.g. operating system parts or commercial 3D game engines, where you need to maximize performance. Thanks to its automatic memory management, Python is suitable for the vast majority of other applications.

Garbage collector

The garbage collector is the part of the Python runtime that looks for objects to which there is no longer any reference, removes them, and frees their memory. In CPython, the standard interpreter, most objects are freed right away by reference counting, as soon as the last reference to them disappears; an additional collector runs from time to time to find groups of objects that only reference each other. The performance loss is minimal, and it significantly reduces the number of evenings programmers spend debugging broken pointers. We can even influence how the GC runs from our code through the gc module, although that's not needed in 99% of cases. Because the language is managed and doesn't work with direct pointers, we can't corrupt memory by letting it overflow etc.; the interpreter takes care of the memory automatically.
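
The removal of an unreferenced object can be observed with a weak reference, which watches an object without keeping it alive. The following is only a sketch: it assumes CPython, the exact moment of collection may differ in other interpreters, and the james_watcher variable is a name made up for this illustration.

```python
import gc
import weakref


class User:

    def __init__(self, name, age):
        self.name = name
        self.age = age


u = User("James Brown", 28)
v = User("Jack White", 32)

# a weak reference lets us observe the object without keeping it alive
james_watcher = weakref.ref(u)

u = v           # James Brown has no regular references left
gc.collect()    # explicitly ask for a collection (rarely needed in real code)

print(james_watcher())   # None on CPython, the object has been removed
print(u.name)            # Jack White
```

Calling gc.collect() is not required here on CPython; it's included only to show that the collector can be triggered from code.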

The None value

The last thing I'll mention here is the None value. A variable doesn't have to refer to a regular object; it can also hold the special value None, which indicates that the reference doesn't point to any data. (Technically, None is itself a single built-in object, which is why id() returns a number for it as well.) When we set the variable v to None, we only remove this one reference. If there are still other references to our object, it will continue to exist. If not, the GC will remove it. Let's add the following lines to the end of our program:

# another change
v = None
print("u: {0}\nv: {1}".format(u, v))
print("u: {0}\nv: {1}\n".format(id(u), id(v)))

The output:

Console application
u: James Brown
v: Jack White
u: 6440944
v: 6441040

u: Jack White
v: Jack White
u: 6441040
v: 6441040

u: John Doe
v: John Doe
u: 6441040
v: 6441040

u: John Doe
v: None
u: 6441040
v: 1731209056

We can see that the object still exists and the variable u still points to it; the variable v, however, no longer refers to it. None is often used together with the is keyword, but we'll look at that some other time.
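
A condensed version of this experiment might look as follows. It's a sketch rather than part of the lesson's program, and the nothing variable is a hypothetical name added only to show that every None refers to the same built-in object:

```python
class User:

    def __init__(self, name, age):
        self.name = name
        self.age = age


u = User("Jack White", 32)
v = u             # two references to one object
v = None          # removes only the reference held by v

nothing = None    # hypothetical second variable holding None

print(u.name)                  # Jack White, the object is still alive
print(v)                       # None
print(id(v) == id(nothing))    # True, there is only one None object
```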

Copying objects

If you're wondering how to create a true copy of an object, you can either create the object again with the constructor and put the same data into it, or use a deep copy. Let's return to our rolling die from the previous lesson:

import random


class RollingDie:
    """
    Class representing a die for a board game
    """

    def __init__(self, sides_count=6):
        self.__sides_count = sides_count

    def __str__(self):
        """
        Returns a textual representation of our die
        """
        return "A rolling die with {0} sides".format(self.__sides_count)

    def get_sides_count(self):
        return self.__sides_count

    def roll(self):
        """
        Rolls a die and returns a number from 1
        to the sides count
        """
        return random.randint(1, self.__sides_count)

We can add a method similar to __str__() to the rolling die, the __repr__() method. It also returns a string, but one that can be passed to the eval() function, which executes Python code dynamically. Be careful with eval(), though: evaluating an expression entered by the user is quite dangerous. Our __repr__() method will return the code of the die's constructor call:

def __repr__(self):
    """
    Returns the constructor code as a string for the eval() function
    """
    return "RollingDie({0})".format(self.__sides_count)

Code to create a new die:

sixsided = RollingDie()
another_sixsided = eval(repr(sixsided))
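
Put together, the round trip through repr() and eval() could look like the sketch below. It uses a shortened version of the die class, assumes the class is named RollingDie as in the string returned by __repr__(), and introduces the tensided and another_tensided variables purely for illustration:

```python
class RollingDie:

    def __init__(self, sides_count=6):
        self.__sides_count = sides_count

    def get_sides_count(self):
        return self.__sides_count

    def __repr__(self):
        return "RollingDie({0})".format(self.__sides_count)


tensided = RollingDie(10)
another_tensided = eval(repr(tensided))   # evaluates the string "RollingDie(10)"

print(repr(tensided))                          # RollingDie(10)
print(another_tensided.get_sides_count())      # 10
print(id(another_tensided) == id(tensided))    # False, a separate object was created
```

This approach only works when the string returned by __repr__() is a complete, valid constructor call and the class name is available where eval() runs.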

Or we can use the copy module and its deepcopy() function:

import copy
another_sixsided = copy.deepcopy(sixsided)
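
To see why a deep copy is needed for a true copy, it may help to compare it with a plain assignment and with the copy module's copy() function, which makes a shallow copy. The DiceCup class below is hypothetical and exists only for this sketch; it holds a mutable list so that the difference becomes visible:

```python
import copy


class DiceCup:
    # hypothetical class used only for this illustration
    def __init__(self, sides):
        self.sides = sides   # a mutable list of side counts


original = DiceCup([6, 6, 10])
alias = original                    # no copy, just a second reference
shallow = copy.copy(original)       # new DiceCup, but the list inside is shared
deep = copy.deepcopy(original)      # new DiceCup with its own list

original.sides.append(20)

print(alias.sides)     # [6, 6, 10, 20]
print(shallow.sides)   # [6, 6, 10, 20]
print(deep.sides)      # [6, 6, 10]
```

Only the deep copy is unaffected by the change, because deepcopy() also copies the objects that the original refers to.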

In the next lesson, Warrior for the arena in Python, we'll program something practical again to gain experience. Spoiler: we're making a warrior object for the arena :)


 

 

Article has been written for you by David Capka
The author is a programmer, who likes web technologies and being the lead/chief article writer at ICT.social. He shares his knowledge with the community and is always looking to improve. He believes that anyone can do what they set their mind to.
The author learned IT at the Unicorn College, a prestigious college providing education on IT and economics.