Python Pickle

So far, we have written programs that store only text in files and retrieve the same text from them. Text files are useful when we do not need to perform calculations on the data. What if we want to store structured data in a file? For example, we may want to store employee details such as the employee identification number (int type), name (string type) and salary (float type). This data is well structured and its fields have different types. To store such data, we can create a class Emp with the instance variables id, name and sal as shown below:

class Emp: 
   def __init__(self, id, name, sal): 
      self.id = id 
      self.name = name 
      self.sal = sal 
   def display(self): 
      print("{:5d} {:20s} {:10.2f}".format(self.id, self.name, self.sal))

Pickle dump() method

Next, we create an object of this class and store the actual data in it. The object is then written to a binary file as a sequence of bytes. This process is called pickling, or object serialization: converting a Python object into a byte stream so that it can be stored in a file. Pickling is done using the dump() method of the 'pickle' module as:

pickle.dump(object, file)
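To see what "a byte stream" means in practice, here is a minimal sketch. It uses pickle.dumps(), the in-memory companion of dump() that returns the bytes instead of writing them to a file. The employee record and its values are hypothetical and serve only as an illustration.

```python
import pickle

# Hypothetical employee record, used only for illustration.
record = {"id": 101, "name": "Ravi", "sal": 45000.0}

# dumps() returns the pickled form as a bytes object
# instead of writing it to a file.
data = pickle.dumps(record)

print(type(data))   # <class 'bytes'>
```

These are the same bytes that dump() would write into a file opened in binary mode.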

Pickle load() method

Unpickling is the reverse process, whereby a byte stream is converted back into an object. In other words, unpickling means reading the objects back from the file. Unpickling is also called deserialization. It is done using the load() method of the 'pickle' module as:

object = pickle.load(file)

Here, the load() method reads one object from the binary 'file' and returns it into 'object'. Remember that pickling and unpickling must be done using files opened in binary mode, since pickle reads and writes bytes. The word stream represents a flow of data, so a byte stream is a flow of bytes.
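The round trip through a binary file can be sketched as follows. This is an illustrative example and not part of the employee program: the record, its values, and the file name record.dat are assumptions, and a temporary directory is used so that no file is left behind.

```python
import os
import pickle
import tempfile

# Hypothetical employee record, used only for illustration.
record = {"id": 101, "name": "Ravi", "sal": 45000.0}

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name inside the temporary directory.
    path = os.path.join(tmp, "record.dat")

    # Pickle: open in binary write mode ('wb') and dump the object.
    with open(path, "wb") as f:
        pickle.dump(record, f)

    # Unpickle: open in binary read mode ('rb') and load the object.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == record)   # True
```

Note that the restored object keeps the original types (int, str, float), which a plain text file would not preserve.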

A Python program to create an Emp class with employee details as instance variables.

#Emp class - Save this as Emp.py

class Emp: 
   def __init__(self, id, name, sal): 
      self.id = id 
      self.name = name 
      self.sal = sal 
   def display(self): 
      print("{:5d} {:20s} {:10.2f}".format(self.id, self.name, self.sal))

Our intention is to pickle Emp class objects. For this purpose, we have to import the Emp.py file as a module, since the Emp class is defined in that file.

import Emp

Now, an object of the Emp class can be created as:

e = Emp.Emp(id, name, sal)

Observe that 'e' is an object of the Emp class. Since the Emp class belongs to the Emp module, we refer to it as Emp.Emp(). This object 'e' can be written to a file 'f', opened in binary write mode, using the dump() method of the pickle module as:

pickle.dump(e, f)

Pickle dump() and load() example

import Emp, pickle

# open the file in binary write mode; 'with' closes it automatically
with open('emp.dat', 'wb') as f:
   n = int(input('How many employees? '))
   for i in range(n):
      id = int(input('Enter id: '))
      name = input('Enter name: ')
      sal = float(input('Enter salary: '))
      # create Emp class object
      e = Emp.Emp(id, name, sal)
      # pickle the object into the file
      pickle.dump(e, f)

In the previous program, we stored n Emp class objects in the emp.dat file, where n is the number entered by the user. To get those objects back from the file, we have to unpickle them using the load() method of the pickle module as:

obj = pickle.load(f)

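This method reads one object from the file 'f' and returns it into 'obj'. Since 'obj' is an Emp class object, we can call its display() method to display its data. Note that the Emp module must be importable when unpickling, because pickle stores only the object's data and a reference to its class, not the class code itself. The display() method is called as: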

obj.display()

In this way, using a loop, we can read all the objects from the emp.dat file. When we reach the end of the file and there are no more objects to read, load() raises the EOFError exception. When this exception occurs, we should break out of the loop.

A Python program to unpickle Emp class objects.

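import Emp, pickle

# open the file in binary read mode; 'with' closes it automatically
with open('emp.dat', 'rb') as f:
   print('Employee details:')
   while True:
      try:
         # read the next pickled object and display it
         obj = pickle.load(f)
         obj.display()
      except EOFError:
         print('End of file reached...')
         break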
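The read-until-EOFError loop can also be wrapped in a small generator so that it is reusable. The sketch below is one possible arrangement, not part of the original program: the helper name load_all is hypothetical, the records are made-up tuples, and io.BytesIO stands in for a binary file so that the example runs without emp.dat.

```python
import io
import pickle

def load_all(f):
    """Yield every pickled object from the binary file object f, in order."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            # No more objects to read: stop the generator.
            return

# io.BytesIO acts as an in-memory binary file for this illustration.
buffer = io.BytesIO()
for rec in [(1, "Asha", 1000.0), (2, "Vijay", 2000.0)]:
    pickle.dump(rec, buffer)

buffer.seek(0)  # rewind to the start before reading
records = list(load_all(buffer))
print(records)  # [(1, 'Asha', 1000.0), (2, 'Vijay', 2000.0)]
```

With a real file, the same helper could be used as: for obj in load_all(f): obj.display()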