In modern applications, storing and retrieving data efficiently is critical for maintaining application state, caching, and communicating between components. These concerns fall under two related topics: data persistence and serialization.
Data persistence refers to the ability of an application to save data that outlasts the process that created it. In Python, this can be achieved using files, databases, or external storage systems. Persistent data remains available even after a program has terminated.
Serialization is the process of converting a data structure or object state into a format that can be stored or transmitted and reconstructed later. The reverse operation is called deserialization. Serialization is essential for data storage, inter-process communication, and network transmission.
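As a minimal sketch of that round trip, the snippet below turns a dictionary into bytes and reconstructs it using the standard json module. The record contents are illustrative only.

```python
import json

record = {"name": "Alice", "age": 25, "skills": ["Python", "ML"]}

# Serialization: in-memory object -> text -> bytes that can be stored or sent
payload = json.dumps(record).encode("utf-8")

# Deserialization: bytes -> text -> a new object with the same content
restored = json.loads(payload.decode("utf-8"))

print(type(payload).__name__)  # bytes
print(restored == record)      # True: same content
print(restored is record)      # False: a distinct object
```

The reconstructed object is equal to the original but is not the same object in memory, which is what makes it safe to move across process or machine boundaries.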
Pickle is a standard-library module used to serialize and deserialize Python object structures. Its format is Python-specific, so it is generally not suitable for sharing data with programs written in other languages.
import pickle
data = {'name': 'Alice', 'age': 25, 'skills': ['Python', 'ML']}
# Serialize data to a file
with open('data.pkl', 'wb') as f:
    pickle.dump(data, f)
# Deserialize data from a file
with open('data.pkl', 'rb') as f:
    loaded_data = pickle.load(f)
print(loaded_data)
Be cautious when loading pickled data from untrusted sources. Malicious code can be executed during deserialization. Consider alternatives like JSON for safer data exchange.
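One way to reduce this risk, when pickle cannot be avoided, is to restrict which globals a pickle stream may reference by overriding find_class on pickle.Unpickler. The sketch below follows the pattern described in the pickle documentation; the allow-list SAFE_BUILTINS and the helper restricted_loads are hypothetical names chosen for this example. It narrows what a pickle can load but should not be treated as a complete sandbox.

```python
import builtins
import io
import os
import pickle

# Hypothetical allow-list for this sketch; adjust it to your own needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global (class or function)
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(payload):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(payload)).load()

# Plain containers of basic types load normally
print(restricted_loads(pickle.dumps({"name": "Alice", "tags": {1, 2}})))

# A stream that references os.system is rejected instead of being resolved
try:
    restricted_loads(pickle.dumps(os.system))
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```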
JSON (JavaScript Object Notation) is a text format that is language-independent and used widely for data interchange. Python provides built-in support via the json module.
import json
data = {'name': 'Bob', 'age': 30, 'active': True}
# Serialize to JSON string
json_string = json.dumps(data)
print(json_string)
# Write JSON to a file
with open('data.json', 'w') as f:
    json.dump(data, f)
# Read JSON from a file
with open('data.json', 'r') as f:
    loaded_data = json.load(f)
print(loaded_data)
JSON supports only basic data types (strings, numbers, lists, dicts, booleans, and None). Custom objects need to be converted into a serializable format.
import json
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age
# Called by json.dumps for any object it cannot serialize natively
def user_serializer(obj):
    if isinstance(obj, User):
        return {'name': obj.name, 'age': obj.age}
    raise TypeError("Type not serializable")
user = User('Charlie', 35)
json_data = json.dumps(user, default=user_serializer)
print(json_data)
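The serializer above only covers one direction: loading that JSON back yields a plain dict, not a User. A possible way to complete the round trip is the object_hook parameter of json.loads, sketched below. The "__type__" marker and the function names encode_user and decode_user are assumptions made for this example, not a JSON convention.

```python
import json

class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def encode_user(obj):
    # Tag the dict so the decoder can recognize it later
    if isinstance(obj, User):
        return {"__type__": "User", "name": obj.name, "age": obj.age}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_user(d):
    # object_hook is called for every decoded JSON object (dict)
    if d.get("__type__") == "User":
        return User(d["name"], d["age"])
    return d

text = json.dumps({"owner": User("Charlie", 35)}, default=encode_user)
restored = json.loads(text, object_hook=decode_user)

owner = restored["owner"]
print(type(owner).__name__, owner.name, owner.age)  # User Charlie 35
```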
YAML (YAML Ain't Markup Language) is a human-readable data serialization standard commonly used for configuration files. Python's PyYAML library can be used for YAML operations.
# Requires the third-party PyYAML package: pip install pyyaml
import yaml
data = {'name': 'Eve', 'skills': ['Python', 'Data Science']}
# Serialize to YAML
with open('data.yaml', 'w') as f:
    yaml.dump(data, f)
# Deserialize from YAML (safe_load does not construct arbitrary Python objects)
with open('data.yaml', 'r') as f:
    loaded = yaml.safe_load(f)
print(loaded)
CSV (Comma-Separated Values) is a simple format for storing tabular data. It is widely used for exporting spreadsheets and databases.
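import csv
rows = [['Name', 'Age'], ['John', 28], ['Anna', 22]]
# Write rows; newline='' lets the csv module control line endings
with open('people.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)
# Read rows back; every field is returned as a string
with open('people.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)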
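Because CSV stores no type information, numbers come back as strings when read. The sketch below shows one way to handle this with csv.DictWriter and csv.DictReader, converting the Age column back to int explicitly. An in-memory io.StringIO buffer stands in for a file here so the example has no side effects.

```python
import csv
import io

people = [{"Name": "John", "Age": 28}, {"Name": "Anna", "Age": 22}]

# An in-memory buffer stands in for a file opened with newline=''
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["Name", "Age"])
writer.writeheader()
writer.writerows(people)

buffer.seek(0)
loaded = []
for row in csv.DictReader(buffer):
    # Every CSV field is read back as a string, so convert explicitly
    row["Age"] = int(row["Age"])
    loaded.append(dict(row))

print(loaded)
```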
XML (eXtensible Markup Language) is a markup language that defines a set of rules for encoding documents. It is commonly used in legacy systems and some APIs.
import xml.etree.ElementTree as ET
# Parse XML from a string
xml_data = '''
<user>
    <name>Alice</name>
    <age>25</age>
</user>
'''
root = ET.fromstring(xml_data)
print(root.find('name').text)
# Build an XML document and write it to a file
user = ET.Element('user')
name = ET.SubElement(user, 'name')
name.text = 'Bob'
age = ET.SubElement(user, 'age')
age.text = '30'
tree = ET.ElementTree(user)
tree.write('user.xml')
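The two halves of that example can be combined into a round trip. The sketch below converts a flat dictionary to XML text and back; dict_to_xml and xml_to_dict are hypothetical helpers written for this illustration and only handle one level of nesting.

```python
import xml.etree.ElementTree as ET

def dict_to_xml(tag, fields):
    """Build <tag><key>value</key>...</tag> from a flat dict."""
    element = ET.Element(tag)
    for key, value in fields.items():
        child = ET.SubElement(element, key)
        child.text = str(value)  # element text must be a string
    return element

def xml_to_dict(element):
    """Inverse of dict_to_xml for flat elements; values come back as strings."""
    return {child.tag: child.text for child in element}

user = dict_to_xml("user", {"name": "Bob", "age": 30})
xml_text = ET.tostring(user, encoding="unicode")
print(xml_text)   # <user><name>Bob</name><age>30</age></user>

restored = xml_to_dict(ET.fromstring(xml_text))
print(restored)   # {'name': 'Bob', 'age': '30'}
```

As with CSV, the age value returns as the string '30', so any type conversion is the caller's responsibility.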
The shelve module provides a dictionary-like interface to persist arbitrary Python objects using a database file under the hood.
import shelve
# Store a value under a string key
with shelve.open('mydata') as db:
    db['key1'] = {'name': 'John', 'age': 40}
# Reopen the shelf and read the value back
with shelve.open('mydata') as db:
    print(db['key1'])
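One behaviour worth knowing about shelve is that mutating a stored object in place is not persisted unless the shelf is opened with writeback=True (or the value is reassigned to its key). The sketch below demonstrates the difference in a temporary directory; the key and field names are illustrative.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mydata")

    with shelve.open(path) as db:
        db["user"] = {"name": "John", "tags": []}
        # This mutates a temporary copy; the stored value is unchanged
        db["user"]["tags"].append("admin")

    with shelve.open(path) as db:
        without_writeback = db["user"]["tags"]

    with shelve.open(path, writeback=True) as db:
        # Accessed entries are cached and written back when the shelf closes
        db["user"]["tags"].append("admin")

    with shelve.open(path) as db:
        with_writeback = db["user"]["tags"]

print(without_writeback)  # []
print(with_writeback)     # ['admin']
```

The writeback option trades memory and a slower close for convenience, since every accessed entry is kept in a cache until the shelf is closed.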
Pickle supports multiple protocol versions for compatibility:
pickle.dump(obj, file, protocol=pickle.HIGHEST_PROTOCOL)
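To see what the protocol argument changes in practice, the sketch below serializes the same dictionary with every protocol the running interpreter supports and records the payload size. The exact sizes and the available protocol numbers depend on the Python version, so no specific values are assumed here.

```python
import pickle

data = {"name": "Alice", "age": 25, "skills": ["Python", "ML"]}

print("default:", pickle.DEFAULT_PROTOCOL, "highest:", pickle.HIGHEST_PROTOCOL)

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payload = pickle.dumps(data, protocol=protocol)
    # The protocol is detected automatically on load, so no hint is needed
    assert pickle.loads(payload) == data
    sizes[protocol] = len(payload)

print(sizes)
```

A higher protocol is usually more compact and faster, but a pickle written with it cannot be read by older Python versions that predate that protocol.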
Python can pickle and unpickle instances of user-defined classes:
import pickle
class Person:
    def __init__(self, name):
        self.name = name
p = Person('Tom')
# The class definition must be importable when the file is loaded again
with open('person.pkl', 'wb') as f:
    pickle.dump(p, f)
with open('person.pkl', 'rb') as f:
    loaded_person = pickle.load(f)
print(loaded_person.name)
In web applications, especially RESTful APIs, JSON is the de facto serialization format:
from flask import Flask, jsonify, request
app = Flask(__name__)
@app.route('/api', methods=['POST'])
def api():
    data = request.json
    return jsonify({'received': data})
| Format | Human Readable | Cross-Language | Use Case |
|---|---|---|---|
| Pickle | No | No | Python object persistence |
| JSON | Yes | Yes | APIs, Config files |
| YAML | Yes | Yes | Config files |
| CSV | Yes | Yes | Tabular data |
| XML | Yes | Yes | Legacy systems |
Understanding Python's serialization and persistence tools is essential for developing robust and scalable applications. Whether you need to store simple config files, transmit data between systems, or persist complex Python objects, Python provides a rich set of modules to handle these needs efficiently.
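Choosing the right serialization format depends on your use case: Pickle for Python-only object persistence, JSON for APIs and config files, YAML for config files, CSV for tabular data, and XML for legacy systems.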
Copyrights © 2024 letsupdateskills All rights reserved