Modern applications almost always need to save data, whether it’s user settings, a saved game, or a to-do list. To make your programs more permanent, you must embrace storage and persistence. Persistence is the concept of storing data so that it survives after the program has closed.
Table of Contents
- 1. Embrace Storage and Persistence
- 1.1.1 💻 Basic File Storage
- 1.1.2 💻 Serialization with Pickle and JSON
- 1.1.3 💻 Persistent Dictionaries with Shelve
- 1.2 1. Why Bother?
- 1.3 2. The Easy Wins: Files
- 1.4 3. Pickle: Python’s Magic Snapshots (with a Warning Label)
- 1.5 4. Structured Storage: SQLite
- 1.6 5. Level Up: Full Databases
- 1.7 6. Object Storage: Just Throw It in the Cloud
- 1.8 7. Caching: Fast but Fleeting
- 1.9 8. Persistence Patterns
- 1.10 9. Practical Tips
- 1.11 10. Mini Decision Flow
- 1.12 More Topics
Python provides several powerful and easy-to-use tools for this, ranging from simple text files to more advanced serialization methods.
Embrace Storage and Persistence
💻 Basic File Storage
The most straightforward way to store data is in a plain text file. Using Python’s built-in `open()` function, you can create or open a file in write mode (`’w’`) and use the `.write()` method to save string data. To read it back, you open the file in read mode (`’r’`) and iterate through the lines. The main challenge with this method is that all data, whether it’s a number, a list, or a dictionary, must first be converted to a string before writing. This can become cumbersome for complex data structures.
💻 Serialization with Pickle and JSON
To avoid manually converting complex Python objects to strings, you can use serialization. This is the process of converting an object into a format that can be stored or transmitted. Python’s `pickle` module is a native tool that can serialize almost any Python object into a byte stream. A more universal option is JSON (JavaScript Object Notation), a human-readable text format. The `json` module in Python works just like `pickle` but produces data that is interoperable with many other programming languages, making it a popular choice for web applications and APIs.
💻 Persistent Dictionaries with Shelve
When you need to store many different Python objects and access them conveniently, the `shelve` module is an excellent solution. A ‘shelf’ acts like a persistent dictionary. You can open a shelf file and then assign Python objects to keys, just as you would with a regular dictionary: `shelf[“feeds”] = my_feed_list`. The `shelve` module uses `pickle` behind the scenes to handle the serialization automatically. This provides a simple and intuitive way to save and retrieve the state of your application’s data between sessions.
1. Why Bother?
Without persistence, your app has amnesia. Everything vanishes when the process dies. Storing data means:
- Restarting doesn’t wipe user progress.
- Sharing data between runs (configs, logs, user info).
- Scaling: multiple users/sessions actually work.
2. The Easy Wins: Files
Plain text, JSON, CSV… your basic “just dump it” approach.
import json
data = {"name": "Lila", "score": 99}
with open("save.json", "w") as f:
json.dump(data, f)
with open("save.json") as f:
loaded = json.load(f)
Good for configs, small datasets, or when you like living dangerously with merge conflicts.
3. Pickle: Python’s Magic Snapshots (with a Warning Label)
import pickle
obj = {"nums": [1, 2, 3]}
with open("data.pkl", "wb") as f:
pickle.dump(obj, f)
with open("data.pkl", "rb") as f:
loaded = pickle.load(f)
Pros: saves any Python object.
Cons: insecure if you load data from untrusted sources (it can execute code). So don’t use it on files strangers hand you.
4. Structured Storage: SQLite
Batteries-included SQL database that lives in one file. No server needed.
import sqlite3
con = sqlite3.connect("app.db")
cur = con.cursor()
cur.execute("CREATE TABLE IF NOT EXISTS scores (name TEXT, points INT)")
cur.execute("INSERT INTO scores VALUES (?, ?)", ("Lila", 99))
con.commit()
cur.execute("SELECT * FROM scores")
print(cur.fetchall())
con.close()
Great middle ground: you get real queries, persistence, and no setup headache.
5. Level Up: Full Databases
When SQLite isn’t enough (multi-user, huge data, concurrency):
- PostgreSQL → powerhouse, reliable.
- MySQL/MariaDB → classic, widely supported.
- MongoDB → document-based, JSON-like.
Use them if your app grows beyond “single laptop science project.”
6. Object Storage: Just Throw It in the Cloud
For files, images, or blobs: S3, MinIO, or whatever the flavor-of-the-week provider is. Great when your “data” isn’t neat rows and columns.
7. Caching: Fast but Fleeting
Not all storage is forever. Sometimes you just want speed. That’s where Redis or Memcached come in—RAM-based persistence for hot data.
8. Persistence Patterns
- Config files: YAML, JSON, TOML.
- Logs: Append-only files.
- User data: Databases (SQLite → Postgres).
- Large assets: Object storage.
Mix and match, don’t try to cram everything into one method.
9. Practical Tips
- Always close files and DB connections (or use
with
). - Use migrations (Alembic, Flyway) if your schema changes.
- Backups aren’t optional. If it’s not backed up, it doesn’t exist.
- Encrypt sensitive data at rest.
10. Mini Decision Flow
- Is it tiny and human-readable? → JSON/YAML.
- Needs querying but single-user? → SQLite.
- Needs scale or multiple clients? → Postgres/MySQL.
- Just blobs/files? → Object storage.
- Temporary/fast? → Redis.
More Topics
- Python Coding Essentials: Reliability by Abstraction
- Python Coding Essentials: Different Types of Data
- Python Coding Essentials: Neater Code with Modules
- Python Coding Essentials: Lock Down with Data Encryption
- Python Coding Essentials: Files and Modules Done Quickly
- Python’s Itertools Module – How to Loop More Efficiently
- Python Multithreading – How to Handle Concurrent Tasks