Now that we have a working document-oriented database, we need to think about how to make its data persistent. In its current state, if the Python script stops or crashes, all stored data is lost, because the database exists only in the application's memory.
In production, data is stored on disk because disk storage is non-volatile: the data doesn't disappear when the system is turned off. Persistence can be achieved in different ways, such as serializing the data and writing it to a file, or using the built-in persistence mechanisms that some databases provide.
Assuming our current implementation is a dictionary, we can use the json module that ships with Python to serialize and deserialize our data. Serialization is the process of converting a data structure into a format that can be stored. Deserialization is the reverse process: converting the stored format back into the data structure.
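As a small illustration of the two directions, the sketch below round-trips a dictionary through a JSON string using json.dumps and json.loads. The record and its field names are made up for this example and are not part of the database built earlier. It also shows one caveat: JSON object keys are always strings, so non-string dictionary keys do not survive the round trip unchanged.

```python
import json

# A document-style record; the field names are hypothetical.
record = {"title": "Intro to AI", "tags": ["AI", "finance"], 1: "numeric key"}

# Serialize: Python dict -> JSON text (a str that can be written to disk).
text = json.dumps(record)

# Deserialize: JSON text -> Python dict.
restored = json.loads(text)

print(type(text).__name__)  # str
print(restored["tags"])     # ['AI', 'finance']

# JSON object keys are always strings, so the integer key 1 comes back as "1".
print(1 in restored, "1" in restored)  # False True
```

If the database relies on non-string keys, they have to be converted back after loading.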
We can use the json module to serialize our dictionary to a JSON file on disk and load it back into application memory as needed. This simple strategy turns our volatile in-memory storage into persistent storage, a significant step towards robust application data management. The code below shows how to serialize our dictionary to a JSON file and load it back into memory.
Keep in mind that this is a simplified example, and actual databases use much more sophisticated techniques for managing and persisting data.
import json

if __name__ == "__main__":
    db = {"AI": 1, "finance": 2, "programming": 3}
    print('Before persisting: ', db)

    # Save (serialize) dictionary to a json file
    with open('db.json', 'w') as json_file:
        json.dump(db, json_file)

    # Simulate a situation where the in-memory dictionary disappears
    del db

    # Load (deserialize) dictionary from a json file
    with open('db.json') as json_file:
        db = json.load(json_file)
    print('After persisting: ', db)
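
The same save and load steps could be wrapped in two small helpers. The sketch below is one possible way to do it, not part of the database built so far: the names save_db and load_db are hypothetical, and writing to a temporary file before swapping it into place with os.replace is an added precaution, so that a crash in the middle of a save is less likely to leave a half-written file behind.

```python
import json
import os
import tempfile


def save_db(db, path):
    """Serialize db to path, replacing any previous file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            json.dump(db, tmp_file)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        # Swap the finished file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_db(path):
    """Return the stored dictionary, or an empty one on first run."""
    try:
        with open(path, encoding="utf-8") as json_file:
            return json.load(json_file)
    except FileNotFoundError:
        return {}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "db.json")
        print(load_db(path))  # {} because nothing has been saved yet
        save_db({"AI": 1, "finance": 2, "programming": 3}, path)
        print(load_db(path))  # {'AI': 1, 'finance': 2, 'programming': 3}
```

Returning an empty dictionary when the file is missing lets the application start cleanly the first time it runs, before anything has been saved.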