I’ve implemented a rudimentary persistent object store in Python. It is implemented as a Python extension module that implements a set of persistent types: strings, integers, floats, lists, and dictionaries. Each of these are backed by the persistent object store, which is implemented using a memory-mapped file.
In addition, using a specially crafted Python base class for Python objects, Python objects may be stored in the object store (as dictionaries) and instantiated out of the object store.
The result is an persistent object graph (the root of which is a persistent dictionary) whose objects and attributes may be manipulated in-place using native Python syntax. Rudimentary locking is provided so that multiple Python threads / processes may concurrently manipulate the object store.
Some aspects of this system are:
It is a Python extension module written in C and C++.
It is tested on Linux. It will likely work on *BSD systems, though it is possible that the location of the mapped storage may need to be moved.
It is implemented in a hierarchical manner:
A page manager handles the allocation of 4kB pages within an mmaped file. It is multi-process safe. It is, in a sense, a glorified sbrk for an mmaped file.
A heap manager abstracts the page manager’s services to manage the allocation and deallocation of arbitrary-sized storage segments within the mmaped file. It is essentially a malloc and free for an mmaped file. This is also multi-process safe.
An object manager manages five new base types (persistent int, float, string, list, and dictionary) backed by persistent storage, using the heap manager’s services. It also provides rudimentary locking facilities for concurrency-safeness.
The persist Python extension uses the object manager’s services to implement persistent types that mimic the equivalent Python types. Additionally, it has the “smarts” to reinstantiate a Python object that was stored as a dictionary (using the appropriate Python base class). The object manager’s locking facilities are made available for application usage.
Only one file may be mapped at a time (because it is mapped to a fixed logical address).
It is available for use under the MIT license. Contact me if you are interested in using it.
Some examples of its use are a simple counter:
import persist root = persist.root() root.lockExcl() try : root[’x’] += 1 # Increment counter except : root[’x’] = 1 # First pass; initialize print “Content-type: text/html\n” print “<p>You are visitor “ + str(root[’x’]) + “ to visit this site!</p>” root.unlock()
and rudimentary objects:
import persist
from pbase import pbase
class person (pbase) :
def __init__(self, name = “”, age = 0) :
pbase.__init__(self)
self.name = name
self.age = age
def printAge(self) :
print “<p>” + self.name + “ is “ + str(self.age) + “ years old</p>”
root = persist.root()
root.lockExcl()
if not root.has_key(’Joe’) : # First time through
root[’Joe’] = person(’Joe’, 27)
if not root.has_key(’John’) : # First time through
root[’John’] = person(”John”, 29)
# On subsequent passes we will retrieve the objects stored on the first pass.
print “Content-type: text/html\n”
root[’Joe’].printAge()
root[’John’].printAge()
root.unlock()
You might also be interested in the PyPerSyst project, whose goals are to implement the prevayler concept for Python. Prevalence is in some ways similar to my scheme above, but it tends to value storing objects in memory rather than in memory-mapped files. As a result, it is better for server applications, whereas my approach is better for short-running scripts to generate web pages. Prevalence also tends to incorporate some elements of transaction processing, whereas my system is nothing more than a simple object graph.