E pur si muove

Singletons in a Python extension module

Tuesday, June 23, 2009

If you want a singleton in a C extension module for CPython, you basically do the same as in plain Python: the __new__() method needs to return the same object each time it is called. Implementing this has a few catches, though.
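For comparison, the plain-Python approach referred to above can be sketched roughly like this. The class name MySingleton and the _instance attribute are made up for illustration; this is a minimal sketch, not the only way to do it.

```python
class MySingleton:
    """Hypothetical example class: every call returns the same instance."""

    # Class-level slot that remembers the first instance, much like a
    # C-level static pointer would.
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            # First call: actually allocate the object.
            cls._instance = super().__new__(cls)
        # Every call, first or not, hands back the stored instance.
        return cls._instance


a = MySingleton()
b = MySingleton()
print(a is b)
```

The C version has to do the same bookkeeping by hand, including the reference counting that Python takes care of here.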

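static PyObject *
MyType_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    /* Static, so the pointer survives between calls. */
    static MyObject *self = NULL;

    if (self == NULL)
        self = (MyObject *)type->tp_alloc(type, 0);
    /* Hand every caller its own reference; the X variant tolerates
       a NULL result from a failed allocation. */
    Py_XINCREF(self);
    return (PyObject *)self;
}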

Then assign this function to the tp_new slot in the type. There are two things of interest in this function:

  • self is declared as static. Normally this would not be the case, but declaring it as static keeps it alive after the function has returned, still pointing to the first instance of our object. The next time a new object is asked for, that instance is simply returned.
  • Before returning the pointer to the static self, the reference count is increased. This may seem odd but is the right thing to do: PyType_GenericAlloc() (reached through the pointer in the tp_alloc slot) only sets the reference count on the first call, so without the extra increment subsequent calls to the new function would hand out the same object without the count going up. Each caller still drops its reference eventually, so you would end up with negative reference counts, which doesn't make Python very happy. This does mean you never end up deallocating the object, since the lowest reference count you can reach is 1, but you wanted a singleton, right? If you really want the reference count to be able to drop to 0, you can always put the Py_INCREF() in an else clause. Note that nothing resets the static pointer when the object is then deallocated, so it would be left dangling.
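The lifetime consequence of the second point can be illustrated with a plain-Python analogy. This assumes a class attribute standing in for the C static pointer; the names are hypothetical, and it only mirrors the behaviour and is not the C API itself. A weak reference is used as a probe to show that the instance survives after the last caller has let go of it.

```python
import gc
import weakref


class MySingleton:
    """Hypothetical class; _instance stands in for the C static pointer."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


first = MySingleton()
probe = weakref.ref(first)  # observes the object without keeping it alive

del first     # the only caller drops its reference
gc.collect()  # not strictly needed in CPython, added for clarity

# The class attribute still holds a reference, so the object is alive...
still_alive = probe() is not None
# ...and the next request gets the very same object back.
same_object = MySingleton() is probe()
```

The extra reference kept by the class plays the role of the unconditional increment in the C function: the count never reaches zero, so the object is never deallocated.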


Patricio said...

You can also use the atexit() function to delete your object when the process is exiting. The only pitfall is that you have to make sure that Py_Finalize() is not called before you decref your object. This is important because plugins that link against Python may choose to de-initialize Python from atexit: Python should be deinitialized after the last instance is deleted, and currently the Python interpreter can't be trusted to be initialized and deinitialized more than once.

This should be in the standard docs for writing extension modules.

Graham Dumpleton said...

You can't use the C atexit() call, as it is all but guaranteed to be called after Py_Finalize(). The only time it wouldn't be is when Py_Finalize() isn't actually called on process shutdown.

Python has its own atexit module, whose callbacks are triggered as part of interpreter shutdown within Py_Finalize(), but that only works for callbacks registered in the main interpreter, not for any registered in a sub interpreter.

The question of sub interpreters is a bit moot anyway, as the original C code isn't safe for use in sub interpreters: the C static is seen by all sub interpreters. The result is that you can get dangerous reuse of Python objects created in one sub interpreter within the context of another.

About the only safe way of doing this, which ensures cleanup on interpreter shutdown and works for multiple interpreters within a process, is for the C extension module to have a corresponding Python module, with the singleton object stored in it instead. This works because each sub interpreter will have its own copy of that Python module, whereas the C parts of the module are effectively shared.
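A minimal sketch of what such a companion Python module might contain is shown below. The get_instance() helper, its factory parameter and the _instance variable are assumptions made for illustration, not part of the comment. The C code would presumably import this module (for example with PyImport_ImportModule()) and call the helper instead of keeping a C static.

```python
# Contents of a hypothetical companion module that lives next to the
# C extension. Each interpreter imports its own copy of this module,
# so each interpreter gets its own _instance.
_instance = None


def get_instance(factory):
    """Return the per-interpreter singleton, creating it on first use.

    `factory` is any callable that builds the object; in real use it
    would be the extension type's ordinary allocation path.
    """
    global _instance
    if _instance is None:
        _instance = factory()
    return _instance


# Usage, with object as a stand-in factory:
first = get_instance(object)
second = get_instance(object)
```

Since the reference lives in module globals, it is released when the interpreter tears the module down, which is the cleanup behaviour the comment is after.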

Note that Python 3.0 cleans this up a bit with better ways of tracking per interpreter data in C context and ensuring it is destroyed when the interpreter the C extension module is used in is destroyed.
