E pur si muove

Finding memory leaks in Python extension modules

Saturday, December 13, 2008

Valgrind is amazing. Even if you've never used it before, its basic usage is really simple:

$ valgrind --tool=memcheck --leak-check=full /usr/bin/python ./

This is enough to point you at the places in your extension module where you allocated memory that never got freed. This is a massive timesaver compared with looking over the entire source file again to find out where you made your mistakes.

I must admit that the extension module in question uses malloc(3) and free(3) directly instead of allocating on the Python heap using PyMem_Malloc() and PyMem_Free(), so I don't know if that would make it harder to find the leaks. I can imagine that in that case the "blocks definitely lost" list might point to somewhere in Python's source files instead of your own source files, but I don't know.
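If you run this often, the invocation can be wrapped in a small helper. The following is only a sketch: the function names and the script name in the usage note are made up for illustration, and the flags are the ones from the command above. The PYTHONMALLOC=malloc setting is an addition that only applies to CPython 3.6 and later, where it makes the interpreter's own allocators call the C library malloc() directly, which should keep leak reports pointing at individual allocations.

```python
import os
import subprocess


def valgrind_command(script, python="/usr/bin/python"):
    """Return the argv list for running `script` under memcheck."""
    return [
        "valgrind",
        "--tool=memcheck",
        "--leak-check=full",
        python,
        script,
    ]


def valgrind_env(base_env=None):
    """Copy the environment and request plain malloc() (CPython 3.6+ only)."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: harmless on older interpreters, which ignore this variable.
    env["PYTHONMALLOC"] = "malloc"
    return env


def run_under_valgrind(script):
    """Run the script under valgrind; needs valgrind installed on the PATH."""
    return subprocess.call(valgrind_command(script), env=valgrind_env())
```

Calling run_under_valgrind("test_ext.py"), with test_ext.py being a hypothetical script that exercises the extension module, then prints the usual memcheck report on stderr.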

gcc, LD_RUN_PATH & Solaris

Thursday, December 11, 2008

I actually started writing a long post about how linking against shared libraries works and what the linker does and how, but that got very long. So here is the short version, which assumes pre-existing knowledge.

If you want an RPATH in your DSOs (dynamic shared objects: executables and shared libraries) you can pass -R/-rpath to the linker (the runtime_library_dirs keyword argument to distutils's Extension class). This is not always very practical though, e.g. when building Python it would be a major pain to modify all the makefiles, and error-prone too.

So the linkers (both GNU and Solaris) also accept an environment variable: LD_RUN_PATH. When that is set it is used to populate the RPATH field, but only when no explicit -R/-rpath is specified. So far so good.

On Solaris, however, your gcc will usually not be installed in the root system but rather in /usr/sfw (Sun-supplied) or /opt/csw/gccX (OpenCSW). So the gcc libs will also not be in the default library search path. But gcc is nice and helps you out: it will implicitly add a -R option to ld pointing to its own library directory (for libgcc). Now I'm not sure how nice exactly this is, since it screws your LD_RUN_PATH environment variable over and you actually need to trace gcc to see this happening. Have fun finding that! It would be nice if gcc extended the environment variable instead when you have it set but are using no -R flags. Oh well, at least now you know.
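The precedence rule is easy to get wrong, so here is a toy model of it in Python. This is not how any linker is implemented, just a sketch of the behaviour described above; the function name and the example directories (/opt/mylibs/lib, /usr/sfw/lib) are made up for illustration.

```python
def effective_rpath(linker_rpath_flags, environ):
    """Toy model of how the RPATH field gets populated.

    linker_rpath_flags: directories handed to ld via -R/-rpath, including
        any that the compiler driver adds implicitly.
    environ: mapping that may contain LD_RUN_PATH (colon-separated).
    """
    if linker_rpath_flags:
        # Any explicit -R/-rpath wins; LD_RUN_PATH is ignored entirely.
        return list(linker_rpath_flags)
    run_path = environ.get("LD_RUN_PATH", "")
    return [d for d in run_path.split(":") if d]


env = {"LD_RUN_PATH": "/opt/mylibs/lib"}

# No -R flags at all: LD_RUN_PATH ends up in the RPATH.
print(effective_rpath([], env))

# gcc silently added -R for its own libs: LD_RUN_PATH is dropped.
print(effective_rpath(["/usr/sfw/lib"], env))
```

The second call is the Solaris situation: you passed no -R yourself, yet the variable you set has no effect because the compiler driver supplied one for you.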

Mocking away the .__init__() method

Saturday, December 06, 2008

I wanted to test a method on a class that didn't really depend on a lot of other stuff of the class. Or rather, what it did depend on I had already turned into Mock objects with appropriate .return_value settings, asserting with .called etc. The problem was that the .__init__() method of the object invoked about half the application framework (option parsing, configuration file loading, setting up logging etc.). Firstly, I don't really feel like testing all of that (hey, these are called unit tests after all, and those functionalities have their own!), and secondly, I would then have to worry about way too much; there is quite a lot to set up.

That's how the whacky idea of replacing the .__init__() method with a mock occurred to me:

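import mock

class TestSomething(object):
    def test_method(self):
        # Replace the constructor so creating an instance has no side effects.
        module.Klass.__init__ = mock.Mock(return_value=None)
        inst = module.Klass()
        inst.other_method = mock.Mock()
        inst._Klass__log = mock.Mock()
        # more of this
        # ... call the method under test here ...
        assert inst.other_method.called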

The ugly side effect of deciding to mock away the .__init__() method like this is that I have to create mocks for more internal stuff. The one shown, for example, is normally provided by self.__log = logging.getLogger('foo').

I must admit that I'm still trying to find my way in how to use the mock module effectively, and hence I'm not really sure how sane this approach is. One of my objections to this is that not only am I meddling with clearly hidden attributes of the class, but I also have to do this again and again for each test method. So the next revision (I'm using py.test as the testing framework here, btw):

class TestSomething(object):
    def setup_class(cls):
        cls._original_init_method = module.Klass.__init__
        module.Klass.__init__ = mock.Mock(return_value=None)

    def teardown_class(cls):
        module.Klass.__init__ = cls._original_init_method

    def setup_method(self, method):
        self.inst = module.Klass()
        self.inst._Klass__log = mock.Mock()
        # more of this

    def test_method(self):
        self.inst.other_method = mock.Mock()
        # ... call the method under test here ...
        assert self.inst.other_method.called

This is actually workable and I'm testing what I want to test in a pretty isolated way. I'm still wondering whether I've gone insane or not, though. Is it reasonable to replace .__init__() with mock objects? Have other people done this?
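One possible way to drop the manual save-and-restore of .__init__() is to let the mock library do it via patch.object. The following is a minimal sketch, assuming a mock version that provides patch.object (the standard library's unittest.mock does). Klass, method and make_instance are hypothetical stand-ins for module.Klass and friends, not the real code.

```python
from unittest import mock


class Klass(object):
    """Hypothetical stand-in for module.Klass with a heavy constructor."""

    def __init__(self):
        # Imagine option parsing, config file loading, logging setup, ...
        raise RuntimeError("too heavy for a unit test")

    def method(self):
        self.__log.info("calling other_method")
        return self.other_method()

    def other_method(self):
        return "real result"


def make_instance():
    """Create a Klass without running its real __init__()."""
    # The original __init__ is put back when the with block exits.
    with mock.patch.object(Klass, "__init__", return_value=None):
        inst = Klass()
    # Attributes that __init__() would normally have set up.
    inst._Klass__log = mock.Mock()
    return inst


def test_method():
    inst = make_instance()
    inst.other_method = mock.Mock(return_value="mocked")
    assert inst.method() == "mocked"
    assert inst.other_method.called
    assert inst._Klass__log.info.called
```

The hidden attributes still have to be filled in by hand, so this only tidies up the bookkeeping; it does not answer whether mocking .__init__() is a good idea in the first place.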
