devork

E pur si muove

weakref and circular references: should I really care?

Wednesday, May 12, 2010

While Python has a garbage collector, pretty much whenever circular references are touched upon it is advised to use weak references or otherwise break the cycle. But should we really care? I'd like not to; it seems like something the platform (the Python VM) should just provide for me. Are all those mentions really just for the cases where you can't (or don't want to, e.g. embedded) use the normal garbage collector?

Update: By some strange brain failure I seem to have written "imports" rather than "references" in the title originally. Circular imports are obviously a bad thing.

3 comments:

Unknown said...

In my experience circular import is mostly a problem of actual code breakage:

A.py:
import B
def x():
    pass

B.py:
import A
A.x()

The call A.x() fails b/c A is not yet fully loaded. It would work, however, if "def x()" were above "import B" in module A.py.

Unless you reload or unload modules, there's no real issue w/ the references being circular for modules. They just persist for the life of the process.

As far as other circular object refs go, it may boil down to whether or not you write long-running code (e.g. a GUI app or server) or very memory-intensive code, where those cycles may lead to memory usage that matters. In a lot of code, this just is not the case. When it is, you may need to pay some attention to object life cycle, and weakrefs can be a useful tool for that.

Note that writing long-running code does not automatically mean you need to worry about this. If I remember correctly (and I probably am ;-) Python's garbage collector can identify disconnected graphs of objects that are no longer referenced from the outside and blow them away. This makes it a lot easier to make sure that you've identified objects as trash for the collector.
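A minimal sketch of that behaviour, assuming CPython: two objects that reference only each other survive plain reference counting, and are freed once the cycle detector runs. The `Node` class and `make_cycle` helper are made up for illustration, and the collector is disabled briefly only so the timing is predictable.

```python
import gc
import weakref


class Node:
    """A plain object that can point at another Node."""

    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    # Return only a weak reference, so nothing outside the cycle keeps it alive.
    return weakref.ref(a)


gc.disable()                          # keep the collector from running on its own
probe = make_cycle()
alive_before = probe() is not None    # reference counting alone cannot free the cycle
gc.collect()                          # run the cycle detector by hand
alive_after = probe() is not None
gc.enable()
```

In normal code there is no need to call `gc.collect()` yourself; the collector runs periodically, which is exactly why the moment of cleanup is uncertain.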

Even so, there is sometimes value in explicitly destroying objects, which you can do fairly effectively with a destroy() method that does self.__dict__.clear() (possibly after other things, like closing files). This can be more manageable than waiting for GC to happen at some uncertain time in the future and may in some cases be easier than using weakrefs.
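Here is a rough sketch of such a destroy() method. The `Resource` class, its `handle` and `peer` attributes, and the use of `io.StringIO` as a stand-in for a real file are all assumptions for the sake of the example.

```python
import io
import weakref


class Resource:
    """Hypothetical object that owns a file-like handle and sits in a cycle."""

    def __init__(self, peer=None):
        self.handle = io.StringIO("payload")
        self.peer = peer

    def destroy(self):
        # Release external resources first, then drop every attribute,
        # which removes all of this object's outgoing references.
        self.handle.close()
        self.__dict__.clear()


left = Resource()
right = Resource(peer=left)
left.peer = right            # left <-> right now form a cycle

handle = left.handle         # keep a name for the handle so we can inspect it later
probe = weakref.ref(right)   # lets us observe when right is freed
left.destroy()               # cycle broken: left no longer points at right
del right                    # drop the last strong reference to right
```

On CPython, `right` is freed as soon as its last reference goes away, without waiting for the cycle detector. Other implementations may free it later.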

So I think the bottom line is that in many types of code you don't need to care, but there are cases where you do need to pay attention to object life cycle in one way or another.

Unknown said...

Firstly: sorry about the imports thing, I didn't mean to write that. The problems of having circular imports are obvious.

On the normal circular references thing, I'm tempted to agree that the gc will properly detect cycles and clean them up eventually. But the documentation of e.g. xml.dom.minidom seems to suggest an explicit .unlink() call (like you suggest). The stdlib goes out of its way to use weak references in other places too.

I guess a library needs to be kinder than an application here, and hence try harder to avoid creating cycles, or provide ways to break them, so that the application writer can decide whether to let the gc do its work or break them manually.
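One way a library might avoid the cycle in the first place is to keep the back-reference weak. This is only a sketch; `TreeNode` and its attributes are hypothetical, not taken from any stdlib module.

```python
import weakref


class TreeNode:
    """Hypothetical tree node: strong references down, weak reference up."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        # Store only a weak reference to the parent so that
        # parent <-> child does not form a reference cycle.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        # None for the root, or if the parent has already been freed.
        return self._parent() if self._parent is not None else None


root = TreeNode("root")
child = TreeNode("child", parent=root)
parent_name = child.parent.name

del root                          # no cycle, so CPython frees the parent right away
orphaned = child.parent is None
```

The trade-off is that callers have to cope with `parent` being None once the parent is gone, which is the price for not relying on the cycle collector.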

Michael Foord said...

Well, yes - it would be nice if you didn't have to worry about them.

In general you *don't* have to worry about them (except that garbage collection is now non-deterministic - but this is the case for all *good* implementations of Python anyway ;-)

The problem comes when you have cycles involving objects with __del__ methods. Python doesn't know which order the __del__ methods should be called in (the __del__ methods *could* reference objects that have already been cleaned up) and so it doesn't even attempt to clear the cycle. This means you can have uncollectable cycles and leak memory.

If you aren't using objects with __del__ then don't worry about it.
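A small probe for the __del__ case, with the `Noisy` class and the `finalized` list made up for illustration. The outcome depends on the interpreter: on the CPython versions current when this was written, such a cycle ends up in gc.garbage; CPython 3.4 and later (PEP 442) run the finalizers and collect the cycle instead.

```python
import gc

finalized = []   # names of objects whose __del__ actually ran


class Noisy:
    """Hypothetical class with a finalizer, used to build a cycle."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


a = Noisy("a")
b = Noisy("b")
a.other, b.other = b, a      # a <-> b cycle, both with __del__
del a, b                     # only the cycle keeps them alive now
gc.collect()

# Objects the collector found but refused to free end up in gc.garbage.
uncollectable = [obj for obj in gc.garbage if isinstance(obj, Noisy)]
```

Either `finalized` holds both names (the cycle was collected) or `uncollectable` holds both objects (the cycle leaked), which makes it easy to check what a given interpreter does.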

PyPy, and I assume also Jython and IronPython that both use the garbage collection mechanisms of their underlying platforms, *will* collect cycles like this by arbitrarily breaking the cycles.

Avoiding cycles is one of those 'good practices' that is not a hard rule.
