devork

E pur si muove

Python optimisation has surprising side effects

Tuesday, April 27, 2010

Here's something that surprised me:

a = None
def f():
    b = a
    return b
def g():
    b = a
    a = 'foo'
    return b

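While f() is perfectly fine, g() raises an UnboundLocalError. This is because Python optimises access to local variables using the LOAD_FAST/STORE_FAST opcodes: since g() assigns to a somewhere in its body, the compiler treats a as a local variable throughout the whole function, including on the line that reads it before the assignment. You can easily see this by looking at the code objects of those functions: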

>>> f.__code__.co_names
('a',)
>>> f.__code__.co_varnames
('b',)
>>> g.__code__.co_names
()
>>> g.__code__.co_varnames
('a', 'b')
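The same difference shows up in the bytecode itself. Here is a minimal sketch using the standard dis module. The helpers opnames and uses_global_lookup are names made up for this illustration, and since the exact opcode names vary between CPython versions (newer releases may show variants such as LOAD_FAST_CHECK), it only checks the prefix:

```python
import dis

a = None

def f():
    b = a
    return b

def g():
    b = a
    a = 'foo'
    return b

def opnames(func):
    """Return the opcode names compiled for func (hypothetical helper)."""
    return [ins.opname for ins in dis.get_instructions(func)]

def uses_global_lookup(func):
    # LOAD_GLOBAL means the name is looked up in globals/builtins at run time.
    return any(op.startswith("LOAD_GLOBAL") for op in opnames(func))

print(uses_global_lookup(f))  # True: f reads the module-level a
print(uses_global_lookup(g))  # False: g treats a as local, so no global lookup
```

Printing opnames(g) should show only the "fast" local-variable opcodes for a and b, though the exact spelling depends on the interpreter version.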

I actually found out about this difference thanks to finally watching the Optimizations And Micro-Optimizations In CPython talk by Larry Hastings from PyCon 2010. I never realised that you could create a situation where the non-local scopes would not be searched at all.
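For completeness, a small sketch of what this looks like at run time, and of the usual way out when the assignment really is meant to hit the module-level name: declaring it global. g_fixed is a name I made up for this example, not something from the talk.

```python
a = None

def g():
    b = a          # a is local to g because of the assignment below
    a = 'foo'
    return b

def g_fixed():
    global a       # tell the compiler that a is the module-level name
    b = a
    a = 'foo'
    return b

try:
    g()
except UnboundLocalError as exc:
    # UnboundLocalError is a subclass of NameError
    print(type(exc).__name__)

print(g_fixed())   # the old value of the global a, i.e. None
print(a)           # the global has now been rebound to 'foo'
```

Note that g_fixed() is not the same function as the g() I had in mind: it rebinds the global as a side effect, which is exactly what the original g() does not do.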

5 comments:

babui said...

The problem is that in g, a is rebound inside the function, so Python considers a a local variable during the function's evaluation. When evaluating b=a, a has no local binding yet, so it fails.

Antoine P. said...

This is not because of optimizations, this is by design. Your code is simply wrong. All Python implementations (should) raise the same error on this code.

Floris Bruynooghe said...

I'm not saying this is a problem nor am I saying I want to write code like this, in fact I've never encountered this in real code. I just found it an interesting gotcha: from just looking at the code you'd expect both to work (but g() would do a meaningless local assignment), you need to know the actual generated code to understand why this isn't so.

And I know this is by design, but the design was to optimise access to local variables, rather than look in several dicts each time. Which is why I called it an optimisation (AIUI this wasn't always done).

Masklinn said...

> And I know this is by design, but the design was to optimise access to local variables, rather than look in several dicts each time. Which is why I called it an optimisation (AIUI this wasn't always done).

I don't think that's true. Originally, the scopes were completely split (between global and local) and it simply has remained that way ever since. It's not an issue of "optimization", it's just an issue of Python's scoping being broken and (as in JavaScript) Python essentially hoisting variable declarations to the top of their scope (so the local "a" variable is pretty much just defined at the start of "g", before "b = a" is even performed).
