Taming Python


Gigi Sayfan specializes in cross-platform object-oriented programming in C/C++/C#/Python/Java with emphasis on large-scale distributed systems. He is currently trying to build intelligent machines inspired by the brain at Numenta (www.numenta.com).


Python is a great dynamic language. In what way? For one thing, it allows you to add attributes to objects anytime. Consider the following class A:


class A(object):
  def __init__(self):
    self.x = 5
    
  def set_y(self, value):
    self.y = value

When you instantiate it as in:


a = A()

The __init__() method is executed and the x attribute is set to 5. However, the object has no y attribute just yet:


print 'a.x:', a.x
try:
  print 'a.y:', a.y
except AttributeError:
  print "a has no 'y' attribute"
  

Output:


a.x: 5
a.y: a has no 'y' attribute

Now, if you call the set_y() method, the a object suddenly grows a 'y' attribute. You can even add a 'z' attribute willy-nilly from outside the class:


a.set_y(8)
print 'a.y:', a.y

a.z = 12
print 'a.z:', a.z

Output:


a.y: 8
a.z: 12

What's going on here? It's actually pretty simple. Each Python object keeps all its attributes in a special dictionary (a collection of key-value pairs) called __dict__ and anyone can access the __dict__ of any object and add/delete/modify items. Here is the __dict__ after y and z were added:


print a.__dict__
{'y': 8, 'x': 5, 'z': 12}

Note that __dict__ is not ordered by insertion order (this holds for the Python 2 dictionaries shown here; since Python 3.7, dictionaries preserve insertion order). Python lets you remove attributes using the del statement and check for the existence of attributes using the hasattr() function. Here I verify that a has a 'z' attribute, remove it, and verify it doesn't have a 'z' attribute anymore:


>>> assert hasattr(a, 'z')
>>> del a.z
>>> assert not hasattr(a, 'z')

You can also delete an attribute directly by working with the __dict__. Here I check that 'y' is in the __dict__, remove it from the __dict__, and check that it is no longer in the __dict__:


>>> assert 'y' in a.__dict__
>>> del a.__dict__['y']
>>> assert 'y' not in a.__dict__

It looks like hasattr() is not really needed, but that's not the case. Python objects have many attributes that are not in their __dict__. For example, the __dict__ itself is not in the __dict__. Python objects get many attributes from their types. For example, the __init__() and set_y() methods are also attributes that the 'a' object gets from its class A:


>>> a.set_y
<bound method A.set_y of <code.A object at 0x411730>>

>>> hasattr(a, 'set_y')
True

>>> 'set_y' in a.__dict__
False

The bottom line is that it's good to know about the __dict__, but you should use hasattr() if you want to test for the existence of an attribute. If you want to see all the attributes, try the dir() function:


>>> dir(a)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__',
'__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', 
'__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'set_y', 'x']
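To see where the attributes found by hasattr() actually live, here is a small sketch written for illustration (it is not part of the Numenta code). It redefines the class A so the snippet is self-contained, and it uses the print() function form, so it assumes Python 3 rather than the Python 2 used in the listings above.

```python
class A(object):
    def __init__(self):
        self.x = 5

    def set_y(self, value):
        self.y = value


a = A()

# Instance attributes live in the instance's own __dict__.
in_instance = 'x' in a.__dict__

# Methods live in the class's __dict__, not in the instance's.
method_in_instance = 'set_y' in a.__dict__
method_in_class = 'set_y' in type(a).__dict__

# hasattr() follows the normal attribute lookup, so it finds both kinds.
print(in_instance, method_in_instance, method_in_class)
print(hasattr(a, 'x'), hasattr(a, 'set_y'))
```

The instance __dict__ holds only what was assigned on the object itself; everything else is found by looking at the class (and its base classes), which is why hasattr() is the more reliable test.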

In some situations this enormous flexibility becomes a burden. At Numenta where I work, we use Python objects as plugins to a runtime engine (server) that communicates with a client that may potentially be on another machine. This means that the communication goes through a special protocol and operations on the plugins such as initialization, execution and persistence are orchestrated carefully. The runtime engine itself is implemented in C++ and interacts with the Python object indirectly through a C++ wrapper. A common scenario is that the Python object creates new attributes during its execution.

But this also caused a lot of confusion and subtle bugs. Some attributes weren't available at the right time. Typos couldn't be detected easily (setting an attribute with the wrong name just created a new attribute). To make things even worse, a few generations of changes led to various attributes becoming obsolete and/or deprecated. These attributes were created on-the-fly when loading an old object from a file.
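The typo problem is easy to reproduce. The following sketch uses a hypothetical Counter class (not from the Numenta code) and assumes Python 3 for the print() calls. The misspelled assignment raises no error; it just creates a second attribute next to the intended one.

```python
class Counter(object):          # hypothetical class, for illustration only
    def __init__(self):
        self.count = 0

    def reset(self):
        # Typo: 'cuont' instead of 'count'. Python raises no error;
        # it simply creates a brand-new attribute.
        self.cuont = 0


c = Counter()
c.count = 3
c.reset()

print(c.count)             # still 3: reset() never touched 'count'
print(sorted(c.__dict__))  # ['count', 'cuont']
```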

This aspect of Python became a real obstacle. The solution that emerged was that new attributes should be created only in the __init__() and __setstate__() methods. The __init__() method is called when a new object is instantiated for the first time. The __setstate__() method, if the class defines one, is called when an object is loaded from a "pickle" file. Pickle is simply Python's built-in serialization scheme; it allows persisting Python objects across multiple sessions of the application.
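Here is a minimal sketch of how the two methods divide the work. The Plugin class and its cache attribute are hypothetical names chosen for illustration, and the snippet assumes Python 3 and that it runs as a regular script or module (pickle needs to find the class by name). Note that pickle does not call __init__() when loading, so __setstate__() is the only place where attributes missing from the file can be recreated.

```python
import pickle


class Plugin(object):           # hypothetical plugin class
    def __init__(self):
        self.x = 5
        self.cache = None       # runtime-only, not worth persisting

    def __getstate__(self):
        # Persist everything except the runtime-only cache.
        state = self.__dict__.copy()
        del state['cache']
        return state

    def __setstate__(self, state):
        # Called by pickle instead of __init__() when loading.
        self.__dict__.update(state)
        self.cache = None       # recreate the attribute missing from the file


p = Plugin()
p.x = 42
restored = pickle.loads(pickle.dumps(p))

print(restored.x)       # 42
print(restored.cache)   # None
```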

Okay, so this policy means that methods like set_y() are invalid (unless called directly or indirectly from __init__() or __setstate__()) because they create a new attribute. Also, external code that tries to create new attributes, like 'a.z = 12', is forbidden. If you don't know what value an attribute should have at __init__() time, just set it to None.

Well, that sounds great, but just telling programmers about this magnificent new policy is not enough. Programmers are notorious for being a confused and/or belligerent bunch, and even the most obedient programmers can't escape typos. The Numenta software has many Python plugins, which are pretty big objects (some have 20-30 attributes), and it's not an easy task even to detect violations of the policy, not to mention fixing them. What is needed is an automated approach that can enforce the policy without exception. This is where the lock attributes pattern comes into play. It is a pretty sophisticated solution that involves intercepting attribute setting and checking for violations. When a violation is detected, an exception is raised.
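To give a feel for the general idea, here is a deliberately simplified sketch. It is not the Numenta implementation; the names Locked, lock() and LockAttributesError are assumptions made up for this example. It intercepts attribute setting with __setattr__() and raises an exception when code tries to create a new attribute after the object has been locked.

```python
class LockAttributesError(AttributeError):   # hypothetical exception name
    pass


class Locked(object):                        # hypothetical base class
    _locked = False                          # class-level default: unlocked

    def __setattr__(self, name, value):
        # Once locked, only attributes that already exist may be assigned.
        if self._locked and not hasattr(self, name):
            raise LockAttributesError(
                "cannot create new attribute '%s'" % name)
        object.__setattr__(self, name, value)

    def lock(self):
        # Bypass the check above when flipping the flag itself.
        object.__setattr__(self, '_locked', True)


class A(Locked):
    def __init__(self):
        self.x = 5
        self.y = None      # unknown at __init__() time, so set to None
        self.lock()        # no new attributes after this point

    def set_y(self, value):
        self.y = value     # fine: 'y' already exists


a = A()
a.set_y(8)
try:
    a.z = 12               # violates the policy
except LockAttributesError as e:
    print(e)
```

With this in place, set_y() is legal again because 'y' is created in __init__(), while the stray 'a.z = 12' fails loudly. A fuller version would also have to unlock the object while __setstate__() runs, since the policy allows new attributes there too.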

