February 25, 2006

Implementing IronPython

Another full talk, this time from Jim Hugunin, author of both Jython and IronPython. Jim summarised the state of the CLR, pointing out that it's now a standard. Jim enjoys seeing the IronPython download link on a web page with the Visual Studio icon on it.

The last year has been spent on ensuring that IronPython can pass as many as possible of the standard Python regression tests. Certain tests have been modified, because they test implementation optimizations from CPython. Some differences are clear-cut (IronPython doesn't use refcounting, for example). Others are arguable, such as the use of Unicode attribute names, which people do expect to be able to use in the CLR world.

The structure of the IronPython compiler is now (since CPython's AST branch was implemented) closely aligned with the CPython compiler, but of course it produces IL operations for the CLR rather than the Python bytecodes of CPython. Python bytecodes are simpler than IL bytecodes, but there is a significant speed gain from the CLR's use of JIT compilation into machine code. Not every operation can run 100 times as fast on IronPython, but some operations become a single machine-code instruction. Jim wishes he could get that speedup for all of IronPython.
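As a rough illustration of the CPython side of that comparison (this is not from the talk), the standard dis module will list the bytecodes the CPython compiler produces for a function. The add function is just an illustrative name, and the exact opcode names differ between CPython versions, so no particular output is shown here.

```python
import dis

def add(a, b):
    # A tiny function whose body compiles to a handful of bytecodes.
    return a + b

# Collect the opcode names CPython generated for add().
opnames = [instr.opname for instr in dis.get_instructions(add)]
print(opnames)
```

On IronPython the same source would instead be turned into IL, which the CLR's JIT then compiles to machine code.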

He demonstrated some of the tools you can use to examine the generated code, unfortunately without identifying them. The generated code seems relatively straightforward, which will surely be helpful to other implementers and more advanced users. Rather than keep a complicated line number table, no-ops are inserted into the code. Jim contrasted his code generator with that of the C implementation, and I was impressed with the simplicity of the code generation code.

Jim wanted to be able to claim that IronPython used a "simpler" object layout than CPython - you might expect that because no reference count is required in a true garbage collection environment. Unfortunately they have had to give each object a 32-bit "synch block" that holds arbitrary data to assist various implementation features, so integers are 12 bytes (presumably for 32-bit machines) in both systems. List representations are remarkably similar, but IronPython does seem to save a little space there.

The type pointer is different for IronPython, in an attempt to maintain performance. The first sensible way to proceed is to wrap every external object with a subclass of some sort of Python PyObject (this approach was taken by Jython). This requires a lot of wrapping and unwrapping during method calls. The other way, used in IronPython, is to use a "pure object model", allowing the underlying objects' methods to be called directly and the returned values to be returned directly.

Subtyping a builtin class permits changing the type of an instance by changing the instance's __class__ slot, but the type as perceived by the CLR cannot change.
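The Python half of that statement can be sketched in a few lines. This is a minimal example of my own, with made-up class names Stack and Queue, showing __class__ reassignment on subtypes of the builtin list; it says nothing about how IronPython handles the CLR side.

```python
class Stack(list):
    def describe(self):
        return "stack of %d" % len(self)

class Queue(list):
    def describe(self):
        return "queue of %d" % len(self)

items = Stack([1, 2, 3])
before = items.describe()

# Rebinding __class__ changes the Python-level type of the instance;
# the underlying list data is untouched.
items.__class__ = Queue
after = items.describe()

print(before, "->", after)
```

The instance is the same object throughout; only the type Python sees has changed, which is exactly the part the CLR's view of the object cannot follow.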

Fields and properties of CLR objects are closely parallel to descriptors. Overloaded methods analyse their arguments to disambiguate their signatures. Constructors are converted to __new__(). And so on (Jim was going fast here). The Python object model turns out to be surprisingly convenient for implementation on CLR, which Jim felt was a tribute to Guido's design skills.
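For readers who haven't met descriptors, here is a minimal pure-Python sketch of the mechanism that CLR fields and properties are being compared to. The Celsius and Thermometer names are invented for illustration and have nothing to do with IronPython's own implementation.

```python
class Celsius(object):
    """A minimal data descriptor: attribute access on the owning class's
    instances is routed through __get__ and __set__."""

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        return instance._celsius

    def __set__(self, instance, value):
        # Convert on the way in, as a property setter might.
        instance._celsius = float(value)

class Thermometer(object):
    temperature = Celsius()

    def __init__(self, value):
        self.temperature = value  # goes through Celsius.__set__

t = Thermometer("21.5")
reading = t.temperature  # goes through Celsius.__get__
print(reading)
```

A CLR property with a getter and setter fits naturally into the same pair of hooks, which is presumably why the mapping is so close.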

Jim demonstrated an interactive session in which he created a window in the presentation framework. He then defined a CallMe class and bound an instance of the class to a button's click event. A panel subclass was defined, and an instance rendered. Jim's bravery in performing live real-time coding is much to be admired, and says a lot about his confidence and knowledge. Overall a very convincing demonstration that IronPython brings true Pythonicity to the CLR environment.
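I didn't capture the demo code, so the following is only a pure-Python sketch of the idea of binding a callable instance to a click event. Event and Button here are hypothetical stand-ins rather than the real presentation framework classes, and the body of CallMe is a guess; only the CallMe name comes from the talk.

```python
class Event(object):
    """Hypothetical stand-in for a CLR event: handlers are added with +=."""

    def __init__(self):
        self.handlers = []

    def __iadd__(self, handler):
        self.handlers.append(handler)
        return self

    def fire(self, sender, args):
        # Call every registered handler in the order it was added.
        for handler in self.handlers:
            handler(sender, args)

class Button(object):
    """Hypothetical stand-in for a framework button."""

    def __init__(self):
        self.Click = Event()

class CallMe(object):
    """An instance is callable, so it can serve as an event handler."""

    def __init__(self):
        self.calls = 0

    def __call__(self, sender, args):
        self.calls += 1

button = Button()
handler = CallMe()
button.Click += handler          # bind the instance to the event
button.Click.fire(button, None)  # simulate a click
print(handler.calls)
```

The point is simply that any Python callable, including an instance with __call__, can be handed over as the handler.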

Jim closed with an interesting discussion about compatibility. Should IronPython expose CLR native object methods as methods of Python objects? Both sides, when Jim talked to them, felt that their conflicting opinions were the only obvious answer to the question! His attempt to please the CLR folks has led to the development of a clr module. Executing import clr changes method call semantics to expose the CLR methods. Jim had a lot of pushback from the CLR world not to use the from __future__ notation for this feature. The fortunate difference in naming conventions between the two environments means that there are very few conflicts.

Exception handling is another area where the two environments conflict. There are two class hierarchies to be resolved. If you try to catch a .NET exception then the corresponding Python exception will be converted.

Wow! Now I have to get lunch ...
