December 30, 2006

The SuperHero Test

I saw this on Allessandro Iob's blog. I too am Iron Man. Maybe I should start using IronPython?

Your results:
You are Iron Man

Iron Man
The Flash
Wonder Woman
Green Lantern
Inventor. Businessman. Genius.

Click here to take the "Which Superhero am I?" quiz...


I see Blogger have released their beta developments into production. I can't help wondering if that's why my previous posting doesn't seem to have made it through RSS feeds and into the places it usually appears. It's not every day something so noteworthy happens.

December 29, 2006


Well, thanks to coralpoetry for pointing out this has become a blog of note, presumably the last of 2006. I suppose this might be more humbling if there were published criteria for what makes a blog noteworthy, or if I had been aware of this feature before it linked to ... for Some Value of Magic. Another ten seconds of my fifteen minutes of fame?

Acknowledgements of any sort are encouraging, though. Like the old saying about bad publicity, for a minor blog there's no such thing as unwelcome attention! So many thanks to whoever at Blogger (Google) put me on that list and to anyone else who happens to read this blog.

Because of the linkage I suspect that a small spike in readership is likely to occur, so let me take this opportunity to wish all fellow-members of the blogosphere (dreadful word) and all Holden Web customers a happy and prosperous new year. 2006 (and especially the latter part of it) has been somewhat traumatic for various reasons, so here's hoping for a smoother 2007.

Finally, while I have your attention, let me point you to Pass IT On, my favourite charity. Martha and Sandy's dedication to their goal of providing computer equipment to the disabled is single-minded, and I wish I could do more to help them. If Santa Claus has brought you a new computer please consider donating your old one. They also need cash support to help them become a free-standing organisation this year, so you can give them money if you don't have a computer to spare. Good luck to all at Pass IT On.

December 18, 2006

ZFS: "Probably more abuse in 20 seconds than you'd see in a lifetime"

I've been reading (thanks to Brett Cannon) about the Zettabyte File System (ZFS) - an open source project from Sun Microsystems that seems to offer real opportunities to take proper advantage of both SAN and virtualization technologies. If you want to know what a zettabyte is then as usual the Wikipedia is reliable on technical matters (and no, I had never run across the term before).

Brett was simply happy that Apple announced the inclusion of ZFS into the Leopard (OS X 10.5) distribution. I was interested because Sun developers were taking an interest in advanced filesystems even back when I used to work there - they produced the Translucent File System (TFS), which was a copy-on-write mechanism to allow users to share a common source pool and keep their own changes.

One of the problems with TFS was its need for kernel integration (in those days even the windowing system was integrated into the kernel). Another issue arose when a user based their TFS on another TFS which was based on a third TFS which ... effectively it implemented a copy-on-write inheritance mechanism, but efficiency could drop quite rapidly with multiple layers.
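To make the layering cost concrete, here is a minimal sketch in Python. It is not TFS: the standard library's collections.ChainMap merely stands in for a stack of translucent layers, and the file names, contents and the layers_searched helper are all invented for illustration.

```python
from collections import ChainMap

# The shared, read-only source pool (hypothetical contents).
base = {"main.c": "v1", "util.c": "v1", "README": "v1"}

# Each user places a private, initially empty layer over what lies beneath.
alice = ChainMap({}, base)
bob = alice.new_child()          # Bob bases his view on Alice's view

# Writes land only in the writer's own top layer (copy-on-write).
alice["main.c"] = "alice's edit"
bob["util.c"] = "bob's edit"


def layers_searched(view, name):
    """Count how many layers are consulted before `name` is found."""
    for depth, layer in enumerate(view.maps, start=1):
        if name in layer:
            return depth
    raise KeyError(name)
```

In this toy model a file that nobody has touched is only found after every layer has been searched (layers_searched(bob, "README") gives 3), which is the kind of cost that grows as views are stacked on views.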

Unlike TFS, ZFS can run in either kernel or user mode, and the nightly testing appears to twist the filesystem in every possible direction simultaneously (hence the title of this blog entry). Here's a list of what goes on in parallel in each nightly test:

  • Read, write, create, and delete files and directories
  • Create and destroy entire filesystems and storage pools
  • Turn compression on and off (while filesystem is active)
  • Change checksum algorithm (while filesystem is active)
  • Add and remove devices (while pool is active)
  • Change I/O caching and scheduling policies (while pool is active)
  • Scribble random garbage on one side of live mirror to test self-healing data
  • Force violent crashes to simulate power loss, then verify pool integrity

So, it would appear that ZFS is a worthy successor to TFS and much more besides, as you will find out if you too read the slide show. I haven't looked at the license yet, but it does appear that under Jonathan Schwartz's direction Sun might be beginning to find its way in the open source world of the 21st century, despite the amusing graphic that Eric Bangeman produced when Scott McNealy resigned. Maybe I should buy some stock again. Maybe we all should.

December 12, 2006

Lies, Damned Lies and Baggage Claim Call Centres

I travel quite a lot, and up until now I have been lucky. Bags have been lost from time to time, but they have always made their way back to me, usually within 24 hours. I understand that bags can get lost occasionally, but the frustrating thing last time this happened was the complete absence of any reliable information.

I flew from Edinburgh to London Heathrow, then onwards to Washington Dulles where I was incorrectly instructed to enter the USA through the transit lounge. Since my flight to New York's La Guardia was on a separate reservation and my bag had only been checked through to Washington this put me in the wrong place to reclaim it, and triggered what I would call a comedy of errors if there were anything the least bit funny about it. My bag and I were both in Dulles, but United were unable to reunite us. I was initially assured that the bag would join me on the flight to La Guardia but, inevitably, I arrived and it didn't.

So I registered a claim and went to my accommodation. It took forty-eight hours and ten 'phone calls before I was finally able to wear the clothes I had brought with me. For most of that time United's web site was telling me that the bag had been found and that it would be sent to New York on a non-existent flight shortly before I landed at Washington!

Talking to other travellers, the general impression is that baggage "help" lines are uniformly bad. My own experience was certainly terrible: I could easily have believed that there was no computer-based information system at all behind the scenes, as the different people I spoke to gave me completely different and inconsistent stories each time I called. The responses almost seemed calculated to inflame. I was particularly incensed by a conversation with one woman who, when I called back as requested after waiting two hours, insisted that I had only called thirty minutes previously, and that I should wait until next morning before calling again. It's not usually considered good customer service to call the customer a liar. The different stories I was told about my luggage might as well have been chosen at random.

Under these circumstances the airlines would do better to dispense with the call centres completely and replace them with a recording saying "Our staff are working to restore your lost bag. Thank you for your patience". This would be less irritating than the nonsense I had to put up with, would have taken up far less of my time, and would hopefully reduce costs and allow the airlines to invest a bit more in relocating and delivering lost bags.

And now, if you'll excuse me, I have to send a claim for reimbursement to United.

December 1, 2006

Blogger on a Nokia 770

Well, I find myself with only a Nokia 770 Internet tablet to access the 'Net, so I thought I'd see how easy it was to create a blog entry. The typing is less than terrific through the hunt-and-peck on-screen keyboard, but so far I haven't been able to get my snazzy laser-projection VKB virtual keyboard to work reliably with the Nokia, so I have to be satisfied with the standard mechanism.

One rather strange feature of the interface seen through the Nokia is the invisibility of the title field. I see in the postings list that I've somehow managed to call it JU, but I really can't see how to change it. Maybe I'll change it later.

I seem to remember that the AJAX functionality used to work, but that's not the case since I upgraded the operating system. Maybe there's some setting that I need to update, but if so it eludes me. Anyway, this has taken quite long enough so I'll save the post as a draft and put the title right before I post it.

Browser Statistics (Hello, Robots)

During the current statistics trawl I started to wonder which browser technologies dominated among the visitors to the Holden Web site. November was an interesting month because both Microsoft and the Mozilla team released new products.

Here is typical information for the top ten client programs. Unlike previous statistics, Firefox users account for almost a half of the traffic to the site - usually Internet Explorer leads the pack with about 30%, but in November it was relegated to third position.

Internet Explorer 7.0 currently accounts for 2% of traffic, the first time it's appeared in the stats. The thing I found most surprising was the fact that more Firefox users were using the recently-released 2.0 version than anything else. 2% of users still use Firefox 1.0.

Rather less delightful is the fact that 20% of site traffic is spidering (though how come Yahoo has to crawl so much more aggressively than anybody else is beyond me). Perhaps the real message is that I need to put some more compelling content on the site!

A Tip o' the Hat to Ron Stephens

Looking at my web site statistics the other day I discovered, as you might expect, that the majority of referrals were coming from Google. However the notable second source is Ron Stephens' Awaretek site. Ron has been quietly generating Python reference materials and producing a Python podcast for years now, with very little in the way of public acknowledgement. So thanks, Ron, keep up the good work!

Database API Tutorial at PyCon TX 2007

The list of PyCon tutorial offerings has now been finalised, and I'm happy to say that my Database API proposal has been accepted. Once again there is a great line-up of tutorials. It must have been hard for the program committee to choose this year's line-up since most of the offerings were strong and from well-respected speakers.

This doesn't guarantee that any tutorial will run: there have to be enough sign-ups for that to happen. This year there is also a tutorial on SQLAlchemy, and it looks as though the schedule will allow people wanting to understand how to use databases in Python to go through the DB API in the morning and then learn SQLAlchemy in the afternoon. I'll be conferring with Jonathan Ellis to make sure that I cover any necessary background for those who want to attend both.

I'm hoping that my tutorial will be well-received — last year's filled the room, and helped me fund my attendance at PyCon. It's always good to have guidance from students in advance telling me what their "hot-button" items are. If you're thinking of attending you can use this page to provide guidance on the content you'd like to see.

November 24, 2006

A False Sense of Security?

Although I am primarily a technical person I have broader interests than just that, including business and marketing. Sometimes these interests intersect in areas that affect a broader cross-section of the population. This is not a technical post.

One of my sources of occasional reading is Seth Godin, author of several books including All Marketers are Liars. He also runs a personal blog, and in the wake of the Thanksgiving rush he wrote this TSA-inspired post about the stupidity of the air transport security rules and the insensitive way in which they are administered. This really rang bells with me.

Godin courageously risks being thought a whiner by ridiculing the rules that allow a full three-ounce container of gel through the screening but disallow a five-ounce container with only one ounce in it. I've spent a lot of time in US airport security lines, and people aren't generally welcoming of critical remarks. Unfortunately many of the "security" activities that we have to go through are what Bruce Schneier calls security theatre.

At the same time as the TSA are forcing even the airline staff to submit to security screenings there are airports where construction staff are allowed airside simply by swiping an identification card. So the terrorists will be targeting the construction workers, not trying to smuggle six ounces of gel through the passenger channels.

Now I know I'm not the only person to suggest that the US government's initial response to the events of 9/11 was off-target. Benjamin Franklin was spot on in saying "Those who sacrifice freedom for security deserve neither". The majority of American citizens appear to believe that the government has a duty to protect them from all risk from the cradle to the grave.

Said citizens, of course, are oblivious to the fact that it is impossible to eliminate risk completely: the aim of security is to raise the cost of a breach to a level that makes the breach uneconomic. But enough is enough. Dulles International has got to the stage where "going through security" can add over an hour to the passenger processing time, and I'm sure others are as bad if not worse. If things go on the way they are, people will stop flying because of the security measures.

What percentage of the American population use the air transport system? How much is being spent every day on airport security theatre, and how many lives is it saving? How much should we be prepared to spend to save a single life? How quickly could we have fresh drinking water available for all of humanity if we spent the money towards that goal instead? How many lives would that save? Which expenditure would make the USA more secure? Isn't it perhaps time for a reassessment of priorities?

November 20, 2006

Movie Moguls Need Control

In yet another example of movie mogul insanity the MPAA, in the guise of Paramount Pictures, is suing a small business that is ripping DVDs to iPods for customers who buy both at the same time. This highlights exactly why the Digital Millennium Copywrong Act was a bad idea, and it shows American politics up exactly at its worst: as a way for wealthy influence groups to get their own way at the expense of the buying public who have made them rich.

Don't even get me started on software patents.

Corporate governance always makes such a huge fuss about how corporations are really responsible to their stockholders. Who are the stockholders of the MPAA members, and why aren't they telling their managements to start respecting the customers who earn them their living?

November 19, 2006

Thunderbird QA Not Slipping!

Well, for my sins I have become a Thunderbird 3.0a1 user, and I can't say I'm happy. Several times the program has threatened to update itself, and eventually the promised (threatened?) upgrade arrived. [NOTE: apparently the 3-series builds are intended only for developers].

The error that originally caused me to update to the latest version is even worse in 3.0a1. Message contents are now being mangled on a regular basis. I have no idea whether they can be recovered correctly or not, and I am beginning to lose confidence in what has until now been a reasonably reliable piece of software. It's ironic that I am suffering this problem even worse than I was because I wanted to help the project by making an accurate bug report.

If many other users are suffering the same problems I am then this issue needs to be addressed before it starts to give open source a bad name.

November 8, 2006

Thunderbird: Can't Help for Hurting

I've been an enthusiastic Thunderbird user for a fairly long time, and have only noticed odd quirks from time to time. One such quirk is the fact that occasionally a newsgroup article will show completely the wrong content: like the content of some other article altogether. This happened again today, and my first thought was to report the bug.

Now I've reported bugs in the past, and been told that they were duplicates of existing reports. So this time I decided I'd do the right thing and follow the reporting protocol. After all the guys who put Thunderbird together deserve some consideration, and I'm sure many users are cluelessly inconsiderate of the fact they are one amongst millions and that they haven't paid a cent for some pretty high-quality software. I can't tell whether there is any description of such a bug in Bugzilla -- it might be bug 239665 or it might not.

It turns out that one of the things you are supposed to do is install the latest nightly download in a blank folder with a new profile and see if the same error occurs. This is not a prospect I relish as it will involve a certain amount of setup, but I do the download (nothing much else to do in my hotel room) and start the install. Unfortunately it turns out there's an error in the install, and I see a dialog box showing
Microsoft Visual C++ Runtime Library error R6034: An application has made an attempt to load the C runtime library incorrectly. Please contact the application's support team for more information.
Lucky me: I thought I had one problem, but now I have two. I feel duty bound to report this new error under the "Installer" category. Fortunately the installer completes, and I can run the new Thunderbird to confirm the same error occurs (though I stupidly omit to bring my RSS feeds forward into my new version). So now I can go ahead and add to the bug report for the original problem, since unfortunately it hasn't gone away. My motivation is severely diminished, however, so this may not actually happen. At least I reported the installer problem.

One step forward, at least only one step back.

November 3, 2006

Stackless Python Continues to Amaze

I recently started tracking the Stackless Python mailing list. I've been interested in Stackless, the brainchild of the fearsomely clever Christian Tismer, for some time now. I recently acquired a client with active interests in that area, so it behooves me to stay up to date.

Stackless allows you to organise your work into tasklets (which as far as I understand it have replaced the original microthreads as the unit for scheduling: this is all pretty new to me, so apologies if it's incorrect). An interesting thing about tasklets is that you can pickle them, pass the pickle to another computer, unpickle them and resume the tasklet in the new environment.
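I can't show the real thing without a Stackless interpreter to hand, so what follows is only an analogy in plain Python, not the Stackless API. Stackless pickles the running tasklet itself; in this sketch the task's progress is kept in an explicit dictionary instead, and the step function and its field names are invented for illustration. The part that carries over is that a pickle is just a stream of bytes that another interpreter can load.

```python
import pickle


def step(state):
    """Advance the task by one unit; return False once it has finished."""
    if state["current"] >= state["limit"]:
        return False
    state["current"] += 1
    return True


# On the "first machine": start the task and do part of the work.
state = {"limit": 5, "current": 0}
step(state)
step(state)

# Freeze the task. These bytes could be written to a socket or a file.
payload = pickle.dumps(state)

# On the "second machine": thaw the task and carry on where it stopped.
resumed = pickle.loads(payload)
while step(resumed):
    pass
```

The original is left two steps in, while the resumed copy runs to completion on its own.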

My attention was grabbed by one recent thread in which a user explains that he was trying to perform such interactions across a heterogeneous environment with PowerPC, SPARC and Intel-based CPUs and found that it worked between SPARC and PPC, but the Intel architecture appeared to be failing -- presumably due to the different endianness.

After a couple of messages surmising that this was hardly surprising, Christian came into the thread to explain that there was no architecture-dependent code in the pickled thread resumption functionality. Lo and behold, the original poster came back explaining:
I must apologize, it turns out my problem was due to the fact that I was using a different version of stackless on the intel machine. In case anyone is curious, resuming pickled tasklets across architectures is easy. Thank you for your replies!
Let me repeat that, just in case you missed it: resuming pickled tasklets across architectures is easy. As far as I can see this gives Python an amazing capability to produce applications with highly-distributed architectures. It's going to be interesting to see where Stackless goes with this, but it should be shouted from the rooftops. This is an advanced feature that represents a real advantage for (Stackless) Python in a world where everyone is wondering how they can accommodate the new multicore processors.

With Stackless, your CPUs don't even need to have the same architecture, but the feature will work just as well in the homogeneous environment that multi-processor computers provide.

October 24, 2006

What, Me, Unique?

Hardly, as the box below shows. But ..
There are:
people with my name
in the U.S.A.

How many have your name?

... just the same none of those others have
  • Fathered my son (hi, Simon!)
  • Chaired three PyCons
  • Written a book about Python
  • Taught almost 3,500 students about networking and security topics
  • Helped comp.lang.python readers to use Python
So there are lots of things that distinguish me from the doubtless equally worthy 47 others. It's always interesting to see the uses to which the web will put information. Twenty years ago this would only have been imaginable as an online application. Now it just merits a raised eyebrow and a passing "good idea". Who knows what tomorrow's kids will take for granted assuming the preceding generations don't destroy the world.

Oh well, that's enough cheerful for today.

September 30, 2006

Thunderbird Politically Incorrect?

Someone just posted a rude but amusing remark about a regular comp.lang.python contributor, so I replied off-list with a snigger. Imagine my surprise when Thunderbird's spilling chucker failed to recognise the word and suggested a change to "nigger"!

Nice to be back on the blog. I've been working my ass off lately.

July 3, 2006

Moin Moin - Introduction for Developers (Thomas Waldmann, Alexander Schremmer)

MoinMoin was initially a one-man project, but the original author is less frequently available now and there is a core team that is collectively responsible for the code. The project has a large user base, which grew significantly when Unicode processing was introduced and enabled internationalisation.

For the rest of this presentation you should see the published slides, which unfortunately not only accurately represent the content of the talk but also encapsulate its presentation. I suspect (though I do not know them) that the speakers were perhaps nervous, and not therefore confident enough to talk ad lib in a foreign language to what must sometimes seem like an intimidating audience.

I found this a little frustrating since the few departures from reading the slides that the presenters allowed themselves made it clear that they both had an excellent grasp of a very popular Python web application. With more time at the conference I know I would have liked to talk to them both about possible applications and developments.

Literate Testing with Doctest (Marius Gedminas)

Marius started out by outlining the reasonable uses of doctest, both inline in programs and in files that are separate from the source code, and advised us not, for example, to put complicated setup or tests for obscure corner cases in the README. Doctest can easily be integrated with unittest by defining a test suite that runs the doctests!

He then pointed out DocTestSuite and DocFileSuite for externalising the tests, and considered how to handle complicated setup. One way is to define setup and teardown functions, and you can also put code in test.globs, but these methods should be limited for ease of comprehension.
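As a reminder to myself of how the pieces fit together, here is a minimal sketch of an externalised doctest run through unittest. It is my own illustration, not code from the talk: the narrative text, the sample fixture and the set_up and tear_down names are invented, and the file is written to a temporary directory only to keep the example self-contained.

```python
import doctest
import os
import tempfile
import unittest

# The kind of narrative file that DocFileSuite reads (contents invented).
NARRATIVE = """\
Words are counted by splitting on whitespace.

    >>> len(sample.split())
    3

The fixture comes from the setUp hook rather than from the narrative.
"""


def set_up(test):
    # doctest passes in the DocTest object; test.globs is its namespace.
    test.globs["sample"] = "literate testing works"


def tear_down(test):
    # Nothing much to release here; shown only for symmetry.
    test.globs.pop("sample", None)


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "counting.txt")
    with open(path, "w") as handle:
        handle.write(NARRATIVE)

    # module_relative=False lets us pass an ordinary filesystem path.
    suite = doctest.DocFileSuite(
        path, module_relative=False, setUp=set_up, tearDown=tear_down
    )
    result = unittest.TestResult()
    suite.run(result)
```

doctest.DocTestSuite takes the same setUp and tearDown arguments but collects its examples from a module's docstrings instead of a separate file. The result is an ordinary unittest suite, so it can be added to any existing test run.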

The advantage of doctest is in encouraging simplicity and interspersed narrative to explain the tests. Another option is to define a test module. The advice is to avoid mixing long tests and regular code in the same file.

In real life, of course, some features of doctest are difficult to use when data is not reliably repeatable (such as the addresses of objects in repr()-style representations). Also problematic are ellipsis-elided outputs, which are to be ignored, and ordering issues. Web testing is even more difficult, as responses are frequently state-dependent. Marius showed some examples of testbrowser, a simple system allowing web testing.
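Two of those difficulties can be sketched briefly. This is my own example rather than one from the slides, and the Widget class is hypothetical: the ELLIPSIS option lets "..." stand in for the unrepeatable address, and sorting a set before printing it sidesteps the ordering problem.

```python
import doctest


class Widget:
    """A hypothetical class whose default repr() includes a memory address."""


TEXT = """
>>> Widget()            # doctest: +ELLIPSIS
<...Widget object at 0x...>
>>> sorted({"b", "a"})  # sort first: set ordering is not guaranteed
['a', 'b']
"""

# Parse the examples and run them with Widget available in their namespace.
parser = doctest.DocTestParser()
test = parser.get_doctest(TEXT, {"Widget": Widget}, "widget_repr", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

Without the ELLIPSIS directive the first example could only pass by luck, since the address changes from run to run.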

Doctest disadvantages include the inability to step through tests. Documentation (docstrings, README files) should be separated from unit and functional tests.

Questioning revealed that Zope uses a slightly different version of doctest from the Python 2.4 distribution, although "in theory" they offer the same functionality. Jim Fulton asked whether the speaker had looked at Fit (and apparently PyFit is available) but although Jim would like to see Fit and doctest integrated there was no experience with the packages in the room.

The only slightly problematic aspect of this presentation was a couple of times when the speaker made inaudible responses to inaudible questions from the chair. This would have been less troublesome if I had not been at the back of the room, or if the PA system had allowed audibility. Otherwise I got a lot of information in a very short time, well presented.

Working Together on the Web (Kevin Dangoor)

My whole EuroPython experience was a little disjointed, starting with arrival. I and one other delegate ended up on a bus that went, not to CERN but to a local railway station three or four kilometers away. The result was a late arrival, so by the time I took my place in this session Kevin was just starting to present his first example, how to adapt a simple application to the WSGI interface.

Using simple_app was the easiest, but Kevin also showed us how to do it by defining our own class. He then considered the problem of maintaining session state (though Kevin says he prefers not to use state).
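For readers who haven't met WSGI, here is a minimal sketch of both styles. I am not reproducing Kevin's slides: the simple_app name echoes the example in PEP 333, while SimpleApp, call_app and the greeting text are my own, and the code uses today's Python 3 conventions (byte strings for the body) rather than what was current in 2006.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """The function style: a callable taking environ and start_response."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, EuroPython!\n"]


class SimpleApp:
    """The class style: an instance is the application via __call__."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [self.greeting.encode("utf-8")]


def call_app(app):
    """Hypothetical helper: invoke an app once without a real web server."""
    environ = {}
    setup_testing_defaults(environ)   # fill in a plausible request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(simple_app)
class_status, class_body = call_app(SimpleApp("Hi from a class"))
```

Either object could equally be handed to a real server, for example wsgiref.simple_server.make_server from the standard library.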

WSGI is also helpful in such situations: its concept of middleware (which talks to the web server on one side and a web application on the other) can help. The example here was of a "Latinator" that translated into pig Latin. Kevin showed us how to define a class with a method that can be passed to the WSGI framework as the start_response argument. WSGI is good because there have been many contributions to the PyPI index.
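A middleware of that kind might look roughly like the sketch below. This is my reconstruction of the idea, not Kevin's code: the pig Latin rules are deliberately simplified, and LatinResponse, hello_app and record are invented names. Its bound start_response method is what gets passed down to the wrapped application.

```python
def pig_latin(word):
    """Move leading consonants to the end and add 'ay' (simplified rules)."""
    for index, letter in enumerate(word):
        if letter.lower() in "aeiou":
            return word[index:] + word[:index] + "ay"
    return word + "ay"


class LatinResponse:
    """Per-request helper whose bound method stands in for start_response."""

    def __init__(self, real_start_response):
        self.real_start_response = real_start_response

    def start_response(self, status, headers, exc_info=None):
        # Translation changes the body size, so drop any Content-Length.
        headers = [(k, v) for k, v in headers if k.lower() != "content-length"]
        return self.real_start_response(status, headers, exc_info)


class Latinator:
    """Middleware: a server to the wrapped app, an app to the real server."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        helper = LatinResponse(start_response)
        chunks = self.app(environ, helper.start_response)
        # Buffering the whole body keeps the sketch short.
        text = b"".join(chunks).decode("utf-8")
        translated = " ".join(pig_latin(word) for word in text.split())
        return [translated.encode("utf-8")]


def hello_app(environ, start_response):
    """A trivial application to wrap (hypothetical)."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", "11")])
    return [b"hello world"]


# Drive the stack once, recording what the "server" would have seen.
seen = {}


def record(status, headers, exc_info=None):
    seen["status"], seen["headers"] = status, headers


body = b"".join(Latinator(hello_app)({}, record))
```

The wrapped application never knows the difference: it calls what it believes is the server's start_response, and the server sees what it believes is an ordinary application.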

Kevin next introduced us to the Paste system designed by Ian Bicking. This contains a confusing number of packages and modules, which makes it hard to describe and hard to climb the learning curve and start using it. He suggested this demonstrated that even in the Open Source world we are faced with the "build vs. buy" choice, even though "buy" is really an investment in learning someone else's package and build, as usual, is learning and understanding the base technologies you want to implement, and writing your own implementation.

In TurboGears Kevin made the choice to use as much existing technology as was practicable.

The fictitious web package author might decide to package up other dependencies, but this can cause problems for people with other versions of the components she chooses to bundle. This is the classic version dependency. The setuptools system has defined a cross-platform format for installation of Python packages, the .egg format. This can also be used to install packages with binary components, and is proving to be very usable.

The .egg format allows other systems to read the metadata about the package, which is useful in many environments where plugins are used, even in non-Python environments such as Eclipse. The discussion about how entry points were defined to plugins didn't seem terribly easy to understand to me - maybe I was just distracted by the extraneous materials, but I didn't really see the point.

Kevin closed by talking about Beaker, starting by showing us that it depended on the MyghtyUtils package and defined two paste entry points. The examples at this stage started to leave me wishing I had attended the AJAX session instead, but this could just be me: others might have found the material relevant to their interests.

This presentation relied for contrast on rather too much use of "humorous" slides: a picture of an egg when talking about eggs, a picture of an RJ45 plug in a socket when talking about plugins. I have no objection to such devices, but they should either enhance understanding or be shown for a very short time to give everyone a quick smile before moving on.

Kevin knows his stuff and is always an engaging speaker. A tour de force, though, should really be allocated an hour, and I suspect this session should have covered more ground in less detail or vice versa.

June 19, 2006

Front Page News

Did you ever hear the one about the man who was too busy to save time? I think we all at some stages in our lives lose sight of what we are struggling to do, when perhaps a few quiet moments of contemplation would allow us to move forward.

For a while now I've wanted to finish the front page news feature so I could get on to newer AJAX-based work. The solution, of course, was staring me in the face when I looked for it: instead of holding up AJAX to finish off a pre-AJAX project I could actually use AJAX to speed up its completion. This strategy has worked well.

The original news feed was based on a programmed search of O'Reilly's Meerkat stream aggregator, but Meerkat went away and anyway the mechanism was cumbersome and error prone. I was looking for something lightweight that the mythical "average web user" would be able to use in a fairly foolproof way. I am happy to say that the first prototype is working exceptionally well, and the front page can now be as up to date as I am!

At the moment the site is still being generated the old-fashioned way, but I predict the retrofit to this parallel site will be easy.

June 15, 2006

Sitting with Nellie

When I started work "sitting with Nellie" (which meant watching an experienced member of staff do a job, and eventually graduating through helping to doing it oneself) was a time-honoured way of learning to perform a task. Back in mediaeval times it could take five years or more to learn a trade by apprenticeship, but you were a fully trained master when you were finished. Apprenticeships fell out of fashion when companies wanted their staff to be more versatile (and hence more interchangeable), and their death was hastened by the rise of "the training industry" in the 1960s (with an interest in the training revenues that formal training and its inevitable out-sourcing brought with it).

Apprenticeships may no longer be appropriate, but it's also undeniable that sitting with Nellie had certain advantages over formal training. From a personal standpoint the major advantage was the ability to soak up what I might grandly call the prevailing culture and ethos - to learn how the place worked by talking to people who had been around much longer. I well remember having some of my more naive assumptions and suppositions questioned hard by cynics whose cynicism was born of experience (and sometimes occasioning much good-humoured amusement).

Interestingly, Jon Udell has recently observed that the open source world could use similar techniques to educate people about what was involved in developing software, and specifically the techniques and tools used by the open source world. This approach to an extent generalises (because the open source world is very diverse and so the tools of one project may not be appropriate for another), but the idea has a lot of merit. Ted Leung has further observed that children really don't have any effective way of finding out what they'd like to do, and that a "sitting with Nellie" approach could be helpful in this respect too. This rings true with my own experience. Google's Summer of Code project allows open source projects to offer some experience to students for a limited period, but there's a lot more we could do.

Plus ça change, plus c'est la même chose, as they say in France. We need to realise that change isn't always a good thing and, even if a change once were good, it needn't necessarily be permanent. Perhaps when apprenticeships were discarded an important baby was thrown out with the bathwater of restrictive work practices. Let's see if we can't arrange to have would-be programmers spending some time sitting with Nellie.

May 27, 2006

Need for Speed Wrap-Up

Wow! This week has flown by so quickly there's been no chance to blog about what's been going on in Reykjavik. For those of you who aren't aware of the sprint I have already outlined the basics in Do We Need Speed? but there's more to say. Much more.

Thanks for the Memories
This has been a very busy week, and it's not yet been possible to thank everyone for the roles they have played in putting it together. First of all I must thank all the participants for their efforts. Everyone has worked amazingly hard this week. The declared hours of the sprint (along the lines of extreme programming) were 9:00 am to 6:00 pm, but it wasn't unusual to see people busily working away after 11:00 pm.

I have so enjoyed working with this awesome group, and the teamwork has been great. See the participants list to find out who they are, and see pictures on flickr and on jafo's journal (permanent links on the summary page). Favourite picture of the week has to be from the Blue Lagoon (that's me, with my hand out of the water at the back).

Other web references include these blog entries from Richard Jones and occasional remarks from the effbot, as well as Sean Reifschneider's blog. Jack Diederich has also mentioned the sprint in both his general and his geek blogs.

It's been a pleasure, and I am sure that as the result of working together this week we will all be more open to future collaborations. I sincerely hope that we can find other sponsors to support this work as generously as EWT have.

Thanks for the Support
The Python community as a whole has been awesomely behind us in this effort. Particular mention should go to Neal Norwitz, Andrew Kuchling, Marc-Andre Lemburg and Brett Cannon. They have all helped to flesh out ideas, picked up code nits before they became troublesome and cleared up typos in both code and documentation. Not to mention holding us up by pointing out unfortunate misconceptions which would have led to errors if not squashed. Many others have also pitched in, showing the true Python community spirit by reviewing patches and offering advice, and the sprint's success is theirs as well.

We should not forget the magnificent hospitality of CCP Games, a recently-elected sponsor member of the Python Software Foundation. Not only have they entertained us magnificently all week, they have also had three of their staff at the sprint and have provided local knowledge that has made everyone's week more enjoyable. It's never easy being thousands of miles away from home, and this all helped enormously. Special thanks to Kristján V. Jónsson for giving up so much of his time in a week that included a public holiday.

Thanks for the Sponsorship
We should thank EWT, LLC for the most practical support of all. They have funded the air fares and hotel expenses for fourteen sprint members as well as providing the trip to the Blue Lagoon, coffee throughout the week (sprints run on caffeine) and a close-out dinner at the hotel. They have also sent three of their staff from their Beverly Hills head offices to take part in the sprint, even though that meant they missed their Memorial Day weekend. The fourth-floor sprint room is an extremely refreshing contrast with most places I have sprinted before: access to daylight is quite a novelty (though 24-hour daylight took a little getting used to).

EWT's CEO, David Salomon, addressed the sprint in a teleconference on Wednesday, and explained something about his company's ethos and general approach. EWT's support of the sprint, and more generally of Python, is motivated by business considerations, but David's talk made it evident that he sees EWT's role in a larger context. He also announced EWT's intention to make a donation to the Python Software Foundation and to institute a scholar-in-residence program targeted at supporting individuals making open source contributions.

We also each received a Nokia 770 as a gift from EWT. This amazing little device runs Debian Linux, and Richard Jones had his device running pygame by Wednesday morning.

This is also a good place to record my personal thanks to David for hiring Holden Web to undertake the organization of this event. When I made my first hesitant posting to the conferences-discuss list and ended up as chairman of PyCon DC 2003 I little realised where it would lead.

So, What Have We Achieved?
The formal record of our successes is on this wiki page (and a link is best simply because even as I write this entry there are hackers all around me striving to get even more speed into Python, so we are by no means finished yet). Sean Reifschneider reviewed the outstanding patches and came up with a list of potential speedups, which we have been reviewing and chipping away at as a part of our fairly comprehensive task list. The next release of Python isn't due out until August, so there is plenty of chance to build on the work that's been done.

Although not directly speed-related, we have also put quite a lot of work in on the pybench module, as a result of which its accuracy and repeatability should be improved. Not only that but this work may also lead to the introduction of another benchmark with even better repeatability. Lots of tests have been written this week ...

Until all the new tests are incorporated into formal benchmarks it might not be easy to detect the speedup; it's evident to everyone here, though, that many of the speedups are in areas that the average user will perceive. The improvements to string and unicode handling will probably be welcome to most users. I hope readers will feel free to post comments to let the other sprinters know how much their efforts are appreciated.

And Next?
It is the earnest hope of all the sprinters that this event will demonstrate to the software and related industries that they can engage sections of the open source community in a way that assists both sides. Sprinters have been quite willing to address specific performance issues raised by the sponsors, although there was no compulsion on them to do so, and there has been a general appreciation that both sides will benefit from the sprint.

We live in strange times, and the computing industry is having to come to terms with an infrastructure that is increasingly developed and maintained by people not under their direct control. This is scary to the average commercial manager: David Salomon deserves credit for his perception that those who engage the open source community most whole-heartedly will benefit most.

I fly out of Keflavik at 7:20 tomorrow morning. By coincidence I shall be on the same aeroplane as four American friends who change flights in Iceland. So, I have a week's holiday to look forward to, including visits to Loch Ness and a Guinness brewery. Rest and relaxation will be the order of the day. After that it's back to work, and being a director of the Python Software Foundation.

This has been one terrific week, though tiring, and has built a cohesion among the team which will remain even after the sprint. I hope it leads other organizations to consider supporting the open source community instead of just being passive consumers of the output without putting much back. With luck, much may hinge on this little project.

May 21, 2006

Let the Games Begin

I was looking for an erudite Latin tag, but couldn't find one. The Need for Speed sprint starts tonight with a welcome get-together at CCP, the Icelandic company responsible for the Eve Online game.

I've been monitoring the flight arrivals from my hotel room in Reykjavik, and as I write the flights of the first five arrivals have all been reported as having landed on time. So if all has gone well, and thanks to the assistance of our Icelandic hosts, they should find a driver waiting for them when they clear customs and immigration, getting them to the hotel in around a half an hour.

This is exciting!

April 28, 2006

Tutorials from PyCon

When I finally made time to visit comp.lang.python for the first time in two or three weeks I found a post that said

I would like to know if anybody can point me to the site, where it is possible to find the tutorial "Using Databases in Python" which is mentioned by Steve Holden here:
So of course I had to make sure that the material was available on the web. As a result you can now download my tutorials in either PDF or Open Office Impress format (yes, for once I eschewed using the obvious Microsoft products, and found that the Open Office component was a more-than-acceptable clone).

Using Databases in Python: Impress PDF

An Introduction to wxPython: Impress PDF

These materials are available under a Creative Commons license. Thank Guido van Rossum for inspiring me with the liberal terms of the Python license.

April 24, 2006

Squidoo: Just Another Web Phenomenon?

Seth Godin, a marketing whizz if ever there was one, recently started promoting a new web service called Squidoo that's aimed at making web authorship easier. The idea is to build "lenses", which are promoted as offering ways to look at the world. I just noticed there are graphics you can use to link to your lens.
So take a look at my lens, Python is Amazing. And if you find something amazing I haven't included please feel free to let me know!

April 20, 2006

Shameless Self-Promotion

An Amazon review of Python Web Programming reminds me that the book wasn't written just to explain Python ... thanks, Sheila!

20 of 21 people found the following review helpful:

Excellent example snippets; Clear explanations, February 24, 2002
Reviewer: Sheila King "desk-worker mom who needs to exercise" (L.A. County, California)
If you are going to be using Internet protocols, doing network programming, or web programming with Python, and these are new topics for you, I would highly recommend this book.

The book starts with a brief overview of the Python language. The author's intention is that someone with a fairly extensive programming background in other languages would be able to pick up enough Python from this overview to be able to do the rest of the programming in the book. Perhaps so. I already know Python, but did find the summary in the front informative.

I really like the fact that nearly every page has a code snippet on it. Examples are brief and to the point. The author explains each line of code and has a very direct and clear way of explaining things. I found the explanations easy to read and understand.

After the brief Python Language overview, comes an overview of sockets and socket programming. I've been trying to learn a bit about the whole topic of sockets by searching the web and nothing I found on the web explained it as clearly as this book. I now appreciate the difference between TCP and UDP protocols and have an idea of the situations in which I would want to use each. If you want to learn low-level sockets, or how to write your own socket protocols, this is not the book you are looking for. This book basically assumes you will go with either TCP or UDP (and ignores the other types of sockets available in the Python socket library). However, these will probably suit most people's needs.

The author then walks you through each of the Internet data-handling libraries in Python, such as the telnetlib, ftplib, poplib, smtplib and so on. He gives examples of working code for each library, showing first how to implement clients, and later on how to implement servers. If you want to work with these libraries, these explanations should be very helpful.

Later in the book, Holden addresses using databases in Internet programming, using XML and writing your own web-application framework. I haven't yet had a chance to go through these chapters in detail (I've skimmed them only). But there is a LOT of stuff there. One thing the author does at the beginning of each new section, is give an overview of the topic (such as an overview of why you might want to use a database, how databases work, or why you might want to work with a web framework). For me, I really appreciate this type of overview. It helps give me a context for the new information, and helps me to make better sense of it. I read through some of the database chapters where he explains how the SQL query language works, and again, I have to say it is one of the best explanations I've read. (Most explanations I've read about SQL have just convinced me I wanted to steer clear of it.)

Another nice thing, is how he sort of "works you up to" SQL. He starts out with regular Python code, and shows how parts of it are similar to working with an SQL database, and then eventually transitions into the full SQL language. He also addresses database design and efficiency.

Overall, I'd say if you want a good overview of the topics mentioned here, want to understand the reasoning behind their use, and want to be able to understand good design and efficiency, then this book should really help you out.

April 16, 2006

End of Internet Prematurely Flagged

In Tech Blogosphere has peaked [the blogosphere deserves a capital?] Phil Sim suggests (I paraphrase) that
"anybody who's going to have a blog has one by now, after two years as a journalist you get stale, lots of bloggers are going back to real life"
and graphs the "daily reach" of, thereby confirming that for him it's all about the eyeballs. When I look back at my own blogging history I see that there are frequently months when I have written nothing at all in my blog, and hey, here I am still blogging.

If blogging is just "look at me, Ma!" then the sooner it stops the better. Blogging works best not when it's a publicity channel but when it's a reasonably consistent, selective window onto the world of the blogger. Like journalism, much blog content is ephemeral, and the blogging world needs to remember that. If it's not on page one of my then it's history. If you want the history it's sometimes there, but remember it could be a revisionist history as blog posts can be changed at any time.

April 15, 2006

Test-Driven Development

Some code, for a change. I recently taught an introductory Python class to some fairly experienced programmers, and we had an hour or so left at the end of the class to try a problem. We'd been discussing test-driven development, so we arrived at the idea of creating a problem that was fairly simple in scope and then writing tests and a solution to the problem.

The idea was to write the tests first, though it will be no surprise to those who've done this before that the nature of the tests changed as errors came to light during development. The problem was as follows:
Given a directory structure of arbitrary shape, locate all JPEG images and copy them into a named destination directory. [Not specified but implied: the files should continue to exist in their original positions]
It turned out that the level was fairly well chosen. None of the students managed to complete the task, but they all had a fairly clear sense of where they were going by the end of the exercise, giving them something to work on independently after I'd gone. Of course I had to provide them with a "model solution", which I'm happy to say I just managed to create in the time allotted.

Here is the test harness for the jpegcopy requirement:
import jpegcopy
import unittest
import os

# BASEDIR = '/c/Steve/Projects/BrightonHove'  # alternative form of the same path
BASEDIR = 'c:/Steve/Projects/BrightonHove'
INDIR = os.path.join(BASEDIR, "input")
OUTDIR1 = os.path.join(BASEDIR, "output1")
OUTDIR2 = os.path.join(BASEDIR, "output2")
EXPECTED = ['%s.jpg' % s for s in "f1 f2 f3 f4 f5 f6".split()]

class TestJpegCopy(unittest.TestCase):

    def setUp(self):
        """Ensure both output directories are empty."""
        for d in OUTDIR1, OUTDIR2:
            fl = os.listdir(d)
            if fl:
                for f in fl:
                    os.unlink(os.path.join(d, f))
            # Complain only if something could not be removed.
            if os.listdir(d):
                raise ValueError("Cannot empty directory %s" % d)

    def testEmpty(self):
        # Copying from an empty directory should find nothing.
        n0 = jpegcopy.main(OUTDIR1, OUTDIR1)
        self.assertEqual(n0, 0)

    def testDir1(self):
        n1 = jpegcopy.main(INDIR, OUTDIR1)
        self.assertEqual(n1, 6)
        # os.listdir() order is not guaranteed, so sort before comparing.
        self.assertEqual(sorted(os.listdir(OUTDIR1)), EXPECTED)

    def testDir2(self):
        n2 = jpegcopy.main(INDIR, OUTDIR2)
        self.assertEqual(n2, 6)
        self.assertEqual(sorted(os.listdir(OUTDIR2)), EXPECTED)

    def tearDown(self):
        for d in OUTDIR1, OUTDIR2:
            for f in os.listdir(d):
                os.unlink(os.path.join(d, f))

if __name__ == "__main__":
    unittest.main()
Nothing too fancy here. The tests are parameterised. We set them up by clearing both the output directories. Then we test that they are indeed empty. Then we test that we can put the JPEGs into two different directories, verifying each time that we see six files copied. Finally we check that both output directories contain the same thing. We tear down the test by deleting the contents of both directories.

This will probably show my ignorance, highlighting the fact that test-driven methods don't yet come naturally to me. I'll be happy to integrate suggestions for improving test coverage. My solution follows.
"""Copy jpegs from a recursive to a flat directory structure."""

import os
import shutil

def main(indir, outdir, debug=0):
    count = 0
    for f in os.listdir(indir):
        if os.path.isdir(os.path.join(indir, f)):
            # Recurse into subdirectories and add their counts.
            count += main(os.path.join(indir, f), outdir, debug=debug)
        elif f.endswith(".jpg"):
            # elif: a directory whose name ends in .jpg must not be copied as a file.
            count += 1
            shutil.copyfile(os.path.join(indir, f),
                            os.path.join(outdir, f))
    if debug:
        print("Returning %d for %s" % (count, indir))
    return count

if __name__ == "__main__":
    main("input", "output1", debug=1)
As you can see I have put a simple test inline; this script is not intended to be run as a main program, but the debug output was useful sometimes when tests failed for obscure reasons.
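One possible improvement to the harness is to stop depending on a fixed BASEDIR and build the input tree in a temporary directory instead. The following is only a sketch under stated assumptions: copy_jpegs is a stand-in written with os.walk rather than the model solution above, and the file and directory names (f1.jpg, notes.txt, sub/deeper) are invented for illustration.

```python
import os
import shutil
import tempfile
import unittest


def copy_jpegs(indir, outdir):
    """Stand-in for jpegcopy.main: copy every .jpg below indir into outdir."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(indir):
        for name in filenames:
            if name.endswith(".jpg"):
                shutil.copyfile(os.path.join(dirpath, name),
                                os.path.join(outdir, name))
                count += 1
    return count


class TestCopyJpegs(unittest.TestCase):

    def setUp(self):
        # Build a throwaway tree: two JPEGs at different depths, one non-JPEG.
        self.base = tempfile.mkdtemp()
        self.indir = os.path.join(self.base, "input")
        self.outdir = os.path.join(self.base, "output")
        os.makedirs(os.path.join(self.indir, "sub", "deeper"))
        os.mkdir(self.outdir)
        for relpath in ("f1.jpg", "notes.txt",
                        os.path.join("sub", "deeper", "f2.jpg")):
            with open(os.path.join(self.indir, relpath), "w") as fh:
                fh.write("dummy")

    def tearDown(self):
        # Remove the whole temporary tree, whatever the test left behind.
        shutil.rmtree(self.base)

    def test_copies_only_jpegs(self):
        self.assertEqual(copy_jpegs(self.indir, self.outdir), 2)
        self.assertEqual(sorted(os.listdir(self.outdir)), ["f1.jpg", "f2.jpg"])

    def test_originals_remain(self):
        # The implied requirement: source files stay where they were.
        copy_jpegs(self.indir, self.outdir)
        self.assertTrue(os.path.exists(os.path.join(self.indir, "f1.jpg")))
```

Saved as a module, this can be run with python -m unittest. Because each test gets a fresh tree, there is no need to check that the output directory starts out empty.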

Again, if readers can suggest improvements I'll incorporate them as I have time. You should be able to copy the code from your browser window and paste it into an editor. Thanks to the MoinMoin developers for

April 6, 2006

Do We Need Speed?

Well, finally the news is out. The reason things have been so quiet on this blog (and why I've been missing from comp.lang.python pretty much since PyCon) is that all available spare time has been going into trying to get the Need for Speed sprint off the ground. To my knowledge this event is unique, though an assertion like that invites correction from the better-informed.

Commercial organisations are free to use the output from open source projects without putting anything back (vide Industrial Light and Miserliness), and by and large this is expected -- open source licensing terms are fairly explicit, and few are even as "draconian" as the GPL (which currently says in essence that if you distribute GPL-derived products you have to make the source available). By the same token, of course, there's nothing to stop people from supporting open source projects if they want to.

Hopefully this little gap in the curtain will help to trigger a realisation among commercial software developers that they can make a huge difference to open source projects from which they might benefit (or from which they already have benefited) by the application of what are, in strictly commercial terms, relatively modest funds. The various Foundations controlling some of the better-known open source projects do make their own efforts, but without significant external funding (such as that achieved by the PyPy project) and management expertise (extremely variable) it can be difficult to make progress. Heck, even management training might be a useful contribution from the world of pointy hair (I'm sorry, I'll wash my mouth out with soap later).

Of course we'll have to wait and see. It could be that I am completely wrong, that hardly any of the invited developers will bother to come, and that the sprint will really show that open source people don't want to play in the commercial playground at all. In which case I guess I'll have egg on my face.

If I'm right, though, it will show that there are benefits to be had on both sides of the equation and that, although open source developers don't do it primarily for the money, they don't necessarily object to working with commercial developers when there is a sufficient alignment of goals. I feel incredibly fortunate to have been given the chance to put this belief to the test, and I can't wait to see how this effort goes.

March 25, 2006

Settle Up with Sony

The Electronic Frontier Foundation is trying to ensure that Sony Corporation provides reparations to anyone who inadvertently installed the rootkit software on their computer.

EFF is a body well worth knowing about and supporting. They prosecuted a class action suit against Sony BMG, and if you want to get in on the settlement now is the time.

Find out how to get your share.

March 11, 2006

Sun on the Run?

Talking, as I have been recently, about marketing, it was interesting to see a report of Sun's James Gosling apparently going negative at a recent education and research conference. In politics, and by extension in marketing, typically the frontrunner starts going negative when they perceive their lead is threatened. It's also interesting to see him breaking a cardinal rule and mentioning "the opposition" by name -- he's a techie at heart. Even though Python is mentioned only once in the article I linked to, we should be grateful for the free publicity.

Another significant feature of the report is that although Python is mentioned with a number of other languages, it's the other languages that are then dissed, leaving Python to suffer by association. Putting all this in context, though, Gosling's remarks were actually made in response to a question from an audience member.

The link I pointed you to makes it look like Gosling's whole purpose was to defend his products against competing languages, when in fact that was far from reality. So another lesson is "don't believe everything you read in the papers (or on the web, for that matter)". The reporters have their own agenda, which is to spice things up and make them look like they're worth reading. They want your eyeball-seconds!

Just the same, I think it's revealing how Gosling chose to respond to the question. He's obviously feeling some sort of pressure from Python and its kind. I've given my own opinions on his language before, so I'll content myself with reiterating that this defensive response, coupled with a complete absence of objective criticism, can only be good news for Python.

March 8, 2006

A Little Marketing Effort

To follow up my last effort, I've created an evangelistic Squidoo lens. Comments and suggestions for improvement are welcome.

March 5, 2006

Marketing? Why Do You Use Python?

Well, the cat's among the pigeons and no mistake. Guido van Rossum recently received an email from John Sirbu that was basically a plea for more Python evangelism, to counteract the perceived successes of Ruby on Rails (RoR). Guido chose to republish this on his Artima blog under the title Marketing Python - An Idea Whose Time Has Come, although the word "marketing" doesn't appear in the original email.

Now, I happen to believe in this case that Guido is right, and that Python would indeed benefit from some serious marketing activity. Unfortunately it appears that the storm of responses (the count was almost at 70 last time I looked) has been generated mostly by people who have little or no idea what marketing is, and if they did would probably hold up their hands in horror and run away screaming "shameless commerce".

Some respondents appeared to find the whole idea of marketing offensive, and in so far as that's what led to Java's current popularity I can agree, because I happen to feel that Java has cost the industry a lot of money with its unnecessary straitjacket (see Java is Object-Oriented COBOL). But most people who took up the cudgels by commenting on Guido's blog (I deliberately avoided doing so myself) simply rolled out the many reasons why Python is technically superior to Ruby/Perl/Java/my favourite language. These people are even more clueless than me about what drives the adoption of a programming language.

The key insights into promotion by marketing come from the realisation that you don't sell people things by focusing on the features of your product -- you have to explain the benefits instead. Sell the sizzle, as they say, and not the sausage. Anyone who is seriously interested in promoting Python should view Seth Godin's video presentation in the "Google Author Series". Seth is a marketing professional with a long history in the high-tech world, and his blog is worth keeping an eye on for occasional flashes relevant to the technical world. A valuable insight from the presentation is that word-of-mouth is the most effective way to promote anything: what we need is to tell everyone who asks (and even people who don't) how effectively Python can meet their needs. Clients aren't interested in whether I'm using Python, as long as I can help them solve their problems cost-effectively and in acceptable time.

This puts me in mind of a recent thread on comp.lang.python which started out as a discussion on whether it was a good idea to talk about Python as an interpreted language. Before it devolved into the inevitable discussion about what exactly constituted an interpreter and how all languages were interpreted because the CPU interprets instructions ( is sometimes an object lesson on topic diversion), there was a huge discussion about whether using the word interpreted to describe the language would engender negative perceptions.

From a marketing point of view, it really doesn't matter whether Python is an interpreted language or not. People want to know whether it's an effective way to solve their problems, so while interpreted is insignificant to meaningless from the point of view of many adopters, widely portable might not be. Of course the feature (interpreted) and the benefit (widely portable) are opposite sides of the same coin, but the difference in emphasis is crucial from a promotional point of view. Ultimately a potential adopter wants to know "What's in it for me?"

From a marketing point of view it seems to me that some of the most important aspects of the Python language are as follows:
  • Easy to learn
  • Easy to apply to a wide variety of problems
  • Installed on every significant operating system
  • Great networking support
  • Large applications base to draw from
  • Vibrant, helpful community
  • Integrates easily with other languages
  • Excellent literature from a range of publishers
Note that here I haven't mentioned any language features at all, but the points should be interesting to anyone considering adopting Python because they speak to the user's needs. There are doubtless other benefits, which I hope readers can help me add to the list. Clearly all these things have to be true to be effective in marketing the language, and also there mustn't be equally important disbenefits that haven't been mentioned - if Python enthusiasts tried to encourage adoption by lying about the language, even by omission, that would quickly become counter-productive.

One current problem seems to be that a lot of Python enthusiasts in the web world are concerned that another language (another rule of thumb: don't mention the competition by name) is getting more than its fair share of the buzz. They are trying to counter that buzz by focusing on Python's technical superiorities, without realising that the adopters of the "competition" aren't bothered about technical issues. They have found a solution for a range of problems, and they are adopting it to solve those problems.

It may be that as they try to extend those solutions they will come to realise that their adopted technology doesn't have the depth to extend in all the ways they want it. At that point the Python community needs to be ready to reiterate the benefits of Python, and to show how it can be used to extend their existing solutions rather than forcing them to rewrite from scratch. Since the web world is already well-used to mixed-language and mixed-technology solutions this should be a breeze.

Ultimately the point of this blog is to try and help Python users to become more effective advocates for their favourite language. We have a vested interest in seeing Python more widely used, so let's forget the features when we're discussing "Why Python" and focus on the benefits.

March 4, 2006

Snakes on the Web [Django How-to]

The third speaker for my session as chair was Jacob Kaplan-Moss, one of the two primary developers of Django. He took as his example a sudoku solver, explaining that his company had been paying $180 per week for a third-party service to provide sudoku puzzles, so there was a potential saving here of $8,000 per annum.

The Django application stack is Model/URL/View/Template. Jacob explained he was going to walk us through the whole stack in developing the application. Many people are surprised to see the URL in there, but hey, that's how the app interfaces with the real world.

All models inherit from django.core.meta.Model, and you can easily give your application a production-ready online admin interface by creating a META class with an admin class attribute. The model describes much more than just the database structure. A class can have a validator list, in which case all validators must return true.
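The "all validators must return true" rule can be illustrated in plain Python. This is a sketch of the idea only, not Django's validator API: the function names, the PUZZLE_VALIDATORS list and the 81-character puzzle encoding are all assumptions made for the example.

```python
def is_digit_string(value):
    """Hypothetical validator: the puzzle is stored as digits only."""
    return value.isdigit()


def has_81_cells(value):
    """Hypothetical validator: a 9x9 grid has exactly 81 cells."""
    return len(value) == 81


# An assumed validator list for a puzzle field.
PUZZLE_VALIDATORS = [is_digit_string, has_81_cells]


def is_valid(value, validators):
    """True only if every validator in the list accepts the value."""
    return all(check(value) for check in validators)
```

A value is rejected as soon as any one check fails, so is_valid("0" * 80, PUZZLE_VALIDATORS) is False even though the string is all digits.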

The application used Eppstein's well-known PADS algorithm to generate sudoku puzzles. This creates a puzzle and saves it in a database. The database layer is explicitly called; nothing ever happens without a specific API call, as opposed to other object-relational mappers that perform database I/O autonomically.

URL patterns are established to determine which URL calls which piece of view functionality. When a request is submitted get_validation_errors() will return a dict of errors. The view callable returns render_to_response(template, context).
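The idea of mapping URL patterns onto view callables can be sketched with nothing more than the re module. This is not Django's URL configuration syntax: URL_PATTERNS, dispatch and the two view functions are hypothetical names, and the views return plain strings rather than rendered templates.

```python
import re


def list_puzzles(request):
    """Hypothetical view: an index of puzzles."""
    return "all puzzles"


def show_puzzle(request, puzzle_id):
    """Hypothetical view: one puzzle, chosen by the id captured from the URL."""
    return "puzzle %s" % puzzle_id


# Each entry pairs a compiled pattern with the callable that handles it.
URL_PATTERNS = [
    (re.compile(r"^sudoku/$"), list_puzzles),
    (re.compile(r"^sudoku/(?P<puzzle_id>\d+)/$"), show_puzzle),
]


def dispatch(path, request=None):
    """Call the first view whose pattern matches, passing named groups."""
    for pattern, view in URL_PATTERNS:
        match = pattern.match(path)
        if match:
            return view(request, **match.groupdict())
    raise LookupError("no view for %r" % path)
```

Here dispatch("sudoku/42/") calls show_puzzle with puzzle_id="42", while an unmatched path raises LookupError.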

The Django templating language is relatively simple, and this was a deliberate design choice. It was primarily so that technically unsophisticated users in the newspaper production environment would find it straightforward and intuitive.

Jacob continued the development of his example by adding a further model for solving puzzles, walking through the solution provided by the PADS algorithm.

Django applications are pretty much pluggable, so if you have Django running all you need to do is download the code, add it to sys.path, add sudoku to INSTALLED_APPS, include the URLs for sudoku somewhere in the mapping (an inclusion mechanism allows the inclusion of a bunch of URLs with a single pattern match, and this feature extends to multiple levels). Finally use django-admin to install sudoku.

Tadaa! Once again an effective demonstration of a Python-based web technology that people should be adopting in droves.

Effective AJAX with TurboGears

Kevin Dangoor is the leader of the TurboGears project. This was the second in the session that I chaired. I was lucky in drawing excellent speakers for my session, and I was very glad I got the chance to hear this talk.

Do I need Ajax? Kevin felt the answer to this question is "probably yes"! The web is becoming hugely interactive, and people are getting used to sites being much less passive than they have been. Gmail is two years old, and it has forced the pace. Nowadays 95% of the world uses one of three browser platforms, and toolkits are available to smooth over the differences (the two specifically mentioned were Dojo and MochiKit).

Rather than present the dry theory behind AJAX Kevin then focused on the question of interest to many web developers: "How can I use it"? He gave several interesting examples based on TurboGears, but it would be easy to adapt them to other environments.

Data entry: this hasn't traditionally been a friendly task on the web. To choose between thousands of items a pick list is not appropriate: real-time searching is helpful here (the example selected from thousands of items by restricting those displayed in a pulldown to partial matches). People are used to auto-complete fields, so they quickly adapt to this behavior in web applications.
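The server side of such a real-time search could be as small as the function below. It is a sketch, not code from the talk or from TurboGears: partial_matches, its limit parameter and the case-insensitive substring rule are all assumptions.

```python
def partial_matches(items, typed, limit=10):
    """Return up to `limit` items containing the typed text, ignoring case."""
    needle = typed.strip().lower()
    if not needle:
        # Nothing typed yet: show no suggestions rather than everything.
        return []
    hits = [item for item in items if needle in item.lower()]
    return hits[:limit]
```

The browser would send each keystroke's text to the server and rebuild the pulldown from the returned list; capping the result keeps the response small however many items exist.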

Realtime updates: AJAX can continually poll the server for updated information. This can capture simultaneous edits to the same data (the example informed an editor in real-time that someone else had changed the data they were currently working on).
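One simple way for a polled server to detect that someone else has changed a record is a version counter. The talk's actual mechanism was not described, so this is an assumed approach; Record and changed_since are hypothetical names.

```python
class Record(object):
    """A piece of editable data with a version bumped on every save."""

    def __init__(self, text):
        self.text = text
        self.version = 1

    def save(self, new_text):
        self.text = new_text
        self.version += 1


def changed_since(record, version_seen):
    """What a polling request could answer: has anyone saved since I loaded?"""
    return record.version > version_seen
```

An editor's page would remember the version it loaded and pass it with each poll; a True answer is the cue to warn the user.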

Ordered lists: AJAX can be used to detect drops in drag-and-drop operations and reorder the data directly in the browser window (this time a very nice drag-and-drop example was presented).
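
Whatever the browser does visually, the server eventually has to apply the same move to its own copy of the list. A minimal sketch, assuming the drop handler reports the old and new positions (the move_item name and that convention are mine, not TurboGears'):

```python
def move_item(items, old_index, new_index):
    """Return a new list with one item moved; the original is left alone."""
    reordered = list(items)
    item = reordered.pop(old_index)
    reordered.insert(new_index, item)
    return reordered


tasks = ["write talk", "book flight", "pack", "fly"]
print(move_item(tasks, 0, 2))
```

Here new_index is the position the item should occupy after the move, counted in the list with the item already removed.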

There are also things we can do that confuse the users: people like the way their browsers work, and expect other applications to work the same way. Ajax reduces the value of the browser "back" button, and Dojo has provided a solution, but another solution is to not use AJAX! (I'm always encouraged when a speaker doesn't try to present their topic as a universal panacea). Users want to be able to bookmark the current state of their web application, and again Dojo can help to solve this problem by changing the location bar along the way. It's also possible to make controls behave in ways people don't expect: using radio buttons to submit information is generally a bad idea (examples of each faulty behavior were presented, as warnings of what not to do).

Summary: know your users - with a small user base you can train people if it helps improve their productivity, but this won't work with a large user base. Don't break what people expect to work, as they have invested a lot of time in learning the expected and customary interface behaviors.

With Guido's recent plea for more effective Python marketing I think this talk was a poster-child for what might be done. It focused on practical needs and gave clearly comprehensible examples, with the underlying code to diminish fears of complexity. If we can show people Python being used in this way they will flock to use it.

Everything About Web Programming (except Programming)

This session was the first of three where I did my PyCon duty and chaired a session. I was glad Ian Bicking's talk was included as I'd been unable to record his earlier session. Ian suggested that the focus of his talk was on "accepting my inner sysadmin".

Imaginary Landscape is a small company with, like many small companies, vague division of responsibilities. Ian wears two hats: the programmer (who likes to write new things) and the sysadmin (who would rather redeploy well-understood software). Conway's Law applies: organizations that design systems are constrained to produce designs which are copies of the communication structures of these organizations.

The company got where they are via Zope 2 development - which "felt a lot like PHP", with no overall process. So they moved into Webware and Subversion - but again without an overall plan. Things moved into production without a definite transition.

Deployment was stressful, so they decided to stop deploying! Applications were "multi-customer", and rather than developing good tools they would build complex applications. Ian now realises this was dumb, though it seemed smarter at the time. Unfortunately it meant that, since everyone got the same software, customization was infeasible, and configuration data had to go into the database, with no SVN control. Deployment should be easier than that, and simple deployment obviates the need for multi-customer applications: each customer can receive their own code.

Paste is a toolset that takes advantage of standard Subversion layout, providing a skeletal file and database model and a basic framework of templates and internally-used metadata. Testing also starts with a basic model created by the tool. Functional tests are really important: even just knowing that you can access the root page of an application is a worthwhile test.
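
To show how little a "can I fetch the root page?" test needs, here is a sketch that calls a WSGI application directly using only the standard library. It is not the test skeleton Paste generates: the app function is a stand-in application and the get helper is a hypothetical name of my own.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Stand-in application: a fixed root page, 404 for everything else."""
    if environ.get("PATH_INFO", "/") == "/":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Welcome"]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]


def get(application, path):
    """Call a WSGI application without a server; return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the rest of a GET request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body


def test_root_page():
    status, body = get(app, "/")
    assert status.startswith("200")
    assert body  # the page is not empty


test_root_page()
```

A test like this says nothing about whether the page is any good, but it fails loudly if the application cannot even start.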

Sometimes it can be problematic developing web applications in a test-driven environment. Developing code without opening it in a browser sounds cool, but it turns out that it's too easy to overlook gaping holes in an application (like there's no link to a page that the test framework has been accessing directly). This level of purity turns out to be too extreme in practice.

Configuration data is essential to a project, and it's important to differentiate between program configuration as used by programmers and application configuration as used by system administrators. The tool creates a template for the deployment configuration file. Client data is kept in separate repositories, not in the application, and controlled separately from the application code.

Deployment uses the buildutils, and installation is moving towards a two-stage process. The first step installs the configuration file and the second step sets the application up in line with that configuration. Setuptools has the ability to install multiple versions of the same product, but this turned out not to be useful, as the separation between the versions wasn't sufficiently clean.

Ian now feels that "the computer" is a very poor context for installation of anything at all. ("Site-packages considered harmful"). So each site gets its own Python environment in development, and they use sitecustomize code to configure according to the ACTIVE_SITE environment variable.
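
Ian didn't show his sitecustomize code in enough detail for me to reproduce it, so the following is only a guess at the general shape. ACTIVE_SITE comes from the talk; the /srv/sites root, the lib and src directory names and both function names are assumptions of mine.

```python
import os
import sys


def site_paths(environ, root="/srv/sites"):
    """Directories to put on sys.path for the site named in ACTIVE_SITE."""
    name = environ.get("ACTIVE_SITE")
    if not name:
        return []  # no active site, so leave the environment alone
    base = os.path.join(root, name)
    return [os.path.join(base, "lib"), os.path.join(base, "src")]


def activate(environ=None, path=None, root="/srv/sites"):
    """Put the active site's directories at the front of the import path."""
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    for entry in reversed(site_paths(environ, root)):
        if entry not in path:
            path.insert(0, entry)
    return path


# In a real sitecustomize.py this would simply be: activate()
print(activate({"ACTIVE_SITE": "acme"}, path=["/usr/lib/python"]))
```

Because Python imports sitecustomize automatically at startup, switching sites is then just a matter of changing one environment variable before launching the interpreter.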

A nicely-focused talk with some valuable lessons for us all.

RIP Meerkat

Well, I just discovered that the end of an era is upon us.

For a long time now the site has had news links extracted automatically from O'Reilly's Meerkat system. Because the site generation code is still under test I only periodically click a button in its interface that says "Update News", and today for the first time I got a traceback from xmlrpclib complaining about a 302 server status.

The redirect (which should really be permanent) is to a page that says
Update: Meerkat was shut down March 2, 2006
No reasons, just that bald statement. Since Meerkat was an early exemplar in the web services field I thought it would be appropriate to mark its passing.
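
A traceback is not much of a user interface, so the update button really ought to catch this case. Here is a sketch of how that might look, written against xmlrpc.client (the Python 3 name for xmlrpclib). The update_news and describe_failure functions are hypothetical, not the site's real code, and fetch stands for whatever callable makes the XML-RPC request.

```python
import xmlrpc.client


def describe_failure(error):
    """Turn a failed news update into a one-line message."""
    if (isinstance(error, xmlrpc.client.ProtocolError)
            and error.errcode in (301, 302)):
        headers = error.headers or {}
        location = headers.get("Location", "an unknown address")
        return "service moved: redirected to %s" % location
    return "update failed: %s" % error


def update_news(fetch):
    """Run fetch() and return (items, message); message is None on success."""
    try:
        return fetch(), None
    except (xmlrpc.client.Error, OSError) as error:
        return [], describe_failure(error)


def moved():
    # Simulates the redirect without touching the network.
    raise xmlrpc.client.ProtocolError(
        "example.invalid/xmlrpc", 302, "Found",
        {"Location": "http://example.invalid/gone"})


print(update_news(moved))
```

The XML-RPC transport does not follow redirects itself, which is why a moved service surfaces as a ProtocolError rather than quietly fetching the new page.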

Now I have to get serious about a replacement. I have for some time been tweaking a feed extractor using Mark Pilgrim's excellent FeedParser module, so that's now moving up the priority list.

The underlying database is still Access, and so this morning I used DBManager Professional to convert it to PostgreSQL. Still needs changes, but I'm happy to say that at least I managed to generate a local copy of the site with no errors.

February 28, 2006

Kudos to Kuchling

While I am on the subject of conference organization, I don't believe I have recorded what a terrific job Andrew Kuchling has done in putting PyCon 2006 together. I know from personal experience that the organization of this conference places huge stresses on the chairman, but Andrew has performed magnificently.

This PyCon has been better in so many respects than the three that preceded it. I also know from the chatter I've seen in the background during the preceding year that things are becoming rather better codified to assist future organizers. PyCon will continue to improve.

Well done, Andrew!


Further to my original remarks about the network at PyCon, this has probably been the biggest complaint I've heard people make, and the post-mortem continues as wireless associations continue to fade in and out during the sprints. Considering that

a) The networking company is at least part-owned by Marriott, and
b) A five-figure sum was negotiated for the wireless coverage and Internet connectivity

the results were very disappointing. For the record STSN didn't have anyone on-site, and there was no evidence that they had sent anyone over beforehand to do any testing and verify that the network coverage would be robust when descended upon by a crowd of geeks, most of whom were expecting the same kind of coverage we got at GWU.

I've no wish to prejudice any negotiations here, so I'll content myself with saying that during the sprint startups, when we shared a room with the organizers' debriefing, I was reminded how difficult it is to get everything right when you depend on third parties for so much of it. This issue isn't over yet, and meetings will follow to pick over the carcass.