August 29, 2010

Preparedness, Privilege and Discrimination

Coming very late to the party I noticed a blog entry from June about the JavaScript community's response to Google's financial support to allow more women to attend JSConf Europe. It sounds like it was indeed the usual real can of worms, and many of the comments show the usual lack of appreciation of the problems from the privileged side of the issue (in this case men, since it is a gender issue and women appear to be about as well represented in the JavaScript community as they are in the Python world, which is to say hardly at all).

The reason I found it interesting was that this year, for the first time, PyCon also used Google's support to fund the attendance of more women, who came from as far afield as Romania and India. I was gratified to be told by several of those who would otherwise not have been at the conference how happy they were to have had the chance to attend. (In truth I had little to do with it: Google should get the credit for the funding, and the actual hard work was done by Peter Kropf and Gloria Willedsen.) In the Python community's case there was one short period of adverse comment on the #python IRC channel, which resulted in my exchanging emails with one person to explain why I thought it was a good idea to encourage more women to be at PyCon. End of story, except that in 2010 women represented 11% of PyCon attendance, up from 2% the previous year. I count that as one of the better results of my time as chairman.

Now I am not saying this to be smug, but because I believe there was a reason for the difference in the reactions. Last year the PSF, at Guido van Rossum's urging, started a diversity mailing list which discussed the questions of race, gender and other discrimination extensively and sometimes acrimoniously. Eventually this led to a proposal for a "diversity statement", which was referred to the membership where it triggered another round of extensive and sometimes acrimonious discussions, leading to a referral back to the diversity list and a further proposal which was finally accepted by the membership more or less unchanged and adopted by the Board:
The Python Software Foundation and the global Python community welcome and encourage participation by everyone. Our community is based on mutual respect, tolerance, and encouragement, and we are working to help each other live up to these principles. We want our community to be more diverse: whoever you are, and whatever your background, we welcome you.

This may not be the best statement ever, but if anyone bothers to look it does make it clear that these issues have been addressed. Thus anyone who feels discriminated against can decide that at least they would have a chance of a fair hearing should they choose to complain (which, sadly, I imagine most don't, instead choosing to vote with their feet). Similarly, anyone about to indulge in discriminatory behavior might think twice before doing so.

The hidden benefit of this long-drawn-out process was the creation, on the diversity list, of a corpus of varied individuals who had discussed these issues and hammered out a shared approach to the problems that included a refusal to punish individuals for things done out of ignorance. It also meant that when one speaker used a slightly ill-advised graphic in a presentation the issue was dealt with then and there in a very direct manner without any recriminations, and I didn't even get to hear about it until much later that day. The speaker was advised that the material was inappropriate and that therefore the slides and the video of the talk wouldn't be published, and hopefully left without feeling that they weren't welcome at next year's conference.

I hope that the JavaScript community manages to develop its own understanding of diversity issues and its own process for dealing with them. I know it took up a lot of my time as PSF chairman and gave me some uncomfortable moments (and does not exempt me from the results of my own stupidity in the future), but I am glad it led to a tolerant community process that nevertheless has made it clear that discrimination is not acceptable.

August 28, 2010

The Hiring Smiley Curve

I recently commented on how hi-tech companies seem to advertise for "rock star" developers. This appears to me to be a little shortsighted. Maybe Google have enough money to do it, but the average company looking for a singer wouldn't be able to go out and hire Mick Jagger or Van Morrison (yes, I know, I am showing my age - it will become obvious this isn't an accident). So why do they think they can afford the software equivalent?

As I mature (that sounds so much less pejorative than "grow older", doesn't it?) I have realized that my decision to eschew the corporate world and plough my own furrow wasn't entirely disastrous. I have been self-employed, or an employee of a company which I owned, for over twenty years now. The last large company I worked for was Sun Microsystems, and I left in 1988 after realizing that I was a misfit for the corporate environment. Since then I can honestly say I have never worked for a more charming or delightful person. Nowadays my boss never hesitates to take my interests into account.

Just occasionally, I have toyed with the idea of third-party employment, most recently with Google (twice). The first time I explained before going for interview that I was not prepared to relocate to the West coast. At the time I was moving back from the UK to the USA, and had just bought a house in Virginia, so I was a little surprised to learn three weeks after my interview that "we aren't prepared to make a remote hire, but would consider you for Mountain View". Large company mentality: ignore what the potential recruit wants, and try to hire them anyway. No sale. About a month ago they called me to ask if I would consider employment in the DC area, but it turned out they were only hiring for Java projects. I used to write Java but I'm all right now, so the recruiter and I agreed to part company amicably after ten minutes on the 'phone.

None of this amounts to a hill of beans but I was encouraged about a year ago, when I was talking to someone in New York City about experience levels and consulting opportunities, to learn about the "smiley curve" theory of work rewards. In brief, this theory says that you hire young raw recruits because they are full of energy and don't cost much. As people become more experienced they expect more money but their productivity doesn't go up commensurately, so they are less desirable but you need them because there aren't enough young turks to do all the work. Then, as you mature, your desirability goes up again because (direct quote, as far as I can remember) "you have seen everything and you know everything".


If true, this is quite encouraging. There must be lots of companies looking for someone as experienced as me. The only question now is whether they can afford me!

August 27, 2010

Apple Going Over the Top?

Yet again I rejoice that I am not an iPhone user. Indeed, given this latest news I might well just chuck my Mac mini away in protest (no, you probably don't want it - it's an aging obsolete PPC mini from about six years ago).

In news from the Electronic Frontier Foundation I learned today that Apple has applied for a patent on methods of spying intrusively on the users of their devices. Here's a partial list of possible applications:
  • The system can take a picture of the user's face, "without a flash, any noise, or any indication that a picture is being taken to prevent the current user from knowing he is being photographed";
  • The system can record the user's voice, whether or not a phone call is even being made;
  • The system can determine the user's unique individual heartbeat "signature";
  • To determine if the device has been hacked, the device can watch for "a sudden increase in memory usage of the electronic device";
  • The user's "Internet activity can be monitored or any communication packets that are served to the electronic device can be recorded"; and
  • The device can take a photograph of the surrounding location to determine where it is being used.
So enjoy your iPhones. I won't be doing business with a company that thinks this way. Sigh. I suppose that means the new laptop will have to be PC based.

August 22, 2010

Windows Vista Mystery Shares

For reasons best known to Microsoft, when I try to delete a folder which has been shared (through the Explorer interface) it takes forever to complete. This would not be so bad if there were just one or two shares, but sadly (for reasons best known to Microsoft) a large number of folders randomly appear to have become shares (see the screen dump of a portion of my home directory at the right). I have no idea how these folders became shared. It certainly wasn't any intentional act of mine, and heaven alone knows what this does to performance.

Now, you are probably wondering why I don't just switch off sharing before I delete the directory. The answer to that is that although Windows is displaying the folders as shared, it doesn't really seem to believe that they are shared. So there doesn't appear to be an easy way to switch this sharing off.

If I had some idea how it had been switched on in the first place that might help, but as with so many other aspects of Windows performance this remains a mystery. If someone could offer some insight I'd be happy to find out what's going on here.

August 16, 2010

Schmidt Foot-In-Mouth Attack Continues

Wow, two consecutive Eric Schmidt posts. But this really is quite newsworthy. An interview with the Wall Street Journal published on Saturday claims:
He predicts, apparently seriously, that every young person one day will be entitled automatically to change his or her name on reaching adulthood in order to disown youthful hijinks stored on their friends' social media sites. 

This really is the most patent hogwash. Surely someone like Schmidt, with a brain the size of a planet, could foresee that if such name changing were to become commonplace it would inevitably lead to the creation of services that mapped between past and present-day identities? Given the ability to identify images recently demonstrated by sites like tineye.com it's only a matter of time before changing your name will no longer be a way to erase the records of your misdeeds.

This makes it all the more important that privacy and basic information security become high school subjects, but alas the corporate overlords that have the ear of government in most developed (and many less-developed) countries will be attempting to make sure that isn't a priority, because it isn't in their interests.

August 12, 2010

Eric Schmidt Looks Forward to Big Brother

"You only need to give up
just a little bit of freedom"
Well, Google's senior management are really keeping us guessing this month. I am not really sure any longer whether the company actually has any coherent point of view on individual privacy. Nowadays it seems you can't even expect consistency. According to a Read Write Web report, Eric Schmidt, CEO of Google, startlingly repeated his prior assertions that not only is Internet anonymity dead but that he wants to dance on its grave, including the following comments in his remarks to the Techonomy conference:
"The only way to manage this is true transparency and no anonymity. In a world of asynchronous threats, it is too dangerous for there not to be some way to identify you. We need a [verified] name service for people. Governments will demand it."

Any government that requires individuals to give up their rights to anonymity is a government past its sell-by date. To my mind this reveals further insight into Schmidt's "what's good for business is good for the people" view exemplified by the much-discussed recent joint statement with Verizon on network neutrality. It's fairly obvious that Schmidt sees government's role as paving the way for corporations to increase their profits, not preserving the freedoms of the citizenry who elected it. This is so far from "government of the people, by the people, for the people" that it's apparently time "do no evil" was replaced by "make more money".

"But that would hurt Google's
stock price!"
Yet it was only in May (yes, three months ago) that Schmidt was publicly suggesting that as far as Google was concerned "privacy is paramount". Does he really know what the priorities are any longer? It seems like we can dismiss any further utterances as the self-serving flip-floppery of the 129th richest man in the world. What does he really think? Apparently it depends on which way the financial wind is blowing. The really depressing thing is it's transparently obvious that here we have a man who will do well in politics. "The Best Democracy Money Can Buy" indeed.

August 6, 2010

Tests That Test Your Tests

I just had an interesting experience with Steve Miller, the technical editor of the Python classes I am writing for O'Reilly School of Technology. We are just getting to the end of the second of four courses, and I like to think that we are moving right along: in the final lesson students write a simple GUI-based program that searches for and displays e-mail messages stored in a MySQL database.

I introduced test-driven development in the second course. Not only does this encourage good habits in the students, it also makes it somewhat easier to test some of their exercises (though I still do not have a good approach to testing Tkinter-based GUI applications). In time-honored fashion we start with tests and a program full of stubs, and then expand the stubs to pass the tests. For the email database the initial API is very simple: there is one function to store messages and two others to retrieve them, by primary key and Message-Id.
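
To make the three-function API concrete, here is a minimal sketch of what a finished version might look like. It is not the course code: it assumes the standard library's sqlite3 with an in-memory database standing in for MySQL, and the table layout and the private `_fetch()` helper are hypothetical. Only the names `store()`, `msg_by_id()` and `msg_by_message_id()`, and the `(primary key, message)` return shape, come from the tests shown later.

```python
import sqlite3
from email import message_from_string

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")
curs = conn.cursor()
curs.execute("""
    CREATE TABLE message (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        message_id TEXT,
        body TEXT
    )""")


def store(msg):
    """Store an email message and return its primary key."""
    curs.execute("INSERT INTO message (message_id, body) VALUES (?, ?)",
                 (msg["message-id"], msg.as_string()))
    conn.commit()
    return curs.lastrowid


def _fetch(column, value):
    """Hypothetical helper: return (primary key, message) for one row."""
    # column is only ever one of two fixed names, never user input.
    curs.execute("SELECT id, body FROM message WHERE %s = ?" % column, (value,))
    row = curs.fetchone()
    if row is None:
        raise KeyError(value)
    pk, text = row
    return pk, message_from_string(text)


def msg_by_id(pk):
    """Retrieve a message by primary key."""
    return _fetch("id", pk)


def msg_by_message_id(message_id):
    """Retrieve a message by its Message-Id header."""
    return _fetch("message_id", message_id)


# Usage: store one message, then retrieve it both ways.
msg = message_from_string("Message-Id: <1@example.com>\nSubject: hello\n\nBody text\n")
pk = store(msg)
```

In a stubs-first workflow only part of this would exist at the start, with the rest filled in as the tests demand it.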

In the final chapter I start out with a very simple database table that stores the body of the message as a LONGTEXT column. The only other columns in the table (to start with—it gets more complex later) are the automatically-generated primary key and the Message-Id Header. The tests use a setUp() method that completely re-creates the database table and populates it from messages stored in a bunch of files:

 import unittest
 from glob import glob
 from email import message_from_string

 import maildb

 # conn, curs and TBLDEF are assumed to be defined elsewhere in the test module.
 FILESPEC = "V:/Python2/Lesson12/MailData/*.eml"

 class testRealEmail_traffic(unittest.TestCase):
   def setUp(self):
     """
     Reads an arbitrary number of mail messages and
     stores them in a brand new messages table.
     DANGER: Any existing message table WILL be lost.
     """
     curs.execute("DROP TABLE IF EXISTS message")
     conn.commit()
     curs.execute(TBLDEF)
     conn.commit()
     files = glob(FILESPEC)
     self.msgids = {} # Keyed by message_id
     self.message_ids = {} # keyed by id
     for f in files:
       with open(f) as ff: # make sure each file is closed after reading
         text = ff.read()
       msg = message_from_string(text)
       id = self.msgids[msg['message-id']] = maildb.store(msg)
       self.message_ids[id] = msg['message-id']

There were two relatively simple tests, one to test each of the retrieval functions in a fairly simplistic way by verifying the correspondence between the primary key values and the Message-Id headers using the msgids and message_ids dicts created during the set-up. The initial stub under test only implemented the store() function, so these tests were initially expected to fail:

   def test_message_ids(self):
     """
     Verify that items retrieved by id have the correct Message-ID.
     """
     for message_id in self.msgids.keys():
       pk, msg = maildb.msg_by_id(self.msgids[message_id])
       self.assertEqual(msg['message-id'], message_id)

   def test_ids(self):
     """
     Verify that items retrieved by message_id have the correct Message-ID.
     """
     for id in self.message_ids.keys():
       pk, msg = maildb.msg_by_message_id(self.message_ids[id])
       self.assertEqual(msg['message-id'], self.message_ids[id])

Steve and I had both run this code under much the same conditions as the students would, and verified that the tests did indeed fail due to the AttributeError exceptions raised by the missing msg_by_id() and msg_by_message_id() functions. Later steps have the student implement these functions, which makes the tests pass.

We were somewhat surprised to find when Steve ran his final checks that the tests were now passing, even though he had reverted to the original module with no implementations of the message retrieval functions! It took me the best part of an hour chatting with Steve to eliminate everything I could think of that might be wrong: no .pyc files left lying around, no odd path settings that allowed import from other copies of the code, and so on.
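
One quick way to rule out stray copies of a module is to ask Python where it actually imported it from. This is a small sketch for a current Python 3, where bytecode lives in a `__pycache__` directory rather than alongside the source as it did in 2010. The `describe_import()` helper is hypothetical, and `json.decoder` is used only because it is a standard library module that is always importable.

```python
import importlib


def describe_import(module_name):
    """Return (source path, cached bytecode path) for an importable module."""
    spec = importlib.import_module(module_name).__spec__
    return spec.origin, spec.cached


source, cached = describe_import("json.decoder")
print("imported from:", source)
print("bytecode cache:", cached)
```

If the first path is not the file you think you are editing, the import path is the place to look next.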

We finally tracked the issue down to something stupidly simple, as is often the case with bugs that have you scratching your head for an extended period: for production purposes the data files had been moved to an area where all students could share them (each student has their own V: directory). The glob pattern in FILESPEC still pointed at the old location, so it no longer matched any files, and the setUp() method was not inserting any rows into the newly-created message table. The empty table in turn meant that neither test_message_ids() nor test_ids() was running the body of the for loop, and consequently no AttributeError exceptions were being raised. The tests were passing even though the functions they were supposed to test had not been implemented!
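
The effect is easy to reproduce in isolation. In this minimal sketch (the directory name is a made-up path assumed not to exist, and the test class is hypothetical) the glob matches nothing, the dict stays empty, and a test whose only assertion is deliberately wrong still passes because the loop body never runs.

```python
import unittest
from glob import glob


class VacuousTest(unittest.TestCase):
    def setUp(self):
        # Assumption: this directory does not exist, so glob() returns [].
        self.msgids = {}
        for name in glob("/no/such/directory/*.eml"):
            self.msgids[name] = len(self.msgids)

    def test_lookup(self):
        for message_id in self.msgids:
            # Deliberately wrong assertion; never reached with an empty dict.
            self.assertEqual(message_id, None)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(VacuousTest)
result = unittest.TestResult()
suite.run(result)
print(result.testsRun, result.wasSuccessful())
```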

My solution to this was to add a further test to verify that the table was not empty. That way, even if the other tests passed, this one would fail:

   def test_not_empty(self):  
     """  
     Verify that the setUp method actually created some messages.  
     If it finds no files there will be no messages in the table,  
     the loop bodies in the other tests will never run, and potential  
     errors will never be discovered.  
     """  
     curs.execute("SELECT COUNT(*) FROM message")  
     messagect = curs.fetchone()[0]  
     self.assertGreater(messagect, 0, "Database message table is empty")  

The check could have been added to one of the other tests but it seemed to make more sense to keep it separate, since several tests relied on the table having been populated.

In this case the error condition was that the tests were passing! I am happy about this because it clearly demonstrates the value of test-driven development, even though the result I was getting was normally the desired goal of testing. It has also taught me to be more careful about tests in loops: if there is no guarantee that the loop body will execute then the tests inside it can be completely useless.

August 5, 2010

DjangoCon US is International

As registrations for DjangoCon US grow it's been interesting to see where people are coming from. I originally thought that it would be US-only. While US delegates dominate the lists as you might expect, we have people coming from all over the world. Here's a graphic showing the distribution of delegates across the globe.
I think it's a measure of Django's excellence that DjangoCon attracts people from so far away. I well remember when Jacob Kaplan-Moss and Adrian Holovaty first came to PyCon in Washington DC to describe the system they were putting together. At that stage Django wasn't open source, and the encouragement of the enthusiastic PyCon audience was a major factor in its becoming so. The software has come a huge distance in a relatively short time, and is now a major Python success story.

There are still places left at the conference if you would like to register. Portland is an intriguing city that offers a warm welcome to visitors, and the Doubletree is a green hotel with excellent accommodation. The conference room rate is only guaranteed until August 13, so make sure to book your accommodation soon.

Wave Goodbye

So Google's blog announced today that development of Google Wave "as a standalone product" will end because "Wave has not seen the adoption we would like". It's kind of a shame, because Wave was intriguing, but as an infrequent Wave user I found several issues that made it less than user-friendly. So here are a few points that developers might like to take home from the train-wreck that is Wave (100 developers for two years is a substantial investment, even for Google). The servers will continue to be available "at least until the end of the year".
  1. Don't try to replace standard GUI components with inferior and non-intuitive substitutes. The Wave scrollbar was a user interface disaster, and a source of frustration to many of the users I interacted with.
  2. Don't promote technologies that depend heavily on high-bandwidth connectivity, or at least not for Internet use. Many times I was left frustrated, not knowing whether the Wave had crashed or whether it was simply waiting for a server response.
  3. Realize that even the best technologies need marketing and publicity. 80%+ of desktop computer users don't use Windows because it's the best system; they use it because it's the best alternative they know about. If people don't know about your technology they won't use it, and techies alone probably aren't the right user base to make a product viral.
In some ways I am sorry that the Wave didn't succeed as so many techies apparently thought it would. In other ways I am disappointed that the technology didn't really deliver on its promise. It's one of those things that needs to be ubiquitous to succeed.

Google's misstep with Buzz earlier this year probably didn't help either - it led to distrust about Google's intentions with regard to (or, worse, competence at securing) users' personal data.

So whatever the next big thing on the Web is going to be, it isn't going to be Google Wave. RIP.