April 15, 2006

Test-Driven Development

Some code, for a change. I recently taught an introductory Python class to some fairly experienced programmers, and we had an hour or so left at the end of the class to try a problem. We'd been discussing test-driven development, so we arrived at the idea of creating a problem that was fairly simple in scope and then writing tests and a solution to the problem.

The idea was to write the tests first, though it will be no surprise to those who've done this before that the nature of the tests changed as errors came to light during development. The problem was as follows:
Given a directory structure of arbitrary shape, locate all JPEG images and copy them into a named destination directory. [Not specified but implied: the files should continue to exist in their original positions]
It turned out that the level was fairly well chosen. None of the students managed to complete the task, but they all had a fairly clear sense of where they were going by the end of the exercise, giving them something to work on independently after I'd gone. Of course I had to provide them with a "model solution", which I'm happy to say I just managed to create in the time allotted.

Here is the test harness for the jpegcopy requirement:
import jpegcopy
import unittest
import os

# BASEDIR = '/c/Steve/Projects/BrightonHove'  (alternative path form, unused)
BASEDIR = 'c:/Steve/Projects/BrightonHove'
INDIR = os.path.join(BASEDIR, "input")
OUTDIR1 = os.path.join(BASEDIR, "output1")
OUTDIR2 = os.path.join(BASEDIR, "output2")
EXPECTED = ['%s.jpg' % s for s in "f1 f2 f3 f4 f5 f6".split()]

class TestJpegCopy(unittest.TestCase):

    def setUp(self):
        """Ensure both output directories are empty."""
        for d in OUTDIR1, OUTDIR2:
            for f in os.listdir(d):
                os.unlink(os.path.join(d, f))
            # Fail loudly if anything is still left behind
            if os.listdir(d):
                raise ValueError("Cannot empty directory %s" % d)

    def testEmpty(self):
        # Copying from an empty directory should copy nothing
        n0 = jpegcopy.main(OUTDIR1, OUTDIR1)
        self.assertEqual(n0, 0)

    def testDir1(self):
        n1 = jpegcopy.main(INDIR, OUTDIR1)
        self.assertEqual(n1, 6)
        # os.listdir() order is not guaranteed, so sort before comparing
        self.assertEqual(sorted(os.listdir(OUTDIR1)), EXPECTED)

    def testDir2(self):
        n2 = jpegcopy.main(INDIR, OUTDIR2)
        self.assertEqual(n2, 6)
        self.assertEqual(sorted(os.listdir(OUTDIR2)), EXPECTED)

    def tearDown(self):
        """Leave both output directories empty again."""
        for d in OUTDIR1, OUTDIR2:
            for f in os.listdir(d):
                os.unlink(os.path.join(d, f))

if __name__ == "__main__":
    unittest.main()
Nothing too fancy here. The tests are parameterised by the directory constants at the top of the file. We set up each test by clearing both output directories. testEmpty then confirms that copying from an empty directory copies nothing. testDir1 and testDir2 make sure that we can put the JPEGs into two different directories, verifying each time that six files are copied. Each also checks the directory listing against the same EXPECTED list, so both output directories must end up with the same contents. We tear down each test by deleting the contents of both directories.

This will probably show my ignorance, highlighting the fact that test-driven methods don't yet come naturally to me. I'll be happy to integrate suggestions for improving test coverage. My solution follows.
"""Copy jpegs from a recursive to a flat directory structure."""

import os
import shutil

def main(indir, outdir, debug=0):
    """Copy every .jpg file under indir into outdir; return the number copied."""
    count = 0
    for f in os.listdir(indir):
        path = os.path.join(indir, f)
        if os.path.isdir(path):
            # Recurse into subdirectories and add their counts to ours
            count += main(path, outdir, debug=debug)
        elif f.endswith(".jpg"):
            # elif: a directory whose name ends in .jpg is not itself copied
            count += 1
            shutil.copyfile(path, os.path.join(outdir, f))
    if debug:
        print("Returning %d for %s" % (count, indir))
    return count

if __name__ == "__main__":
    main("input", "output1", debug=1)
As you can see I have put a simple test inline; this script is not intended to be run as a main program, but the debug output was useful sometimes when tests failed for obscure reasons.

Again, if readers can suggest improvements I'll incorporate them as I have time. You should be able to copy the code from your browser window and paste it into an editor. Thanks to the MoinMoin developers for colorize.py.
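In that spirit, here is one possible variation, offered as a sketch rather than as the model solution from the class. It lets os.walk do the recursion and matches extensions case-insensitively. The function name copy_jpegs and its extensions parameter are made up for illustration, and accepting ".jpeg" as well as ".jpg" is an assumption that goes beyond what the exercise specified.

```python
import os
import shutil


def copy_jpegs(indir, outdir, extensions=(".jpg", ".jpeg")):
    """Copy every JPEG found under indir into the flat directory outdir.

    Hypothetical alternative to jpegcopy.main. Returns the number of
    files copied; the originals are left in place.
    """
    outdir_abs = os.path.abspath(outdir)
    count = 0
    for dirpath, dirnames, filenames in os.walk(indir):
        if os.path.abspath(dirpath) == outdir_abs:
            # Do not descend into the destination if it sits inside indir
            dirnames[:] = []
            continue
        for name in filenames:
            # str.endswith accepts a tuple of suffixes
            if name.lower().endswith(extensions):
                shutil.copyfile(os.path.join(dirpath, name),
                                os.path.join(outdir, name))
                count += 1
    return count
```

Like the original, this silently overwrites files that share a name in different subdirectories, so the returned count can exceed the number of files that end up in the destination.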

1 comment:

Anonymous said...

The only suggestion I'd make is to make sure that only the files you expect are actually in your INDIR. At the moment if the contents of this directory change then your tests will fail.

You can make your test set up even more explicit by creating the expected files using something like PIL or by creating a single file, writing some random data to it and then copying it the appropriate number of times.
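
The second suggestion might look something like the sketch below. It writes one file of random bytes and copies it into a small nested tree, so the tests no longer depend on whatever happens to be in INDIR. The helper name build_input_tree, the directory layout and the decoy notes.txt file are all assumptions made for illustration. The files are not valid JPEGs, which is fine here because the copier only looks at file names.

```python
import os
import shutil
import tempfile


def build_input_tree(root, names=("f1", "f2", "f3", "f4", "f5", "f6")):
    """Create a nested tree of dummy .jpg files under the existing dir root.

    Hypothetical test helper. Returns the sorted list of file names
    created, suitable for comparing against the output directory.
    """
    # One "master" file of random data, copied as many times as needed
    master = os.path.join(root, "master.bin")
    with open(master, "wb") as fh:
        fh.write(os.urandom(1024))

    # A decoy that a correct copier should ignore
    with open(os.path.join(root, "notes.txt"), "w") as fh:
        fh.write("not an image\n")

    created = []
    current = root
    for i, name in enumerate(names):
        if i % 2 == 0:
            # Descend one directory level for every second file
            current = os.path.join(current, "level%d" % (i // 2))
            os.mkdir(current)
        filename = name + ".jpg"
        shutil.copyfile(master, os.path.join(current, filename))
        created.append(filename)

    os.remove(master)
    return sorted(created)
```

In the harness, setUp could create the input directory with tempfile.mkdtemp(), call build_input_tree on it and keep the returned list as the expected result, and tearDown could remove the whole tree with shutil.rmtree.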