May 26, 2007

Natural Language Toolkit

I have just seen an announcement for the latest version of the Natural Language Toolkit. This is an excellent example of a useful open source software project and, as the announcement says:
It comes with 50k lines of code, 300Mb of datasets, and a 360 page book which teaches both Python and Natural Language Processing. NLTK has been adopted in at least 40 university courses. NLTK is hosted on sourceforge, and is ranked in the top 200 projects.
With this kind of software available for download, there will be many more unsuspecting Python users. The really nice thing, of course, is that the users aren't interested in using a particular language, they are interested in solving problems in a specific domain. The fact that Python lets them do this is a testament to its usefulness.

May 24, 2007

Goodbye Site Meter

Just goes to show, you can't be too careful. A while ago, wanting to know a little more about the traffic this blog sees, I added a widget from Site Meter to the layout.

Today I discovered that Site Meter have done a deal with a third party company, and that the code they send out includes references to pages, allowing tracking via cookies. So I have removed the widget and am looking for a new site metering technology.

It's not a huge security problem, but it means that you can be tracked across multiple sites that each use SiteMeter's logging widgets. This is allowed by their terms and conditions:
Site Meter may from time to time also authorize and facilitate the use of cookies from trusted third party business partners to gather and aggregate additional, anonymous, and non-Personal Information data from general internet visitors for the purpose of providing our customers with additional information about their viewing audience.
It's true that as long as SiteMeter don't release users' personal information there's nothing they can do over and above correlating visits by anonymous users across multiple sites. But if one of those sites collects personal data and chooses to release it, that privacy is over. I decided I'd rather not subject my readers to that risk. If you have any such cookies in your browser I'd recommend you delete them. If you don't know how to do that you can find instructions at this About Cookies page.

May 19, 2007

Does Need a Better Navigation Bar?

Someone recently posted on that the site didn't make it obvious where to post a bug. So I snapped the navigation bar from the home page (though of course the navigation bar isn't necessarily where the feature should be added). How do readers think it could be improved?

[Note added later: clipmark feature was unsuccessful at capture, so I have replaced it with a PNG capture to better show the site]

May 17, 2007

Microsoft Strategy is Patently Ridiculous

A recent Fortune article, Microsoft claims software like Linux violates its patents, suggests that the Ballmer empire is about to start seeking royalties from users of open source software which, the company claims, violates 235 of their patents.

I don't think they have thought this through. The US Supreme Court has so far issued no ruling on whether software is even patentable, despite the Patent Office's ridiculous willingness to issue patents on techniques that fail even the simplest test of obviousness. When the most powerful software company in the world starts throwing its weight around to gain revenue from those patents it will force the issue somewhat.

The inevitable result will be a Supreme Court ruling that weakens, or even removes altogether, the protection that patents have been assumed to provide by those who have invested heavily in them. Microsoft senior VP Brad Smith claims, for example, that the Linux kernel violates 42 Microsoft patents. Of course the joke is that nobody has any idea how many patents Microsoft products violate because, unlike the open source projects Microsoft complains about, the code that comprises them isn't available for public scrutiny.

Python Slithers into Systems

Nice to see ITA, a PyCon sponsor, getting publicity for themselves and Python in eWeek: Python Slithers into Systems. ITA aren't exclusively programming in Python; I happen to know they use Lisp and at least one compiled language as well. This demonstrates yet again that Python is a pragmatist's language, and it's definitely rising in visibility.

Another Great Python Blog Entry

I've been following Doug Hellmann's Python Module of the Week series, but I already knew most of the modules he had covered. Then along came PyMOTW: logging, which describes a module I have always found difficult to understand in terms that make it comprehensible. Nice job!
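For readers who haven't met the module, here is a minimal sketch of the basic pattern in the standard library's logging package: get a named logger, set its level, and attach one handler with a formatter. It is not taken from the PyMOTW article; the logger name `blog.example`, the helper `build_logger`, and the format string are all assumptions chosen for illustration.

```python
import logging


def build_logger(name="blog.example", level=logging.INFO):
    """Return a named logger with a single console handler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    # getLogger() returns the same object for the same name, so only
    # attach a handler the first time to avoid duplicated output.
    if not logger.handlers:
        handler = logging.StreamHandler()  # writes to stderr by default
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = build_logger()
log.debug("dropped: DEBUG is below the logger's INFO level")
log.info("emitted: INFO meets the logger's level")
log.warning("emitted: WARNING is above the logger's level")
```

The part that tends to confuse people is the split of responsibilities: the logger decides whether a message is worth handling at all, the handler decides where it goes, and the formatter decides what it looks like.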

Wyatt Baldwin Blog - Google Maps

Wyatt Baldwin has recently made a couple of interesting blog entries (I'm just catching up after an extended Windows repair session [spit, spit]). In Google Maps Encoded Polylines he uses Python to draw shapes on Google Maps, and in Creating a (Google Maps) Tosca Widget he explains in considerable detail just how to do that. Both great posts demonstrating Python's power, and much kudos to Wyatt for this excellent work.
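To give a flavor of what encoding a polyline involves, here is a rough sketch of the encoded polyline format as Google documents it: each coordinate is scaled by 1e5, stored as a delta from the previous point, and packed into printable ASCII five bits at a time. This is not Wyatt's code; the function names `encode_value` and `encode_polyline` are made up for the example.

```python
def encode_value(value):
    """Encode one signed integer (already scaled by 1e5) as polyline text."""
    # Shift left one bit and invert negatives, so the sign ends up in the low bit.
    value = ~(value << 1) if value < 0 else (value << 1)
    chunks = []
    # Emit 5-bit groups, least significant first; 0x20 marks "more to follow".
    while value >= 0x20:
        chunks.append(chr((0x20 | (value & 0x1F)) + 63))
        value >>= 5
    chunks.append(chr(value + 63))
    return "".join(chunks)


def encode_polyline(points):
    """Encode a sequence of (latitude, longitude) pairs as one string."""
    pieces = []
    prev_lat = prev_lng = 0
    for lat, lng in points:
        lat_e5 = int(round(lat * 1e5))
        lng_e5 = int(round(lng * 1e5))
        # Every point after the first is stored as an offset from the previous one.
        pieces.append(encode_value(lat_e5 - prev_lat))
        pieces.append(encode_value(lng_e5 - prev_lng))
        prev_lat, prev_lng = lat_e5, lng_e5
    return "".join(pieces)


print(encode_polyline([(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)]))
```

Storing deltas rather than absolute coordinates is what keeps the strings short: neighboring points on a route differ by small amounts, and small numbers need fewer five-bit groups.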

May 16, 2007

Debian -- python-samba

I wonder what's happening with python-samba. I came across the Debian package while looking at a Samba bug notice, but so far I haven't heard of anyone using it. The package description does say:
At the moment their status is "experimental" but they have been reported to work well.

May 15, 2007

Rethinking the Linux Distribution

Just one or two quotes from this interesting piece.

Many well-known Linux distributions already use Python in their key tools. Red Hat's Anaconda installer, and Gentoo's Portage package manager are two examples. Ubuntu (the top distribution for the last 12 months, according to DistroWatch) "... prefers the community to contribute work in Python."


"Among the high level languages, Python seemed to be the best choice, since we already use it in many places like package build scripts, package manager, control panel modules, and installer program YALI. Python has small and has clean source codes. Standard library is full of useful modules. Learning curve is easy, most of the developers in our team picked up the language in a few days without prior experience."

So why aren't you using Python? (Don't tell me, you are ...)

May 3, 2007

Writerly Advice

Kurt Vonnegut died recently, but one of his legacies is this advice for writers. We could all read it and benefit from it, but I hope it will be of particular use to anyone responding to my PSF blog entry encouraging people to write Python articles for Sys Admin magazine.

May 2, 2007

Open Source "Increasingly Used for Critical Applications"

The market research company Forrester has just issued a report saying (among many other things) that more than 75% of respondents "agreed that open source software was making an important or very important contribution to improving efficiency and consolidating IT infrastructure". Other highlights include the fact that concerns about intellectual property are fading (presumably as SCO's cases are seen more and more clearly to be the FUD of a desperate last-ditch bid for survival).

The closing advice for would-be adopters comes in two major chunks:

1. Lower Internal Barriers to Open Source Adoption - clearly it's time for policies to be refined to give open source a level playing field against proprietary products; some organizations still have blanket bans on the evaluation, let alone use, of open source.

2. Identify Services Before You Commit - many corporate users appear to be concerned that appropriate support services don't exist for open source products. While this isn't universally true, the open source communities equally need to acknowledge that corporate users might not feel completely comfortable relying solely on volunteer newsgroups for support.

It would be good to start building international federations that can collectively offer 24/7 support for open source projects, with specified service levels. That's going to be a challenge for the open sourcerers, but if they can solve it then they might even manage to get on the gravy train before it leaves the station. While this may not be everyone's dream, it would be nice to channel funds in directions that allow (or even encourage) the development of more and better open source software.

DRM Saga Continues: No Surprises

Well, the content protection debacle grinds on. The users continue to be unimpressed with digital rights management, and the hackers and crackers continue to break the harebrained industry schemes with monotonous regularity. Digg temporarily tried to censor material as a result of a blanket cease-and-desist threat, got it in the neck from their users, and apparently learned their lesson.

Favorite quotes from the latest discussion:
As Joe Rogan's character on Newsradio once quite accurately quipped, "Dude, you can't take something off the Internet.. that's like trying to take pee out of a swimming pool." The content providers have attempted to do exactly that, remove pee from the proverbial swimming pool that is the Internet and, as we've witnessed so many times before, they've failed miserably. [1]
There isn't a single known DRM system worth cracking that hasn't been cracked, multiple times; AACS will likely be no different. [2]
When will the recording and film industries learn that the "lost income" from content pillaging by dishonest consumers isn't "lost" at all? It's what the retail trade calls "wastage" (some things arrive spoiled, some get spoiled, some are stolen). Their revenues are what their revenues are, and the overall level of dishonesty is what it is. The prospect of additional billions is illusory, because people who don't pay for content will simply stop consuming it if they can't get it for nothing.

So the joke is that if the industry ever achieved their total-protection nirvana they would have shot themselves in the foot, because even bootleg copies are positive marketing. Sometimes greed is so sad. Were I a stockholder in RIAA member companies I would be furious at the waste of effort. I wonder if the senior staff of studios actually pay for the DVDs they take home. If not, aren't they too stealing from their stockholders?