I copied the data into a spreadsheet for ease of manipulation, and did some munging to standardize presentation. I may have taken a few liberties here (such as replacing "California, Bay Area" with "San Francisco"), but I did so in an honest attempt to make sure that every important datum was counted. If the conference were to go to "San Francisco" the organizers would be very foolish to overlook other potential Bay Area locations, for example.
The Data
You can find the raw data I ended up with at http://holdenweb.com/files/wherenext.txt if you want to play about with them yourself. Here's a small section to give you the flavor:
New Jersey Palm Springs "Portland, OR"
New York "Portland, OR" "Portland, OR"
New York "Portland, OR" "Portland, OR"
New York "Portland, OR" San Diego
Northeast "Portland, OR" San Diego
Nowhere specific "Portland, OR" San Francisco
Phoenix Saint Paul San Francisco
"Portland, OR" San Francisco Seattle
The Program
#!/usr/bin/python
#
# Process PyCon feedback about future venues
#
# NOTE: Food for thought only ...
#
pd = {}
f = open("wherenext.txt")
for line in f:
places = [p.strip('"') for p in line.strip().split("\t")]
for rank, where in enumerate(places):
if not where in pd:
pd[where] = [0, 0, 0]
if where:
pd[where][rank] += 1
places = []
for where, [w1, w2, w3] in pd.items():
places.append((3*w1+2*w2+w3, where, [w1, w2, w3]))
for rank, where, scores in sorted(places):
print where, rank, scores
The placename of each pair is used to index a dict of [first, second, third] counts which counts the number of times a specific place was ranked in each position.
The next for loop is my favorite piece of this little program. It generates a new list of (weighted_score, placename, raw_scores) tuples ready to be sorted. The weightings I used (3 for a first choice, 2 for a second and 1 for a third) were completely arbitrary, so feel freee to change them in an attemto to fudge the results for your favored location.
Why did I like this particular loop so much? Since each item in the dict is a (placename, raw_scores) tuple I unpack the elements right in the for statement. It's such little elegances that endear Python to its fans: you can always do things in a straightforward way if you want, but as long as it doesn't interfere with readability I usually take advantage of such abbreviations.
I felt that a list comprehension would have been just a little bit too difficult to read, but you could easily replace the loop with one, and for further illegibility you could put it directly as the argument to sorted().
Finally I print out the results, with the lowest scoring first. Yes, I could have used a reverse() call to print out the favorite first, but then I'd have had to scroll the window back to know which city had "won". I could have formatted them better, too. Feel free, knock yourself out. I try to avoid over-specifying presentation when it's really the data I am interested in.
The Results
Given that the 2009 conference was held in Chicago I was unsurprised to see a strong vote for it, presumably by locals who might not be able to get to Atlanta, and certainly wouldn't find it as convenient. The only surprise was that it came in second rather than first! Portland, OR made a very strong showing, scoring only two points less, closely followed by Seattle with New York and Washington DC trailing rather further behind, But the clear winner was San Francisco, by coincidence the other location to make a strong bid for 2010 with Atlanta. Here are the results in their entirety.
Please remember that the data is from a self-selected group, and that this program is not binding on the PyCon organizers! Despite my best efforts I must have left a blank line in the data, so the empty string is recorded as having no scores! The if where condition should really have guarded the whole loop body, but it was an afterthought, and serves to remind us that sloppy design will lead to sub-optimal results.
0 [0, 0, 0]
Bay Area, CA 1 [0, 0, 1]
Cleveland 1 [0, 0, 1]
Houston, Texas 1 [0, 0, 1]
Knoxville, TN 1 [0, 0, 1]
Midwest 1 [0, 0, 1]
Oregon 1 [0, 0, 1]
Orlando, FL 1 [0, 0, 1]
Phoenix, Az 1 [0, 0, 1]
Pittsburgh, PA 1 [0, 0, 1]
Portland, ME 1 [0, 0, 1]
Twin Cities, MN 1 [0, 0, 1]
Virgin Islands 1 [0, 0, 1]
West coast 1 [0, 0, 1]
not next to an airport 1 [0, 0, 1]
vancouver 1 [0, 0, 1]
Boston Area 2 [0, 1, 0]
Cleveland, OH 2 [0, 1, 0]
East coast 2 [0, 1, 0]
Fort Collins, CO 2 [0, 1, 0]
Honalulu, HI 2 [0, 1, 0]
Kansas City, MO 2 [0, 1, 0]
Las Vegas, NV 2 [0, 1, 0]
Lawrence, Kansas 2 [0, 1, 0]
Los Angeles, CA 2 [0, 0, 2]
Manhattan 2 [0, 1, 0]
Minnesota 2 [0, 1, 0]
New York, NY 2 [0, 1, 0]
Palm Springs 2 [0, 1, 0]
Saint Paul 2 [0, 1, 0]
San Diego 2 [0, 0, 2]
Seatle 2 [0, 1, 0]
Toronto, ON 2 [0, 1, 0]
Vegas 2 [0, 1, 0]
huntsville, al 2 [0, 1, 0]
new orleans 2 [0, 1, 0]
Detroit, MI 3 [1, 0, 0]
East Coast 3 [1, 0, 0]
Europe 3 [1, 0, 0]
Houston, TX 3 [1, 0, 0]
Huntsville, AL 3 [1, 0, 0]
Kansas City 3 [1, 0, 0]
Madison, WI 3 [1, 0, 0]
Miami 3 [1, 0, 0]
Montreal 3 [1, 0, 0]
New Jersey 3 [1, 0, 0]
Nowhere specific 3 [1, 0, 0]
Phoenix 3 [1, 0, 0]
Raleigh/Durham, NC 3 [1, 0, 0]
Reno, NV 3 [1, 0, 0]
San Diego , CA 3 [1, 0, 0]
San Fransisco 3 [1, 0, 0]
San Jose 3 [1, 0, 0]
St. Louis, MO 3 [1, 0, 0]
Tucson, AZ 3 [1, 0, 0]
london 3 [1, 0, 0]
New Orleans 4 [0, 2, 0]
Colorado 5 [1, 1, 0]
Northeast 5 [1, 1, 0]
Canada 6 [2, 0, 0]
Las Vegas 6 [1, 0, 3]
Somewhere hot 6 [1, 1, 1]
Vancouver 6 [0, 3, 0]
Dallas, TX 7 [1, 2, 0]
Denver 7 [1, 2, 0]
Minneapolis 7 [2, 0, 1]
Toronto 7 [2, 0, 1]
California 9 [3, 0, 0]
Atlanta 10 [2, 2, 0]
Austin, TX 14 [1, 5, 1]
Boston 16 [2, 2, 6]
Washington, DC 16 [4, 2, 0]
New York 19 [3, 4, 2]
Seattle 27 [3, 7, 4]
Portland, OR 35 [5, 5, 10]
Chicago 37 [10, 2, 3]
San Francisco 45 [11, 5, 2]
2 comments:
using the system we used in the past of weights: (1.0, 0.8, 0.6) that would mean :
SanFran: 16.2
Portland: 15.0
Chicago: 13.4
Seattle: 11.0
The problem with SanFran? Cost. The San Fran bid for 2010 was almost 3x the cost of Atlanta. With the recession and Calf. economic problems, I doubt those numbers still hold though.
Looks like everyone wants to go to their own home town :)
Post a Comment