Semester Recap
My usual reaction to the end of the semester is to recoil in horror, dumping notes in desk drawers and Documents directories, to be picked up weeks later if ever. This semester I’m taking a different approach and reflecting a bit before I flee. A lot happened this semester, so I figured I’d write about it here so at least I can search for it later.
I began the semester at the Pallini Beach Hotel in Chalkidiki, Greece, where I was attending the Summer School on Multimedia Semantics. The trips to and from beautiful Chalkidiki were exhausting, but the summer school was worth the effort. It was basically a series of tutorials on multimedia computing (focused on signal processing, content analysis, and machine learning applied to audio, video and images) and semantic computing (focused on knowledge representation, ontology learning, metadata exchange, and web services) with an effort made to show complementarity between the two. There was little mention of semiotic approaches (except for Lynda Hardman’s talk), but I’m familiar with those, so I felt that the summer school succeeded in rounding out my knowledge of what (mostly European) computer scientists are calling multimedia semantics. I’m not sure I buy some of the research programs presented–though I envy their massive EU budgets–but at least I find the research interesting.
I arrived back in Berkeley September 10th, nearly 2 weeks after classes had begun. I ended up taking Quantitative Methods, Practical Machine Learning, Quality of Information, Online Journalism, and the I-School PhD colloquium. These were uniformly good choices.
Online Journalism was an experiment conducted by citizens’ media evangelist Dan Gillmor and Yahoo! senior editorial director Bill Gannon. In the first few weeks of the semester, we built VoteGuide, a site which attempted to aggregate all possible information about a single Congressional race. Most of it was built in a week, as we rushed through designing and coding to get something up in time to actually allow some use before the election. We launched in mid-October, a few weeks before Election Day. We didn’t really manage to aggregate everything, though we had a lot. And we didn’t succeed in our goal of integrating local citizen-contributed and professional journalist-produced content, mainly because there wasn’t enough time to build community, but also because we hadn’t thought carefully enough about how the site would fit into existing practices of any of the relevant groups: voters, campaign workers, etc. But it was intended as a learning experience, so I still consider it a success. We’re currently thinking over how it could be streamlined and improved for next time.
Shortly after VoteGuide launched, Patrick and I headed to Santa Barbara to attend a workshop on Human-Centered Multimedia at ACM Multimedia 2006, where we presented a paper on our analysis of the International Remix project we did last spring. The conference was fun and Santa Barbara was beautiful, and it was good to see some friends in distant lands. The paper and poster were well received, and I felt encouraged by the positive feedback. But both Patrick and I have been a bit frustrated by a lack of time to do further experiments in this direction. Partly this is because we have been occupied with building things at YRB, but we’re also both spread very thin, which has its advantages and disadvantages. Being unable to make progress on a line of research is one of the latter.
Immediately upon returning from Santa Barbara, I flew north to Vancouver for the Annual Meeting of the Society for Social Studies of Science, AKA 4S. By this time I had a nasty cold, and it was cold and rainy in Vancouver. I was there with Megan, Yuri, and Arthur to present on a panel about decay. My talk was entitled Composting the Archives: The Case for Digital Decay. Only three people showed up (not counting us), but it still felt worthwhile thanks to the great questions and discussion we got from those three, especially Martin Hand. It was also great seeing Arthur again, though we had little time to socialize. I flew back from Vancouver before 4S ended, so I could attend the second day of the Unblinking Symposium in Berkeley.
I was really disappointed that I had to miss the first day of Unblinking (especially the dinner), because the participants and their talks were fascinating. The discussion was one of the best I’ve participated in, covering a wide range of topics yet focused and coherent. And somehow, we avoided the interdisciplinary communication problems that often mar such discussions. Hopefully something published will come out of it; I’d love to revisit some of the ideas presented with some more time to think about them. My presentation, as I mentioned earlier, was on Recognition Markets and Visual Privacy. Again, I got positive if brief feedback. I’d like to do another round of revisions on the paper, and hopefully I can get to that soon…
At this point I was pretty exhausted, but a bit of rest and a Democratic victory gave me the energy to finish out the semester. For Paul Duguid and Geoff Nunberg’s Quality of Information class I wrote a paper on Citizendium, Larry Sanger’s fork of Wikipedia. At first I was skeptical of Citizendium, and my paper, which looks at how members of the Citizendium mailing lists conceptualize expertise and its role in knowledge production, echoes that skepticism. But having recently seen the first article to pass Citizendium’s peer review process, I am cautiously excited about the project. It really is a great article: the organization and quality of writing are noticeably superior to those found even in the featured articles at Wikipedia (which I have spent a lot of time looking at recently; see below). Critics will argue that one article does not a compendium make, and they would have a point. Still, I am interested in seeing how Citizendium progresses and maybe getting involved if I have some time, perhaps trying to do some work on areas related to my PhD qualifying exam topics.
For Practical Machine Learning I worked with Jon Lesser and John (Jer-Yee) Chuang on a classifier for Wikipedia articles. We used the Wikipedia community’s judgments on bad, good, and featured articles (which we pulled down via the MediaWiki API) to train a number of different kinds of classification algorithms (using Weka). We had the best results with a random forest algorithm, which had excellent performance in cross-validation, correctly classifying all the good and featured articles and misclassifying only a few of the bad articles. The most important features for classifying an article turned out to be the number of links to and from the article on Wikipedia, the length of the article, the number of revisions, and the number of images. This was admittedly a rush job, but our preliminary results were encouraging, so I’d like to do some more work on this and see if we can build something a bit more robust.
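For the curious, here is a minimal Python sketch of the general shape of that pipeline: turning per-article counts into a feature vector and splitting the articles into cross-validation folds. It is not the project's code (we used Weka), and it stops short of training a random forest. The field names, the helper functions, and the toy records are all assumptions made up for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical field names; the project's actual data layout is not described here.
FEATURES = ["links_in", "links_out", "length", "revisions", "images"]


def extract_features(article: Dict[str, int]) -> List[float]:
    """Turn one article record into a fixed-order numeric feature vector."""
    return [float(article.get(name, 0)) for name in FEATURES]


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation.

    Every item lands in exactly one test fold; fold sizes differ by at most one.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Made-up toy records for illustration only, not data from the project.
toy_articles = [
    {"links_in": 3, "links_out": 5, "length": 800, "revisions": 12, "images": 0},
    {"links_in": 250, "links_out": 140, "length": 42000, "revisions": 900, "images": 9},
    {"links_in": 40, "links_out": 60, "length": 9000, "revisions": 150, "images": 2},
    {"links_in": 1, "links_out": 2, "length": 300, "revisions": 4, "images": 0},
]

X = [extract_features(a) for a in toy_articles]
folds = list(k_fold_indices(len(X), k=2))
```

In a real run you would shuffle (and probably stratify by class) before splitting, then train a classifier on each training fold and score it on the held-out fold.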
Along the way, I was lucky enough to have the opportunity to review Fred Turner’s From Counterculture to Cyberculture for Business History. I really enjoyed the book, which I picked up at the beginning of the semester and read while avoiding other work. So it was fortunate that they were looking for someone to review it. It is scheduled to appear in print sometime next fall, so I have something to look forward to.
All that stuff wrapped up December 12, and since then I’ve been busy at YRB, cooking up some stuff for an internal project, which hopefully will have a public face sometime in 2007. More on that later. In the meantime, this blog may be pretty quiet for a while, as Yuki and I will be spending the holidays traveling to and from Atlanta, Utsunomiya, Aomori, Matsushima, Tokyo and Honolulu…

