Archive for August, 2006

WordPress upgrade

Thursday, August 17th, 2006

I just upgraded WordPress to 2.0.something. Unfortunately, this wiped out my customized theme. I’m going to have to fiddle with that sometime soon. Oh how I hoped it would all stay intact. It did on the test case.

UPDATE a few minutes later: Phew. That was not nearly as painful as I feared. I think it’s all changed now except for a few tiny details here and there perhaps. The tag links don’t work at all, but they didn’t work before either. I keep adding tags, because the tag cloud representation does seem to work and I think it is funky. The links don’t work there either, obviously, since they’re the same links.

Speed of speech and its implications

Thursday, August 17th, 2006

The NYTimes decided to report on the extent to which Hungarians are better than Americans at recalling store prices. Given that most blogging I do about Hungary seems to result in a discussion of the Hungarian language and given that the authors explain the findings based on language differences, I thought I’d take this opportunity to address the issue head on.

Let’s start with the findings:

Hungarians are far better than Americans at recalling long prices; on average, they can recall 19 to 24 syllables with decent accuracy, while Americans can recall only 13. The authors suggested that this was because Hungarians speak 41 percent faster, both out loud and when repeating sounds to themselves “subvocally.”

The NYTimes piece ends right there. That’s not fair: the author left out the most interesting part. How do we know how fast Hungarians speak in comparison to Americans?
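As a quick sanity check on the quoted numbers, here is a minimal sketch. It assumes (this is an assumption for illustration, not something the article states) that recall span scales linearly with speaking rate, as it would if people could rehearse a fixed number of seconds of speech. The function names are hypothetical.

```python
# Figures quoted in the NYTimes piece.
AMERICAN_SPAN = 13            # syllables Americans recall on average
HUNGARIAN_SPAN = (19, 24)     # range of syllables Hungarians recall
SPEED_RATIO = 1.41            # Hungarians speak 41 percent faster


def predicted_span(base_span: float, speed_ratio: float) -> float:
    """Span predicted if recall scales linearly with speech rate (assumption)."""
    return base_span * speed_ratio


def implied_ratio(observed_span: float, base_span: float) -> float:
    """Speed ratio that would be needed to explain an observed span."""
    return observed_span / base_span


prediction = predicted_span(AMERICAN_SPAN, SPEED_RATIO)
needed = [implied_ratio(s, AMERICAN_SPAN) for s in HUNGARIAN_SPAN]

print(f"Predicted Hungarian span: {prediction:.1f} syllables")
print(f"Speed ratio needed for 19 to 24 syllables: {needed[0]:.2f} to {needed[1]:.2f}")
```

Under that linear assumption, the 41 percent figure predicts a span of about 18.3 syllables, which is close to the low end of the reported range but well short of 24.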


Links for 2006-08-17

Thursday, August 17th, 2006

Data sources

Wednesday, August 16th, 2006

Behind the hustle and bustle of the book exhibit at the recent annual meetings of the American Sociological Association was an exhibit of various data sources. That area of the room is usually very quiet. As a break from everything else, I decided to take a little tour. The posters and flyers are actually quite informative. It seems to me that this is an underappreciated part of the meetings and could be especially helpful for graduate students. Of course, it should be valuable to many others as well.

In addition to data sources, there are pointers to various tools and reports that may be of interest. Much of the material on these Web sites is presented in a way that should make it accessible and interesting to many non-specialists, too. The teaching potential of some of these sources is considerable.

  • Wisconsin Longitudinal Study – “[..] a long-term study of a random sample of 10,317 men and women who graduated from Wisconsin high schools in 1957.” In the interest of full disclosure, I have a pilot grant from this project and have been working with the data set for the past few months. It’s an amazing resource.
  • Social Explorer – “Social Explorer is dedicated to providing demographic information in an easily understood format: data maps.” – I may have linked to this before. This resource in particular may be especially helpful for teaching purposes.
  • WebCASPAR – “[..] provides easy access to a large body of statistical data resources for science and engineering (S&E) at U.S. academic institutions. WebCASPAR emphasizes S&E, but its data resources also provide information on non-S&E fields and higher education in general.”
  • National Science Board Science and Engineering Indicators 2006 – “[..] a volume of record comprising the major high quality quantitative data on the U.S. and international science and engineering enterprise.”
  • Archival Research Catalog – “The Archival Research Catalog (ARC) is the online catalog of NARA’s [NARA = National Archives and Records Administration] nationwide holdings in the Washington, DC area, Regional Archives and Presidential Libraries.” The ARC Guide for Educators and Students is a good place to start.
  • The American Time Use Survey – “measures the amount of time people spend doing various activities, such as paid work, childcare, volunteering, commuting, and socializing.”

Soda alternative

Wednesday, August 16th, 2006

I’m not about to cut chocolate out of my diet, but it would be nice to reduce calorie intake somehow. A while back I decided to give up drinking sodas. I haven’t succeeded 100%, but I have gotten pretty good over time. I used to consume a can of Coke several times a week with an occasional Sprite thrown in there as well. Now I only have such a drink once or twice a month.

When I first mentioned this to a friend, he said this should add up to considerable weight loss. I found that interesting and intriguing since it’s not a particularly painful way to keep extra pounds off. This week’s Time magazine Numbers feature has some concrete information about this:

    15 Number of pounds that a person would gain annually by drinking an extra can of sugar-laden soda each day

I certainly have not lost 15 pounds by not drinking soda, but I wasn’t drinking it daily and I haven’t cut it out 100%. Still, it’s a helpful figure to contemplate.
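A back-of-the-envelope check of the Time figure, as a minimal sketch. The calorie numbers are assumptions for illustration, not taken from Time: roughly 150 kcal per can of regular soda, and the common rule of thumb of about 3,500 kcal per pound of body weight. The three-cans-a-week case is a hypothetical stand-in for "several times a week."

```python
# Assumed values (not from the Time feature): adjust to taste.
KCAL_PER_CAN = 150       # approximate calories in a can of sugared soda
KCAL_PER_POUND = 3500    # rule-of-thumb calories per pound of body weight
WEEKS_PER_YEAR = 52


def pounds_per_year(cans_per_week: float) -> float:
    """Pounds gained per year from the extra soda calories alone."""
    yearly_kcal = cans_per_week * WEEKS_PER_YEAR * KCAL_PER_CAN
    return yearly_kcal / KCAL_PER_POUND


daily = pounds_per_year(7)        # one extra can every day
few_a_week = pounds_per_year(3)   # hypothetical: three cans a week

print(f"One can a day:     {daily:.1f} lb/year")
print(f"Three cans a week: {few_a_week:.1f} lb/year")
```

With these assumptions a daily can works out to about 15.6 pounds a year, in line with the number Time quotes, while a few cans a week comes to less than half of that.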

I have gotten better about drinking water, but I have also discovered a nice alternative. (And I’m hopeful no one on this blog will point out to me the downsides of said alternative, but go ahead, enlighten me.) To add a bit of taste to my beverage, I add a tiny bit of lemon juice to the water. No sugar or anything else, just a bit of lemon juice. It works well, I recommend it.

Links for 2006-08-16

Wednesday, August 16th, 2006

Links for 2006-08-15

Tuesday, August 15th, 2006

Links for 2006-08-13

Sunday, August 13th, 2006

Links for 2006-08-11

Friday, August 11th, 2006

Orange-alert air travel

Thursday, August 10th, 2006

[Image: Airport security]

Perfect day to travel internationally.. not. It was interesting to watch the myriad of items accumulating in the bins scattered alongside the security line. There seemed to be some interesting perfumes in there (well, at least the containers looked interesting); otherwise, just a bunch of half-empty water bottles, toothpaste, shaving cream and lotion. I wondered whether they would let you take an empty water bottle in, but I decided not to test the system. The wait was longer than usual, but still not impossible (this in the Premier check-in area though). I was also curious to see whether the hotel would be ready for the numerous people showing up without toothpaste. Having forgotten French for toothpaste, I mumbled something about brushing teeth, but before I could finish the sentence, the concierge handed me a small tube. Good for them. (Yes, of course I could’ve asked in English, but what’s the fun in that?)

[Image: Montreal welcomes the ASA]

As for getting through passport control, I continue to be unimpressed by Canadian immigration officials. After greeting the guy with a friendly Bonsoir I was asked why I was visiting. I mentioned the sociology meetings, which should have been obvious given that even the official greeting signs at the airport had the ASA written on them and at least half my flight was sociologists. (When, a minute earlier, I had guessed that the couple standing next to me were here for the ASA, they asked if it was that obvious. Isn’t it?) Anyway, the passport control guy went on the offensive, pushing me: “What about the sociology meetings?” What about them? I’m giving some talks. I wonder if he was that combative with the Americans. (Don’t bother getting on my case about how this doesn’t sound combative. It was, perhaps you had to be there.)

In any case, the city looks neat from my 23rd floor room. I look forward to exploring it this weekend.

Links for 2006-08-10

Thursday, August 10th, 2006

Links for 2006-08-09

Wednesday, August 9th, 2006

Links for 2006-08-08

Tuesday, August 8th, 2006

The AOL data mess

Monday, August 7th, 2006

Not surprisingly this is the kind of topic that spreads like wildfire across blogland.

[Image: AOL search data snippet]

AOL Research released (link to Google cache page) the search queries of hundreds of thousands of its users over a three-month period. While user IDs are not included in the data set, all the search terms have been left untouched. Needless to say, lots of searches could include all sorts of private information that could identify a user.

The problems in the realm of privacy are obvious and have been discussed by many others so I won’t bother with that part. (See the blog posts linked above.) By not focusing on that aspect I do not mean to diminish its importance. I think it’s very grave. But many others are talking about it so I’ll focus on another aspect of this fiasco.

As someone who has research interests in this area and has been trying to get search companies to release some data for purely academic purposes, I find an incident like this extremely unfortunate. Not that search companies have been particularly cooperative so far (not surprisingly, judging by this case), but chances for future cooperation in this realm have just taken a nosedive.

To some extent I understand. No company wants to end up with this kind of a mess on their hands. And it would take way too much work on their part to remove all identifying information from a data set of this sort. I still wonder if there are possible work-arounds though, such as allowing access on the premises or some such solution. But again, that’s a lot of trouble, and why would they want to bother? Researchers like me would like to think we can bring something new to the table, but that may not be worth the risk.

Note, however, that dealing with sensitive data is nothing new in academic research. People are given access to very detailed Census data, for example, and confidentiality is preserved. From what I can tell the problem here did not stem from researchers; it was someone at AOL who was careless with the information. But the outcome will likely be less access to data for all sorts of researchers.

Another question of interest: Now that these data have been made public what are the chances for approval from a university’s institutional review board for work on this data set? (Alex raises related questions as well.) Would an approval be granted? These users did not consent to their data being used for such purposes. But the data have been made public and theoretically do not contain any identifying information. Even if they do, the researcher could promise that results would only be reported in the aggregate leaving out any potentially identifying information. Hmm…

For sure, this will be a great example in class when I teach about the privacy implications of online behavior.

Not surprisingly, people are already crunching the data set; here are some tidbits from it.

A propos the little snippet I grabbed from the data (see image above), see this paper of mine for an exploration of spelling mistakes made while using search engines and browsing the Web. About a third of that sample was AOL users.

The image above is from data in the xxx-01.txt file.

Scrollable ads

Monday, August 7th, 2006

GMail does something very smart with the Sponsored Links it displays in the Webclips area just above the message view area: it lets the user scroll back and forth among the ads.

Maybe I’m an odd one for actually looking at ads on occasion, but sometimes they do tell you about helpful or interesting information and services. So I like to click on them sometimes. However, more often than not, I just glance at them out of the corner of my eye as I am about to move to another page. What then happens is that the ad changes. In GMail, I can just click on the back button in Webclips and get the ad (or whatever RSS feed I may have missed).

[Image: GMail Webclip]

On most sites this is not possible (e.g. Yahoo! Mail). If you click the back button of your browser, chances are that some other ad is dynamically generated on the page you were just viewing by the time you return to it. It’s a bummer as some of those ads could be of interest to users a split second later.

Lowering the least bloggable unit

Monday, August 7th, 2006

I think I’ve been putting too high a threshold on the least bloggable unit* around here recently (although some may disagree). That is, I have all sorts of thoughts on IT and other matters that I could blog about, but I don’t bother, because I don’t have that much to say. There are also time constraints. More serious thoughts and posts require more time and needless to say time is limited around here.

So this is just to say that I may start posting more often, but in smaller chunks.

* Interestingly, it turns out that the phrase “least bloggable unit” has been used once in blog world so far: on Crooked Timber of all places in a comment by Sean Carroll.

Links for 2006-08-07

Monday, August 7th, 2006

Links for 2006-08-05

Saturday, August 5th, 2006

Links for 2006-08-04

Friday, August 4th, 2006

Without pain on a plane

Thursday, August 3rd, 2006

I am back from my trip to Argentina mentioned earlier and am happy to say that the long flight didn’t mess things up too much. I suspect the lack of time-zone change from Chicago to Buenos Aires helped quite a bit, but I would like to think my master preparedness was useful, too.

I did end up taking an hour-long nap after I got to Buenos Aires, but then was well-equipped to spend a good chunk of Saturday exploring the city. And what a fabulous city it is! It was my first time in Argentina, but after this visit I am convinced it was not the last. (The first batch of photos is available on Flickr now. More coming soon.)

As a side note on how some people try to make a long-distance relationship work, consider the story of the person sitting next to me on the flight there. He works in DC, but has a wife and young child in Argentina. Twice a month he gets on a plane Friday evening for the ten-hour flight to Buenos Aires to spend less than 48 hours with his family, returning Sunday night so he can be back at work on Monday morning. Ouch.

Here is a list of ways to minimize fatigue generated by long flights, many drawn from responses to this post. I ran out to buy noise-canceling headphones after so many people recommended them. Great idea, I am convinced they made a huge difference!

  • noise-canceling headphones (and/or earplugs)
  • water
  • eye mask
  • nasal spray (to counter dry air)
  • a bit of reading/game
  • easily accessible pen (so you can fill out immigration/customs paperwork whenever you want)
  • some type of sleeping pill (either over-the-counter or prescription)
  • at most a small item underneath the seat in front of you
  • an extra sweater/coat and the blanket they give you
  • resisting the urge to eat everything you are served
  • small snacks (both sweet and not) so you can eat when you want
  • occasional stretching
  • in case of annoyances, a bit of meditation to block out the environment
  • resisting the urge to watch several movies
  • aisle seat if you want freedom to move (but only if you don’t mind the chance of being bumped by the flight attendants and passersby), window seat if you want to use the side of the plane as a headrest (but only if you don’t mind the cold and having less access to movement)
  • adjusting headrest to avoid leaning/falling on neighbor
  • getting legs up (perhaps on small piece of luggage) for improved circulation
  • a good night’s sleep the night before

On the way back I got upgraded to business class so other than a bit of fatigue, the adjustment took even less out of me.