Friday, November 9, 2007

One of the (perhaps) unintended consequences of the internet is the democratization of knowledge. Vast amounts of information, especially academic information, are available at the click of a button. Google makes access to that information easy.

This may not be a good thing. Sharing information promotes groupthink. The collaborative filtering literature may be a case in point. There appear to be two underlying psychological models being used.
  1. If you rate something similarly to someone else, then you can use their ratings of movies you have not rated as an estimate of your own rating for those movies, plus some noise.
  2. Your preference for a movie consists of a set of (orthogonal?) preferences for different factors within the movie, each multiplied by the amount of that factor within the movie, plus some noise (if I've understood the matrix factorization stuff correctly).
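
As a rough sketch of how the two models above turn into a prediction (the function names, the toy ratings, and the similarity weights are all made up for illustration and are not taken from any actual Netflix Prize entry; the noise term is simply left out):

```python
def predict_from_neighbours(neighbour_ratings, similarities):
    """Model 1: similarity-weighted average of other users' ratings for one movie."""
    total_weight = sum(similarities)
    if total_weight == 0:
        raise ValueError("no similar users to borrow ratings from")
    weighted = sum(r * s for r, s in zip(neighbour_ratings, similarities))
    return weighted / total_weight


def predict_from_factors(user_preferences, movie_factors):
    """Model 2: sum over factors of (taste for the factor) * (amount of it in the movie)."""
    return sum(p * f for p, f in zip(user_preferences, movie_factors))


# Hypothetical data: two neighbours rated the movie 4 and 2,
# and the first neighbour is three times as similar to me as the second.
neighbour_estimate = predict_from_neighbours([4.0, 2.0], [0.75, 0.25])

# Hypothetical data: I like factor one, dislike factor two,
# and the movie contains a lot of the first and a little of the second.
factor_estimate = predict_from_factors([2.0, -1.0], [1.5, 0.5])
```

Written out like this, the psychology in each model is very thin: the first assumes my taste is a blend of other people's, the second that it is a straight sum of independent likes and dislikes.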

The mathematics (and there are vast reams of it) seems to revolve mainly around calculating similarity in the first approach, or working out how best to tackle the noise in the second.
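
To give a flavour of the similarity calculation, here is a minimal sketch using Pearson correlation over the movies two users have both rated. Pearson is just one common choice that I am assuming for the example; the users, the movies, and the ratings are invented.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over co-rated movies (each argument maps movie -> rating)."""
    shared = sorted(set(ratings_a) & set(ratings_b))
    if len(shared) < 2:
        return 0.0  # too little overlap to say anything
    a = [ratings_a[m] for m in shared]
    b = [ratings_b[m] for m in shared]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    if spread_a == 0 or spread_b == 0:
        return 0.0  # one user gave every shared movie the same rating
    return covariance / sqrt(spread_a * spread_b)


# Hypothetical users and ratings
alice = {"Alien": 5, "Brazil": 3, "Casablanca": 1}
bob = {"Alien": 4, "Brazil": 3, "Casablanca": 2, "Dune": 5}
carol = {"Alien": 1, "Brazil": 3, "Casablanca": 5}

alice_bob = pearson_similarity(alice, bob)      # same ordering of tastes
alice_carol = pearson_similarity(alice, carol)  # opposite ordering of tastes
```

Note that the number only says whether two people's ratings move together, and nothing about why they do.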

Staggeringly, and someone please correct me if I'm wrong, I can find absolutely no discussion whatsoever (not even a single paper) of the merits of the two psychological models outlined in 1 and 2. Yet, at best, they are very crude and, imho, wrong. One must be able to do better.

Even the organisers of the Netflix Prize competition seem to think that only the mathematical sciences are important for this problem. I came across this quote from the CEO of Netflix in the New York Times.

"Mr. Hastings said he thought it was important to make the ratings database widely available. “Unless you work at Microsoft research or Yahoo research or for Jim Bennett here at Netflix, you won’t have access to a large data set,” he said. “The beauty of the Netflix prize is you can be a mathematician in Romania or a statistician in Taiwan, and you could be the winner.”"

No mention of psychologists, or even economists, only mathematicians or statisticians. Has the computer science world got itself into one big groupthink on the problem?
