I'll post some reflections on the Netflix prize at a later date, but as someone who has been with the competition since the beginning I thought it might be useful to explain why the second place team on the leaderboard Bellkor's Pragmatic Chaos are almost definitely the winners.
The reason is that there are two datasets against which every competitor is judged. The first is the Quiz dataset - the results of which are reported back to the competitors and appear on the leaderboard and a second dataset which is called the Test dataset which is actually used to determine the winner. The purpose of this is to stop what is called "overfitting", i.e. using the results you achieve on the Quiz dataset every time you make a submission to figure out the actual values. Now with 1.5 million datapoints its impossible to figure out each value, but I'm sure both teams used some of the information from the results on the quiz dataset to work out the optimal combination of numbers to contribute - Given that the teams are separated by less than a one point difference in the fourth decimal place only a very small amount of overfitting could cause the positions to switch and in this case it looks like Bellkor's Pragmatic Chaos overfitted slightly less than "The Ensemble" and hence according to the posts on the Netflix forum are the ones to be validated.
There is a final stage that needs to be gone through and that is validation where the top team have to demonstrate how they achieved the results and to publish how they did it - but given that Bellkor have been through this twice before on the progress prize it should be a formality.