Thursday, August 19, 2004

TruthIsAll: The significance of national and state polls and the probability of a Kerry win

Regarding your comments on the significance of polling numbers and the probability of a Kerry win, here is a post by TruthIsAll on the subject:

Election Model Methodology: A TIA Primer
http://www.geocities.com/electionmodel

There are three basic methods used for analyzing elections.

The first is to analyze the effects of external factors: demographic trends, the economy, jobs, inflation, etc. A number of forecasters claim various degrees of success based on historical backtracking. This is essentially an econometric approach: the forecasting model uses multiple regression analysis of various factors to come up with a multivariable mathematical formula to predict the vote. I have seen a few of these from media-blessed "experts", and they are, in my opinion, way out of line.

A second approach is to analyze the latest national polling trends, keeping in mind the potential movement of undecided or third-party voters. There are about 15 or so major pollsters. This is a popular vote forecast. Note that predicting a majority vote does not mean that the winner will gain 270 electoral votes, but this is true only in extremely tight elections. In fact, in a 51-49 split, there is virtually zero chance that the popular vote winner would lose the Electoral College. It didn't happen in 2000, because Gore won Florida (no need to dwell on this) and also received 600,000 more votes than Bush nationwide.

A third method uses state polls to predict electoral votes, taking into account the same caveats as the national polls. The focus, of course, is on the battleground states, of which there are perhaps 20. But I calculate probabilities for all states, including Kansas (what's the matter with Kansas?) and even Massachusetts, home of wild-eyed liberals like John F. Kennedy (sarcasm off).

In my model, I used methods two and three.
I am not an economist, and I believe that polls are pretty good indicators, provided they are fresh and unbiased. So why not just use the information that the real voters are telling us?

The advantage to national polling is that we come up with a single number, the spread between the candidates. Now if this number exceeds the polling MoE, which is typically +/-3% for nationwide polls of 1000 or more individuals, statistical theory tells us that the leader has at least a 95% chance of winning the election. But that is for just one poll sample of 1000 voters.

If we have 3 polls (3000 sampled) with an average 52-48% Kerry lead, the MoE is only 1.80%. This means that in 95 out of 100 elections, Kerry will receive between 50.2% and 53.8% of the vote. In other words, he has an approximate 97%+ probability of winning the popular vote.

For fifteen polls, the MoE is an even tighter 0.80%. This means that for the same 52-48% poll average, Kerry will receive between 51.2% and 52.8% of the vote 95% of the time (19 out of 20). The probability of Kerry EXCEEDING 50% is 99.99+%.

That 95% confidence interval around the mean comes right out of Statistics 101. The MoE is approximately 1.96 times the standard deviation, a measure of the volatility of sample deviations around the mean. The standard deviation is a component of the normal distribution. It is used to determine confidence limits around the mean, as well as to calculate the probability of winning a majority of the (state or national) vote.

Kerry currently has a 6-point average lead in my 11-poll national group. He has a 4.2% edge in the 15-poll group, which includes the eleven as well as CNN, AP, FOX and NBC. His probability of winning based on both poll-group averages is 99.99+%.

Ultimately, the winner must get at least 270 electoral votes. So how do we calculate probabilities using state polls? The same way as before: we calculate the probability of winning a state the same way as for the national vote.
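The margin-of-error arithmetic above can be reproduced in a few lines. This is only a minimal sketch, not the author's spreadsheet: it assumes the polls can be pooled as if they were one simple random sample, uses the worst-case p = 0.5 for the MoE, and relies on the normal approximation. The function names `margin_of_error` and `win_probability` are made up for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error (as a fraction) for a simple random sample of size n.

    Assumption: worst-case p = 0.5 and z = 1.96, the usual textbook values.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

def win_probability(share, n, z=1.96):
    """Probability that the true share exceeds 50%, by the normal approximation.

    share is the candidate's polled two-party share as a fraction (0.52 = 52%).
    """
    sd = margin_of_error(n, z=z) / z           # standard deviation of the poll mean
    zscore = (share - 0.5) / sd                # distance from 50% in sd units
    return 0.5 * (1.0 + math.erf(zscore / math.sqrt(2.0)))  # normal CDF

# Pooled sample sizes mentioned in the text: 1, 3 and 15 polls of 1000 each.
for n in (1000, 3000, 15000):
    moe = 100 * margin_of_error(n)
    prob = 100 * win_probability(0.52, n)
    print(f"n={n:5d}  MoE=+/-{moe:.2f}%  P(win at 52-48)={prob:.2f}%")
```

The same function applies to a single state poll by passing that poll's smaller sample size, which is why the state-level probabilities are less extreme for the same split.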
The difference with state polls is that they typically sample only 500-600 individuals, so the MoE is a wider 4.0% for each state. Kerry has a win probability for each state based on the latest poll. In a 50-50 poll split, both candidates have an equal 50% probability of a win. But if the split is 60-40, the probability of the leader winning is 99.999%. It's when the race is close (say 51-49) that things get interesting. In this case, the leader has a 69% chance of winning the state. For a 52-48 split, it is about 83%. For 53-47, it is about 93%.

So, to determine the probability of winning at least 270 votes, we use Monte Carlo simulation. For each state, the model generates a random number between 0 and 1, which is compared to the probability of Kerry winning the state. For example, assume the random number is .55 and Kerry has a 60% probability of winning a state. Then he wins the state, since .55 is less than .60. Otherwise (say the random number is .8) the state goes to Bush. The model generates random numbers for each state and assigns the state's electoral votes in this fashion.

The full simulation runs 1000 election trials. If Kerry wins 980 times (he gets at least 270 votes), then we can say that he has a 98% chance of winning. Kerry's average electoral vote for the 1000 elections is also calculated. Right now, Kerry is winning about 98% of the trials with an average of 330 electoral votes. One advantage of this method is that we don't get poll "whiplash" as close states change hands daily based on slightly changing polls.

So there it is. Hopefully, this will clarify the methodology. By using both national and state models, there is a warm feeling of mathematical confirmation. In any case, the main lesson is this: analyze as many polls as possible and recognize that doing this reduces the overall margin of error. We have more confidence in the results of a group poll mean than in a single poll. The media whores like to quote that damn CNN poll. They ignore the rest.
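The Monte Carlo procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the actual model: the `TOY_STATES` list is invented (two lumped "safe" blocs plus ten made-up battlegrounds that happen to sum to 538 electoral votes), the function name `simulate` is hypothetical, and each state is drawn independently, as the description implies.

```python
import random

# HYPOTHETICAL input: (electoral votes, probability Kerry wins) pairs.
# These are not real poll figures; they only make the sketch runnable.
TOY_STATES = [
    (200, 1.0),   # lumped bloc treated as certain for Kerry
    (180, 0.0),   # lumped bloc treated as certain for Bush
    (27, 0.60), (21, 0.69), (20, 0.83), (20, 0.45), (17, 0.55),
    (15, 0.40), (11, 0.75), (10, 0.50), (10, 0.65), (7, 0.35),
]

def simulate(states, trials=1000, needed=270, seed=None):
    """Run `trials` simulated elections.

    Returns (fraction of trials with at least `needed` electoral votes,
             average electoral votes per trial).
    """
    rng = random.Random(seed)
    wins = 0
    total_ev = 0
    for _ in range(trials):
        # A state goes to Kerry when its random draw falls below his
        # win probability for that state; otherwise it goes to Bush.
        ev = sum(votes for votes, p in states if rng.random() < p)
        if ev >= needed:
            wins += 1
        total_ev += ev
    return wins / trials, total_ev / trials

win_rate, avg_ev = simulate(TOY_STATES, trials=1000, seed=2004)
print(f"Kerry wins {100 * win_rate:.1f}% of trials, average EV {avg_ev:.1f}")
```

Passing a fixed `seed` makes a run repeatable; leaving it as `None` gives a fresh set of trials each time, so the win percentage will wobble slightly from run to run with only 1000 trials.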
A final word, perhaps one I did not emphasize. The analysis assumes that the election is held the day after the model is run. I cannot predict Armageddon, martial law, or canceled elections. But most of all, I cannot predict UNFAIR, STOLEN ELECTIONS IN CYBERSPACE. I assume a fair election. Today, that is a very risky assumption.

