## Yes, We Could.

With less than two days left to go, this is my last post on statistical analysis and predictions for the Italian elections. In the previous posts I looked at the possible outcome of the race for the Senate, based on the available poll data. Today I will try to reflect on two questions:

1) Is it conceivable that, contrary to all published polls, PD would eventually end up with more votes than PdL, thus securing the majority in the lower chamber?

2) In such a case, what would be the result, in terms of seats, in the Senate?

Since the start of the poll blackout (15 days before the election date), results from putative reserved polls can be found on the net, in various flavors and disguises (horse races being a rather popular choice, but *not the only one*). Since I have no way to check how real and reliable these are, I will ignore them and rely on the available data alone, plus some elementary statistics. Only at the end will I add some more “political” considerations.

Starting with question one, the obvious point of departure is a comparison with the 2006 elections, in which the poll predictions disagreed strongly with the final result. The plot below shows the results of the pre-election polls and of the exit polls in 2006, compared with the 2008 polls. It also reports the final results of 2006 and a hypothetical break-even point for 2008 (PD equals PdL).

In 2006, all polls gave a substantial margin to the Center-left coalition (4.8% on average in the last month before the blackout), except for 3 polls from 2 institutes, all commissioned by Berlusconi. This margin was essentially the same even in the exit polls, but was dramatically contradicted by the final results.

At first glance, a narrow PD victory in 2008 would be an even bigger surprise than the “almost draw” of 2006. The absolute discrepancy between the final data and the last month's poll average (7.7%) would be larger than in 2006 (4.9%), and even more so in relative terms (since the PdL and PD percentages are lower than what Center-left and Center-right obtained in the last elections). However, for several reasons, a PD victory cannot be completely ruled out. First, the data look more “dynamic” in 2008. In addition, the spread of the data in 2006 is noticeably smaller than in 2008 (especially if the 3 “anomalous polls” are not considered), possibly due to the “disturbing” factor of the smaller parties now present in the race. If we consider only the last month before the poll blackout in 2006, the final Center-left and Center-right results were -3 and +3.8 standard deviations away from the respective averages. This compares to the +/- 3.8 standard deviations needed to reach the break-even point in 2008. From a purely statistical point of view the two events would be essentially equivalent (i.e., equally unlikely).
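The “standard deviations away from the average” comparison above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the poll values are made-up placeholders (not the 2006 or 2008 data), and the helper `sigma_distance` is a hypothetical name introduced here. The break-even point is taken as the midpoint of the two averages, i.e. assuming that the sum of the two lists stays constant.

```python
from statistics import mean, stdev

def sigma_distance(polls, outcome):
    """Distance of `outcome` from the poll average, in units of the
    sample standard deviation of the polls."""
    return (outcome - mean(polls)) / stdev(polls)

# Hypothetical poll shares in percent: placeholders, NOT real poll data.
polls_a = [44.0, 45.0, 43.5, 44.5, 43.0]  # list currently ahead
polls_b = [37.0, 36.0, 37.5, 36.5, 38.0]  # list currently behind

# Break-even with a constant sum: both lists land on the midpoint.
break_even = (mean(polls_a) + mean(polls_b)) / 2

z_a = sigma_distance(polls_a, break_even)  # negative: the leader must lose votes
z_b = sigma_distance(polls_b, break_even)  # positive: the trailer must gain votes
print(f"leader: {z_a:+.1f} sigma, trailer: {z_b:+.1f} sigma")
```

Note that a larger spread among the polls makes the same gap in percentage points correspond to fewer standard deviations, which is why a bigger absolute discrepancy in 2008 can be statistically comparable to the one of 2006.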

The cause of the 2006 discrepancy has been thoroughly discussed. A systematic bias in the polls looks more likely than a drift of electoral preferences in the last two weeks (as also confirmed by the exit poll results). The bias could then arise from non-representative sampling by the polling institutes, reticent answers from part of the interviewed population, or a skewed distribution of votes among the undecided who finally participated in the elections. Whether such a discrepancy will be repeated in 2008 (and with what magnitude and direction) depends on all of these still open issues. To add another variable, I do not know whether the polling institutes have put corrective strategies in place since 2006, nor which ones.

The only possible conclusion is therefore that, if a similar phenomenon happens again, there would be ample room for a PD victory in the lower chamber (*but also*, potentially, for a large defeat).

In order to try to answer question two (the Senate outcome if PD overtakes PdL) I have done a simple exercise. I have taken the March 2008 poll average and used my model (derived from the one by Sandro Brusco/noiseFromAmeriKa) to study how the PdL and PD seats in the Senate depend on the PdL-PD distance (keeping the PD+PdL sum constant). I have then repeated the simulation two more times, first adding 1% to, then subtracting 1% from, both SA and UDC.

In the following plot I show the results, together with the simulations done using the actual March polls (scattered dots). It is interesting to note that the two +/- 1% bands (thinner lines) describe well the variability of the polls.

At first glance, the conclusion is that a small advantage of PD over PdL would in general not be enough to obtain the majority in the Senate, or even just to obtain more seats than PdL. PD would need a lead of 5% or more to be likely to get the majority, which looks like a long shot indeed.

However, the simulation I did follows a strict *uniform national swing* hypothesis. A distribution of the votes among regions different from that of 2006 (as shown, by the way, in the few published regional polls) could dramatically alter the results, in a way that is difficult to predict. The number of seats is indeed more sensitive to the distribution of votes among the regions than to their overall number. I was surprised to find, for instance, that by shifting around fewer than 100,000 votes, it was easy to “give” the Senate majority to PD (158 seats) with only a 1% lead over PdL. Thus, even for the Senate, a small possibility of a PD majority does exist.
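To illustrate why the regional distribution matters so much, here is a minimal toy sketch in Python. It is not the model used for the plots above: the three regions, their seat counts and their vote shares are invented, only two lists are considered, and the seat rule (the regional winner simply takes 55% of the region's seats, rounded up) is a deliberately crude assumption of this sketch. The regions are also assumed to have equal electorates, so that regional offsets summing to zero leave the national totals unchanged.

```python
# Hypothetical regions: name -> (seats, PD share %, PdL share %).
# All numbers are invented for illustration only.
REGIONS = {
    "A": (10, 42.0, 40.0),
    "B": (10, 38.0, 44.0),
    "C": (10, 39.0, 41.0),
}

def region_seats(n_seats, pd, pdl):
    """Toy rule (assumption): the regional winner gets 55% of the seats,
    rounded up, and the loser gets the rest. Ties go to PdL arbitrarily."""
    bonus = (55 * n_seats + 99) // 100  # integer ceiling of 0.55 * n_seats
    return (bonus, n_seats - bonus) if pd > pdl else (n_seats - bonus, bonus)

def total_seats(regions, swing=0.0, offsets=None):
    """Move `swing` points from PdL to PD in every region (uniform national
    swing, PD+PdL sum constant), plus optional per-region offsets."""
    offsets = offsets or {}
    pd_total = pdl_total = 0
    for name, (n_seats, pd, pdl) in regions.items():
        shift = swing + offsets.get(name, 0.0)
        pd_s, pdl_s = region_seats(n_seats, pd + shift, pdl - shift)
        pd_total += pd_s
        pdl_total += pdl_s
    return pd_total, pdl_total

# Uniform swing: every region moves by the same amount.
for swing in (0.0, 0.5, 1.5):
    print("uniform swing", swing, "->", total_seats(REGIONS, swing=swing))

# Same national totals, but votes redistributed among the regions.
print("redistributed ->",
      total_seats(REGIONS, offsets={"A": -0.5, "B": -1.0, "C": 1.5}))
```

In this toy setup a redistribution that leaves the national vote untouched flips one region and with it the overall seat lead, while a uniform swing of half a point changes nothing. Adding a random offset per region instead of hand-picked ones would turn the same function into a simple sensitivity test.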

So far I have talked about statistical possibilities. One must add a “political” dimension in order to make an educated guess about the outcome. For instance, most commentators (and a few polls) maintain that most of the undecided who finally vote will favor PD rather than PdL (the opposite of 2006). Also, the 2008 polls show a Center-left to Center-right ratio more unbalanced towards the latter than ever before. Since in past elections the movements between the two blocs were minimal, this could imply a poll bias in favor of PdL.

For these reasons I would expect a reduced PdL-PD gap in the end. If I had to bet, I would place my money on a PdL victory with a 2-4% margin in the lower chamber, and essentially a draw in Senate seats. But nothing is excluded, and Veltroni does well to keep up the pressure.

**Yes we can** sounds too optimistic to me, since the bar is indeed set very high. But I wouldn’t object to a **Yes we could**.

April 11, 2008 at 2:36 pm

Excellent analysis, as usual.

April 13, 2008 at 7:02 pm

I second Caminadella’s verdict! Maybe to reduce the strength of the UNS assumption, we could add some jitter to the swing in each region – a kind of sensitivity testing? Unfortunately, it’s really hard to calculate decent priors for the magnitude of variance in swing across regions – largely because Italy keeps changing electoral systems so often!

Callegaro/Gasperoni response was misleading when it was published – rather embarrassingly, I’d miscalculated the variance. I’m now waiting on their own data, but the results seem to confirm their analysis – and, consequently, the hypothesis that there was little movement over the last fifteen days, but much pollster error.

April 14, 2008 at 11:43 am

Thanks to both.

Chris,

first, yes, I thought of trying something of the sort, but I gave up for exactly the reason you mention – no precedents. Both the electoral system and the party/alliance configuration change every time. This will probably have a large impact on the polls as well.

As for the second point, I can only quote what a Nobel laureate said recently (reported elsewhere in this blog):

“I have published more than a few papers that have turned out to have been wrong. So have most of my colleagues. That’s the name of the game!”

and

“Even our greatest heroes, Galileo, Newton and Einstein, have published speculations that turned out to be quite false.”

😉