Politics, elections and piffle plinking

Pollster House Effects and the ALP

If you were to see a chart like this, what would pop into your mind? (Now that you’ve read the title of the post, you probably don’t need to buy a vowel here)

That is the ALP primary vote by pollster for every poll taken this year – and while the polls certainly move together, there’s a consistency in the pollster spread that is, statistically, something other than random behaviour.

If we clean that up a bit, identify the pollsters and run a local regression through the data to get a line of best fit – it all becomes a little clearer.
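For anyone who wants to try the local regression step at home, here is a minimal sketch in plain Python. It is not the code behind the charts on this page, just one common way of doing it: a tricube-weighted straight line fitted around each poll date. The function names, the `frac` setting and the demo numbers are all assumptions made up for illustration, not real poll results.

```python
def tricube(u):
    """Tricube kernel: 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess(days, votes, frac=0.5):
    """Fit a local linear regression at every poll date.

    days  -- poll dates as numbers (e.g. days since 1 January)
    votes -- the vote share reported by each poll
    frac  -- share of all polls used in each local fit (assumed default)
    """
    n = len(days)
    k = min(n, max(2, round(frac * n)))  # polls used in each local fit
    fitted = []
    for d0 in days:
        # Bandwidth: distance to the k-th nearest poll (1.0 if that is zero).
        h = sorted(abs(d - d0) for d in days)[k - 1] or 1.0
        sw = swx = swy = swxx = swxy = 0.0
        for d, v in zip(days, votes):
            w = tricube((d - d0) / h)
            x = d - d0  # centre on d0 so the intercept is the fitted value
            sw += w
            swx += w * x
            swy += w * v
            swxx += w * x * x
            swxy += w * x * v
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # too few distinct dates: weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            fitted.append((swy - slope * swx) / sw)
    return fitted


# Made-up numbers purely for illustration (NOT real poll results).
demo_days = [0, 7, 14, 21, 28, 35, 42, 49]
demo_votes = [46.0, 44.5, 45.0, 43.0, 44.0, 42.5, 43.5, 42.0]
demo_trend = loess(demo_days, demo_votes, frac=0.6)
```

A larger `frac` gives a flatter, slower-moving line and a smaller one chases individual polls more closely, so the setting is a judgement call rather than something the data hands you.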

We can do the same for the ALP two party preferred vote as well.

Here the phone polls have solid black markers and the non-phone polls (face to face for Morgan and the online panel for Essential Report) are hollow.

Nielsen and Newspoll generally track below the regression line, the Morgan phone poll doesn’t seem to have a pattern, while Essential Report and Morgan Face to Face generally track above it.

If we create a local regression line of best fit for each pollster and compare them, the house effects stand out clearly.

Note that Nielsen doesn’t yet have enough data points to run any meaningful smoothing algorithm so we’re just using the raw Nielsen numbers here. Also, the Morgan Phone poll doesn’t poll enough anymore to make the data useable here.
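One rough way to put a number on those gaps, sketched here under stated assumptions: run a single local regression through every poll from every pollster, then average how far each pollster's results sit from that pooled line. This is a simpler stand-in for comparing separate per-pollster lines, not the method used for the charts in this post. The function names, the `(day, pollster, vote)` layout and the `frac` default are hypothetical, and a pollster with only a handful of polls will get a noisy average.

```python
from collections import defaultdict


def pooled_trend(days, votes, frac=0.5):
    """Tricube-weighted local linear fit evaluated at every poll date."""
    n = len(days)
    k = min(n, max(2, round(frac * n)))
    fitted = []
    for d0 in days:
        h = sorted(abs(d - d0) for d in days)[k - 1] or 1.0
        sw = swx = swy = swxx = swxy = 0.0
        for d, v in zip(days, votes):
            u = abs(d - d0) / h
            w = (1 - u ** 3) ** 3 if u < 1 else 0.0
            x = d - d0
            sw, swx, swy = sw + w, swx + w * x, swy + w * v
            swxx, swxy = swxx + w * x * x, swxy + w * x * v
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # fall back to a weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            fitted.append((swy - slope * swx) / sw)
    return fitted


def house_effects(polls, frac=0.5):
    """Average gap between each pollster and the pooled trend.

    polls -- list of (day, pollster, vote) tuples (hypothetical layout)
    Returns {pollster: mean of (vote - pooled trend at that poll's date)}.
    """
    days = [day for day, _, _ in polls]
    votes = [vote for _, _, vote in polls]
    trend = pooled_trend(days, votes, frac)
    gaps = defaultdict(list)
    for (_, pollster, vote), fitted in zip(polls, trend):
        gaps[pollster].append(vote - fitted)
    return {name: sum(g) / len(g) for name, g in gaps.items()}
```

A positive number means the pollster tends to report a higher vote than the pooled line and a negative number means lower. It says nothing about which of them is closer to the truth.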

The problem with so-called House Effects in Australian polling is that they are usually mid-term phenomena: large variation in polling results occurs only between elections. In the days leading up to election day, the polls generally start to cluster around a given value, with the actual election result falling within the margin of error of every pollster’s final poll of the campaign. Every year or two one of the pollsters gets it wrong – but none of them habitually gets it wrong, which is a rather important thing to remember.

Our problem becomes one of determining which pollster or pollsters are correct and which ones are not.

Unfortunately we don’t know, there is no way to find out, and anyone who tries to tell you that Pollster A is more accurate than Pollster B between elections is simply making shit up – usually to suit their own political preferences.

Assuming that our pollsters aren’t making grave errors in their sampling frame construction or their demographic weightings (which we never know about anyway because Australian pollsters, unlike their US brethren, are a bunch of secret squirrels when it comes to discussing their respective methodologies), the only evidence we have to compare pollsters nationally comes once every three years.

And on that comparison they’re all pretty good.

So instead of looking at House Effects as bias – bias being something that can never be substantiated with the available evidence – all we can do is acknowledge that we know what we don’t know and go from there. So after that Rumsfeldian Eureka moment we’re just left with the data and what it can tell us.

If you look closely at the charts above you’ll notice that polling trends are usually caused by at least three out of four pollsters moving in the same direction at the same time. So even though the levels of the pollsters might be different, even though the base primary vote and TPP vote numbers might be different, the polls trend together – and it’s that trend that is ultimately important.

That enables us to tell when changes in direction occur; it lets us tell when trends have ceased or when polling numbers are flatlining. It can tell us something fairly significant about changes in levels of political support.
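The "three out of four pollsters moving together" idea can be sketched as a simple head count. The function works on each pollster's change since its own previous poll, so differences in levels between pollsters drop out and only the direction counts. The function name, the 75% threshold and the pollster labels A to D are assumptions for illustration, and the numbers are made up.

```python
def consensus_direction(changes, threshold=0.75):
    """Say which way the polls are moving, if enough of them agree.

    changes   -- {pollster: change in vote since that pollster's last poll}
    threshold -- share of pollsters that must agree (0.75 = 3 out of 4)
    Returns +1 for a shared rise, -1 for a shared fall, 0 for no consensus.
    """
    if not changes:
        return 0
    needed = threshold * len(changes)
    rises = sum(1 for c in changes.values() if c > 0)
    falls = sum(1 for c in changes.values() if c < 0)
    if rises >= needed:
        return 1
    if falls >= needed:
        return -1
    return 0


# Made-up changes in percentage points (NOT real poll results).
mostly_up = {"A": 1.0, "B": 0.5, "C": 2.0, "D": -1.0}
split = {"A": 1.0, "B": -1.0, "C": 1.0, "D": -1.0}
```

Here `consensus_direction(mostly_up)` reports a rise because three of the four moved up, while `consensus_direction(split)` reports no consensus.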

And while the poll results collectively may add uncertainty to the headline numbers for each pollster, since we have no evidence on which pollster is right or wrong, we can simply let the data speak for itself knowing that if the polling future is like polling history, come election time the different pollsters will start to converge.

So in the meantime, the least worst and most theoretically accurate way to use all of the polling information we have to try and identify any trending behaviour is to aggregate the polling data – which is where our Pollytrack and LOESS regression series come into play.

I haven’t done the same analysis for Coalition polling data yet except for this lonesome chart:

Give me a yell if you’re interested in seeing the regression lines for each pollster with the Coalition primary vote.

9 Comments

  1. 1
    Posted October 30, 2008 at 8:50 pm

    Poss – what’s the chance of actually posting the raw data? Or do we have to assemble it ourselves from primary sources (copyright?).

  2. 2
    castle
    Posted October 31, 2008 at 7:27 am

    Labor still struggling to crack 58 on Pollytrack though Poss.

    Maybe it’s time for some dramatic action, invade the Kimberleys, sink a refo boat or two or arrest Turnbull for impersonating Latham.

  3. 3
    Alan Kennedy
    Posted October 31, 2008 at 8:29 am

    Poss, when I see a chart like that only panic pops into my mind and I take your word for it.

  4. 4
    steve
    Posted October 31, 2008 at 9:35 am

    AK, funny that you should mention panic. The coalition had better hope that that is not a classic head and shoulders pattern. If it is then expect a big drift down in that blue line.

    http://www.shareselect.com.au/knowledge-bank/Technical-Analysis/Head-and-Shoulders-Pattern/

  5. 5
    caf
    Posted October 31, 2008 at 10:25 am

    The House Effects do seem to be “bias”, at least in the pure mathematical / engineering sense of the word (where it has no negative connotations).

    What’s it look like if you plot each poll’s difference from the all-polls-LOESS at the time the poll was taken, with lines of best fit for each pollster? Are the biases approximately constant for a given pollster?

  6. 6
    Ad astra
    Posted October 31, 2008 at 10:50 am

    A nice myth-busting piece Poss. I hope commentators will now avoid attributing individual poll results to bias or methodological flaws.

  7. 7
    Alan Kennedy
    Posted October 31, 2008 at 11:21 am

    Tks Steve. My panic isn’t to do with the way the lines are going although it looks like Poss is saying the Labor mob are tracking better than Turnbull’s mob.
    No, the panic is that stats are not my strong suit and Poss is one of the great stat wonks. When I read “If you were to see a chart like this, what would pop into your mind? (Now that you’ve read the title of the post, you probably don’t need to buy a vowel here)”, all that comes into my head is WTF.
    He needs a “what this all means” at the end for us mere mortals.

  8. 8
    steve
    Posted October 31, 2008 at 11:42 am

    AK, the best way to learn about this stuff is just throw yourself in at the deep end. A bit of time and experience still baffles the best of them, the US Federal Reserve is still unsure how to handle the ‘TED spread’. It does nothing but watch charts all day.

    http://www.nytimes.com/interactive/2008/10/08/business/economy/20081008-credit-chart-graphic.html

  9. 9
    Labor Outsider
    Posted November 3, 2008 at 10:36 pm

    Hi Possum

    I found this analysis quite interesting. However, I’d also like to see some supplementary analysis as well. Do you think you could put together a table with the following information for each pollster for which data is available:

    The average difference in estimated ALP primary vote between that pollster and the other pollsters between each election campaign. The average during each campaign. The difference between each pollster and the final result in their last poll before election day.

    Some other observations. It is interesting that the Essential Report is the only poll that crosses the average for all polls.

    If you think that there is no way to discriminate between the polls between election campaigns (I find your analysis reasonably convincing here), why is it that insiders pay so much more attention to Newspoll than the others?

    Third, the fact that there are such large differences between the polls between elections, and that the differences are persistent – suggests that at least one of the polls is making grave errors between elections – we just don’t know which it is.

    Fourth, trends matter but so do levels – insiders pay attention to both because it is the level and not the trend that determines who wins an election. There is a big difference between falling 5 percentage points on the 2pp from 58 to 53 and 52 to 47. Thus, when making decisions in between elections, parties have to take a view on what the best estimate of the level is. Your analysis suggests the average of all (with equal weights) is best. I know that fivethirtyeight applies different weights to different polls – is their methodology relevant for Australia?

    Finally, while I appreciate seeing you pit your psephological wits against the MSM, once the US election is over, is there enough Australian polling data to justify devoting an entire blog to it? I’d like to see you use your analytical skills to examine some policy questions from a non-ideological perspective as well. Arguably it is more important than psephology and in my view is the biggest gap in the blogosphere.

    Cheers
