I usually try to stay away from the Senate, primarily because it’s a major cause of psephological brainhurt, but also because there is so much uncertainty about which party gets the final 1 or 2 spots in each state. That uncertainty is often tied up with micro-party preference deals, the resources deployed at polling booths for distributing How To Vote cards, and split voting behaviour, so trying to estimate the exact future make-up of the Senate borders on an exercise in futility, especially since the final couple of spots can be decided by literally handfuls of votes.
Rather than going down that particular prediction route, it’s worth taking a look at the relationship between the primary vote that the Coalition, the ALP and the Greens receive in the lower house and the vote they receive in the Senate.
At its most basic level, the relationship between the primary and Senate votes is extraordinarily linear and very, very strong. If we run a chart where we plot the primary vote that each of these three parties received at the 2007 election in every one of the 150 electorates on one axis, and plot the vote they received in the Senate in those electorates on the other axis, we come up with a pretty strong correlation – so we’ll run a regression line through it to point out the obvious.
The three big outliers you see in the top left of the chart are the Independent-won seats of Kennedy and New England, plus the ex-independent held seat of Calare (which is interesting enough in itself).
Nationally, every 1 point increase in the primary vote that a party receives in the House of Representatives produces, on average, an increase of 0.87 points in that party’s Senate vote. Hypothetically, if a party scored an additional 10 points on the primary vote, they would expect to receive an additional 8.7 points in the Senate.
97.1% of the variation in the vote of these three parties in the Senate can be explained by the variation in their primary vote in the House of Reps.
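To make those two numbers concrete, here is a minimal sketch of how a slope and an R-square fall out of a simple least-squares fit. The electorate figures below are made up purely for illustration (they are not the 2007 results), and the function name is just a label chosen for this example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = constant + slope * x.

    Returns (slope, constant, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    constant = mean_y - slope * mean_x
    # With a single explanatory variable, R-square is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, constant, r_squared


# Hypothetical (Reps primary %, Senate %) pairs, NOT real election data.
reps = [8.0, 12.5, 35.0, 41.0, 47.5, 52.0]
senate = [9.5, 14.0, 33.0, 38.5, 43.0, 48.0]

slope, constant, r_squared = fit_line(reps, senate)
print(f"slope={slope:.2f} constant={constant:.2f} R-square={r_squared:.3f}")
```

The slope plays the role of the 0.87 figure and the R-square the role of the 97.1% figure; the real values come from all 150 electorates and three parties, not from a toy list like this one.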
However, there are slight differences if we further break down these correlations by State and by Party.
If we run that same chart above, but this time expand the results by party, we get:
What turns up here is that firstly, the linear relationship holds strongly for all three parties. Secondly, the Greens get more bang for buck in the Senate by expanding their primary vote in the lower House than do the major parties – which can be seen by the steeper regression line for the Greens. It’s also worth noting that at the national level, the relationship between the Reps vote and the Senate vote for the major parties is almost identical.
If we look at the actual regression results themselves:
The way to read that table above (and the tables a little further down) is that the “Reps-Coeff” is the regression coefficient on the House of Reps variable. For every one point increase in the primary vote in the lower House, the increase in the Senate vote is, on average, this value. So with, for example, the Greens – a 1 point increase in their national primary vote would lead, on average, to a 1.25 point increase in their Senate vote. Similarly, if the ALP increased their primary vote by a point, they’d expect to get a 0.81 point increase in their Senate vote, while the Coalition would expect a 0.79 point increase for every 1 point gain in their primary.
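As a quick worked example of reading the “Reps-Coeff”, the sketch below multiplies the three national coefficients quoted above by a hypothetical change in the Reps primary vote. The function name and the 2 point change are assumptions for illustration only, and the result is an average association across electorates rather than a guaranteed outcome.

```python
# National Reps-Coeff values as quoted in the text for the 2007 election.
REPS_COEFF = {"Greens": 1.25, "ALP": 0.81, "Coalition": 0.79}


def expected_senate_change(party, reps_change_points):
    """Average Senate change (in points) implied by a Reps primary change."""
    return REPS_COEFF[party] * reps_change_points


# Hypothetical scenario: each party lifts its Reps primary by 2 points.
for party in REPS_COEFF:
    change = expected_senate_change(party, 2.0)
    print(f"{party}: +2.0 in the Reps -> {change:+.2f} in the Senate")
```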
The R-Square is a measure of the explanatory power of the relationship. This value tells us how much of the variation in the Senate vote, by electorate, can be explained by the variation in the House of Reps vote – again by electorate. For example, the variation in the size of the primary vote that the Coalition received in each of the 150 electorates at the 2007 election explains 90% of the variation in the Senate vote they received.
The “Constant” isn’t really important here – I’ve just added it so that the nerdy stats types won’t have to ask for it. You know what they’re like.
Of note is the overunity Senate effect that the Greens received from their primary vote in the 2007 election, where every additional point of primary vote delivered them more than a point of additional Senate vote.
If we now break down the very first chart by state rather than by party, we get something that is a little messy, but demonstrates a point.
While there is slight variation between the states in terms of the relationship between the Senate and Reps vote for parties by electorate, the direction and strength of the relationship is very similar. The odd State out here is South Australia – although the size of the relationship in SA is the same as other states (the slope of the trend line), it sits significantly below the other states. That’s the effect of Xenophon, where people voted for a party in the lower house but voted for Xenophon in the Senate, dragging the Senate/Reps vote relationship down for the three biggest parties in SA.
If we go to the regression results of the Senate/Reps relationship, which should be read the same way as the table above, we get:
I’ve ordered the States from the strongest to weakest Senate/Reps relationship. In WA, parties, on average, get the highest increase in their Senate vote for every marginal increase in their primary vote, while Tasmania gets the smallest increase.
Yet the important thing here is how the explanatory power (the R-square) remains consistently very high, while the size of the Reps to Senate effect is pretty close as well.
Finally, we can break down these results by both State and Party. Rather than use charts (which will look pretty similar to the ones above), we’ll just go straight to the regression results as they tell the actual story.
WA and SA are marked, as the results from those states, by party, have the most uncertainty as a consequence of their relatively small numbers of seats. While we can pay attention to what they’re saying, we can’t be as certain that it is true as we can of the results in NSW, Vic and Qld where there are larger numbers of seats (and larger numbers of observations). It’s also why Tasmania has been removed from the analysis – with only 5 seats in Tasmania, 5 observations does not a regression equation make.
What stands out here, assuming the same broad relationship that we saw in 2007 will hold at the 2010 election (and there’s no real reason why it shouldn’t, but more on that next week), is that the Greens’ most politically fruitful state for lifting their generic profile is Qld. With every additional 1 point increase in their primary vote, they should expect to see their Senate vote increase by 1.5 points. For the Greens, Qld has the biggest Senate bang for the buck in terms of deploying resources to obtain a Senate spot.
The other peculiar thing about Qld is how, for the ALP and the Coalition, the increase in their respective Senate vote for every marginal increase in their primary vote is relatively small compared to other states. For every 10 point increase in the primary vote of the two major parties, the ALP would expect to only get a 6.4 point increase in their Senate vote compared to the Coalition’s 6.5.
The size of a party’s primary vote is the best predictor of their Senate vote in all States that we have. This should give us a bit of an idea about implications for the Senate when we get to polls a little closer to the election. There’s a lot of info here, with quite a lot of implications, so I’ll do my best to answer any questions you might have.


32 Comments
You really are bored this morning, aren’t you? First it’s “Nielsen is apparently no longer called “AC Nielsen” – they’ve dropped the AC.”
and now this
It’s one of those days!
heh
Possum, this is one of those models where you need to think about what your constant term means. Your slope term should be around the same value as the ratio of Senate to House vote for each party in each state, unless there is something causing deviation that is related to the size of the party vote by seat.
The ratios of 0.64 and 0.65 you get for the major parties in Queensland indicate more desertion from the party ticket than is true, and this comes about because of the large constant term. Your relationship is saying that even when the Coalition and Labor vote is zero in a Queensland House seat, the Senate vote for each party is 11.5%. That makes no sense. There is something in your error term that relates to the relationship between House and senate vote that is causing your slope line to be flat in QLD, and that cause is in the error term.
A couple of suggestions. Exclude Kennedy and New England. Include a term for the lower house donkey vote for each party. Include a term for seats with a high House of Reps “Other” vote. In Queensland, add Pauline Hanson’s vote in each Senate contest and you’ll lose half of your constant term.
The relationship is affected by the range of choices in both houses. There is always a drift away from the major parties in the Senate simply because there are more candidate choices, but the more House choices there are, the less the drift.
Oh, and if the r-squared wasn’t above 90%, I’d be worried.
Another way is to do the party regressions nationally, but put a dummy variable in for each of the states except NSW. That will give you a national slope figure, hopefully with a small constant term, plus a constant term for each dummy variable which tells you some exogenous difference between NSW and that state.
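The dummy-variable setup described in that comment can be sketched as follows. This is only an illustration under stated assumptions: the rows are synthetic and generated from known coefficients so the fit can be checked, the state shifts are invented, and the helper names are hypothetical. The model has one shared slope on the Reps vote plus a constant shift for each state other than the NSW baseline.

```python
def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - known) / m[r][r]
    return coef


def fit_with_state_dummies(rows, baseline="NSW"):
    """rows: (state, reps_vote, senate_vote) -> {term: coefficient}."""
    states = sorted({state for state, _, _ in rows if state != baseline})
    names = ["constant", "reps"] + states
    # One column of ones, the Reps vote, then a 0/1 column per non-baseline state.
    design = [[1.0, reps] + [1.0 if state == st else 0.0 for st in states]
              for state, reps, _ in rows]
    y = [senate for _, _, senate in rows]
    k = len(names)
    # Normal equations: (X'X) coef = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return dict(zip(names, solve(xtx, xty)))


# Synthetic, noise-free data: Senate = 2 + 0.9 * Reps + an invented state shift.
shift = {"NSW": 0.0, "Qld": 3.0, "SA": -8.0}
rows = [(state, reps, 2.0 + 0.9 * reps + shift[state])
        for state in shift for reps in (25.0, 35.0, 45.0)]

coefs = fit_with_state_dummies(rows)
for name, value in coefs.items():
    print(f"{name}: {value:.2f}")
```

Each dummy coefficient reads as the difference between that state and NSW at the same Reps vote. Note that this form holds the slope equal across states, so a state-specific slope would need an extra interaction term.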
Antony, is your capitalization of your surname a subliminal message here?
If it’s subliminal I’ve missed it.
I’m thinking party preference …
It’s my name. I can’t explain why it’s in capital letters. Maybe I typed it that way. As I said, if it’s a subliminal message, I missed it.
Three ways to explain the low pass through of reps -> senate votes in QLD for the majors.
1) Personal popularity of members in the safe electorates. In Victoria this is the Petro Georgiou effect where he got a reasonable number of votes due to his personal stance on issues from people who may well vote green/democrat/something in the senate. I don’t know if this is real or not. I’m trying to explain an overly high reps vote in safe seats (name recognition?).
2) Low rep vote in seats taken by popular independents, but normal senate vote (I think Antony already suggested removing the Rep independents seats).
3) People vote more for the greens as a party in the senate (because the vote matters) than in the reps (because the local candidate is rubbish). Maybe due to small party size and the only real chance of winning being senate seats the Greens talent is stacked in the number one senate spot in each state not the individual electorates. Polling data on the democrats (circa 80s and 90s) should show this same effect.
Antony,
The constant here can’t really have any meaningful interpretation. With the ALP and the Coalition, they’ll never have a zero vote, so effectively all the constant does in their case is level the linear trend to match the actual spread of values that we witnessed at the 2007 election. With the Greens it’s pretty much the same, although the interpretive relevance of it is even further removed by having an unavoidable negative value.
In Qld, like in other states, we need to ignore the constant (or any meaning given to it in any case) since we’re looking at the marginal change in the Senate vote that could be expected to occur given a marginal change in the Reps vote – where we’re using the distribution of the actual, witnessed, electorate level behaviour to provide us with a linear spread of “what if” scenarios.
Remember, the Constant doesn’t have to be meaningful, especially if the set of values we’re looking at or ever expect to look at, never encompasses (nor in this case, ever approaches) zero. Here the constant is effectively just a hypothetical artifact of the regression. A little bit of fallout at hypothetical levels that we never see, that allows us to produce a more accurate trend at the values we actually do see.
If we exclude Kennedy and New England, it doesn’t actually make any significant difference (they’re but 2 seats among a much larger many). We could do all those things that you mention (donkey, others etc) and if we were trying to model the 3 party vote at the fringe rather than model a broader expected relationship at expected values, it would be the best thing to do. Yet, that still wouldn’t substantially change the Senate/Reps trend at expected values – it would just accommodate a few spikes from the individual seats as an exercise to make the constant “neat”, which isn’t actually needed *for what I’m trying to do*.
So saying, next week (the plan I was referring to in the post) I’ll be incorporating more data from previous elections to do a few other things – what you’re saying here will actually come into play then. So stop spoiling!
If we just did it nationally and used dummies for the States, we end up with exactly the same information and the same results – just produced in a different, more complicated way that makes it more difficult to explain to those folks without a stats background (who are probably already throwing stuff at me behind the quiet confines of their public service desks!)
Actually Possum I think trying to translate an increase in reps votes to an increase in senate votes on a geographic basis isn’t the right way to do it.
It would be more informative to do the time series change for each electorate (but more time consuming), e.g. the change in Reps vote from 2004-2007 and the consequent change in Senate vote. To get a really useful set of numbers you’d need to include a series of elections plus exclude any one-offs (Hanson in QLD, etc.).
EP, a panel data approach would do that, which is actually what I’m doing this very moment. I haven’t done the complete spread yet, but the early results aren’t as good as I was hoping they’d be… but the analysis is but young, so we’ll see how it ends up when it’s all done.
Possum, as you know, the constant incorporates all the bits of the regression that are independent of the House vote, and if you get a large positive value for it, then the interpretation of the slope you described in the second last paragraph of your post should only be applied within your range of observations.
The lowest Labor vote in QLD was 28.09% in Kennedy, where Labor’s Senate vote was 35.77%. Your formula basically says Labor’s House and Senate vote are in 1:1 ratio at around 30%, but Labor retains only 64% of each 10% increase in House vote above 30%. If you are only interpolating data in this way, the description you gave is entirely correct, but I think the way you wrote the second last par of the post makes it sound like Queensland sees a more dramatic fall off in major party support.
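The “1:1 at around 30%” point can be checked with a little arithmetic. The sketch below takes the two approximate figures quoted in this thread for Labor in Queensland (a constant of about 11.5 and a slope of about 0.64) and solves for the House vote at which the fitted line predicts an equal Senate vote. The function names are assumptions made for this example.

```python
def crossover_point(constant, slope):
    """House vote at which the fitted line predicts Senate vote == House vote."""
    # Solve x = constant + slope * x for x (only meaningful when slope != 1).
    return constant / (1.0 - slope)


def predicted_senate(house_vote, constant, slope):
    """Senate vote predicted by the fitted line at a given House vote."""
    return constant + slope * house_vote


# Approximate figures quoted in the thread for Labor in Queensland.
constant, slope = 11.5, 0.64

print(round(crossover_point(constant, slope), 1))
print(round(predicted_senate(40.0, constant, slope), 1))
```

With these inputs the crossover sits at roughly 32, and a House vote of 40 maps to a predicted Senate vote of about 37. Above the crossover the line predicts a Senate vote below the House vote, and below it the reverse, which is why the slope on its own can overstate the fall-off in major party support.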
Just looking at the House Labor votes, the lowest value in NSW was 9.83%, VIC 21.85%, WA 20.43%, QLD 28.09% and SA 30.07%. The slope coefficient is starting from a higher value in Queensland.
Energy pedant, the boundaries change at every election in Queensland, so trying to do a comparison based on swing is very very time consuming for the House, and next to impossible for the Senate. I think possum is doing the right analysis, as it is interesting to see how that slope changes from election to election.
I did some work on this House/Senate comparison for my honours thesis based on 1980s elections. The nature of contests always complicated analysis. Phillip and Wentworth were neighbouring seats that used to attract similar Senate votes for the Democrats. However, the gap in House and Senate voting was always much larger in Phillip, a hotly contested marginal seat. The Democrat vote seemed to be squeezed in the House but blossom in the Senate.
Possum, I thought the most interesting aspect of your analysis is the Green vote graph. Because the Green vote goes down under 5% in some seats, I think you get a slope which is much easier to interpret in comparing House and Senate vote. Analysing the major party vote always values under 20%, so your interpretation of the slope gets affected by other matters.
With minor parties, the slope coefficient tells you a lot about how the party is inducing voters of the major parties to split their ticket between the Houses. If you did the same analysis on Democrat vote from the 80s and 90s, you would find a much steeper slope, as the Democrats would always do much better in the Senate compared to the House. The Greens are always much nearer the 45 degree line because they tend to draw on a more committed vote more likely to maintain the ticket in both houses.
Fair enough Antony – I probably could have written that better. I’m really just after -at this stage – the most likely spread of most likely values. I found the linearity of it all quite compelling for the bulk of where the values actually sit. Especially since the demographic and political spread of the electorates that form the basis of the various trends appear to be a pretty good substitute for the likely scenario of possibilities that the States as a whole could provide.
The difficulty at the fringes really only comes with the major parties – the Greens however, just have this big, powerful linearity happening everywhere. What makes it interesting here (for me at least, anyways) is Qld. On the one hand, the political science approach suggests that Qld is probably the state which would have the lowest probability of producing a Greens Senator – yet with these numbers (and pretty much the numbers of the 2004 election), it suggests that while it might be the most difficult place for them to actually get a Senator, it’s also the place that should provide the largest returns in the Senate vote as a consequence of deploying resources to lift their generic support level.
That’s pretty funny – I wrote the last post before I’d seen your latest Antony.
That reference to major party analysis should have said it lacks values under 20%.
I’m old fashioned with linear regression and election statistics possum. Always look at the outliers and see if there is some dummy variable you can include that sensibly stabilises the analysis.
In the 1980s, you got quite a few contests with only 3 candidates, Labor, Liberal and Democrat. In these seats, you would get the anomaly of the Democrat vote falling in the Senate. Once you included a dummy variable for this situation, you tilted the regression line and ended up with a coefficient from the equation for the House Democrat vote that made more sense.
It’s interesting running a more sophisticated locally weighted polynomial regression through the data that accommodates outliers:
http://blogs.crikey.com.au/pollytics/files/2009/09/loessenreps.PNG
That’s the national data and it’s nearly linear, but not quite. Unfortunately, you need fairly large numbers of observations to do this – which means it’s really limited to the national results, and those of NSW, Vic and barely Qld. But it’s a good alternative way to define a trend through a seat distribution without having to worry too much about the underlying “whys”
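For readers who haven’t met this kind of smoother, here is a bare-bones sketch of a locally weighted linear fit with tricube weights. It is an assumption-laden illustration rather than the method used for the chart: it fits a local straight line (degree 1), skips the robustness iterations that a full loess uses to down-weight outliers, and runs on made-up numbers. The function name and the `frac` parameter are my own labels.

```python
def loess_point(x0, xs, ys, frac=0.5):
    """Locally weighted linear fit evaluated at x0, using tricube weights."""
    n = len(xs)
    # Use the nearest `frac` share of the points (at least 3, at most all).
    k = max(3, min(n, int(round(frac * n))))
    dists = sorted(abs(x - x0) for x in xs)
    bandwidth = dists[k - 1] or 1e-12  # distance to the k-th nearest point
    w = [max(0.0, 1.0 - (abs(x - x0) / bandwidth) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    # Fall back to the local weighted mean if the neighbours have no spread.
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


# Made-up, slightly curved relationship (NOT real election data).
xs = [float(v) for v in range(2, 22, 2)]
ys = [1.2 * x - 0.01 * x * x for x in xs]

for x0 in (5.0, 10.0, 15.0):
    print(x0, round(loess_point(x0, xs, ys), 2))
```

Because each fitted value only draws on nearby points, the curve can bend where the data bend, which is also why it needs a fair number of observations before it says anything useful.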
You’re trying to provoke me with the last sentence!
Possum, Additive Models (using basis splines) are probably more appropriate than Loess for fitting a slightly more flexible regression.
Still, it’s a very interesting analysis. I’m not sure how valid it is, though, to use the Greens results (which are quite low) to analyse the same overall trend as for the major parties. I understand you’ve got the breakdown by party but I’m not sure it’s valid to assume that the slope will continue to be greater than 1 once the lower house vote rises beyond 10%.
Thanks Poss. You’re a gem.
I’ve sometimes wondered what the relationship was, and now I know.
I am especially interested in the way that the Greens graph works out. Does that mean that Greens should campaign long and hard (and expensively perhaps) to get a reps candidate to, say, double digit % figures but without any hope of winning the seat, in order to maximise the senate vote, where it will translate into seats?
Sam,
It’s an interesting quandary on the transition mechanism for the Greens between the quite rigid and linear patterns we see now and some (one would think) inevitable evolution into receiving smaller gains in the Senate for any increase in generic support in the Reps should their vote continue to grow.
So far, the linearity still holds out to around the 18-20% vote mark in the Reps. Certainly the behaviour in seats with a Reps vote less than 10% produces the same average ratio for the Senate that those seats with a Reps vote between 10 and 20% do.
It’s an interesting question though – at what level of generic support do the Greens cease to get a premium Senate vote?
Don, I think for the Greens it’s not so much campaigning in a particular Reps seat to boost the Senate vote, but campaigning across a whole State’s worth of Reps seats to boost the Senate vote.
Some good candidate in a typical inner city seat where they might be in with an outside chance to win, could certainly play a big role in boosting generic Green media exposure during a campaign – but it really needs to be a larger State wide campaign to lift generic support (since the Senate vote is itself state wide).
“For the Greens, Qld has the biggest Senate bang for the buck in terms of deploying resources to obtain a Senate spot.”
I’m not sure how you’ve determined which is the horse and which is the cart in this relationship. Would more campaigning keep the ratio near 1:1.5 and result in an increase in the Senate vote? Or would it tend to increase the Reps vote instead, and bring the ratio down nearer to 1:1?
Also, if you want a lot more observations so that you can include Tasmania and have more confidence in the WA and SA results, you could sample at the booth level instead of the electorate level.
Cool analysis. Interpreting it back into campaigning strategy is not all that straightforward though because of causation. One would expect that similar factors drive both the Reps and the Senate vote and so there’ll be endogeneity in the regression, so it can’t really be interpreted as “an increase in the reps vote will lead to an X% increase in the Senate vote”. If the data existed it’d be great to add in some kind of measure of campaign strength for each party in each house and senate vote. That might allow more comment to be made on causation.
On using polling booth results – the way the data is published at the AEC makes it extremely tedious to do that. The data is effectively raw, meaning that the ‘above the line’ and ‘below the line’ results would need to be aggregated manually for each booth.
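The aggregation step itself is simple once the rows are in hand; a rough sketch is below. The record layout here (the `booth`, `party`, `kind` and `votes` fields) is entirely hypothetical and is not the actual AEC file format, and the numbers are invented.

```python
from collections import defaultdict


def party_totals_by_booth(records):
    """Sum above-the-line and below-the-line votes per (booth, party)."""
    totals = defaultdict(int)
    for rec in records:
        totals[(rec["booth"], rec["party"])] += rec["votes"]
    return dict(totals)


# Invented rows: one per ticket (above the line) or candidate (below the line).
records = [
    {"booth": "Booth A", "party": "GRN", "kind": "ticket", "votes": 120},
    {"booth": "Booth A", "party": "GRN", "kind": "candidate", "votes": 9},
    {"booth": "Booth A", "party": "GRN", "kind": "candidate", "votes": 6},
    {"booth": "Booth A", "party": "ALP", "kind": "ticket", "votes": 410},
    {"booth": "Booth A", "party": "ALP", "kind": "candidate", "votes": 15},
    {"booth": "Booth B", "party": "GRN", "kind": "ticket", "votes": 80},
]

totals = party_totals_by_booth(records)
for (booth, party), votes in sorted(totals.items()):
    print(booth, party, votes)
```

The tedious part is getting the published files into a shape like this in the first place, not the summing.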
ick….
On which is the horse and which is the cart – it’s probably neither, but just the way generic support gets distributed between the Reps and the Senate.
It’s a marvelous thing: Watching two professional Pros. None of this ad hominem stuffing around, cool, cool logic and expert analysis. Great stuff, Poss.
I just can’t see a Green ever making it into the Senate in QLD. They’re not called the Deep North for nothing. If you should prove me wrong I’ll stand you a bottle of French Champagne, or what about the new boutique beer which is hideously expensive. I can arrange to pay for it via Paypal.
I really enjoyed this analysis and the discussion below has also been interesting.
Clearly, there needs to be some talk and addition here about what this means. It would be opinion and conjecture but I’d love to have a post on what you would do with this information if you were to structure the Greens, Labor, Liberal and National campaign strategy.
I think, against what Venise says above, the Greens with Larissa Waters were actually closer than ever to getting a Senate seat in QLD, though it’s difficult for me to say how close… the below seems to say about another 2.5% is needed:
http://www.abc.net.au/elections/federal/2007/results/senate/qld.htm
However, the problem there is that basically, the short term goal (if not the longer one involving world domination) for the Greens should be to hold the balance of power in the Senate. If they win this seat in QLD then they’d nab it off Labor. So who cares! Really they want to nab a seat off an independent or the Coalition… or more likely hope that no other independents get elected.
It seems like if you are the Greens you basically just cross your fingers and hope!
What a stupid situation when 6 greens senators have the same importance as Mr Fiskal, and the X man!
You might be interested in how I’ve played with the Green figures and compared them to support for the Australian Democrats in 1996.
http://blogs.abc.net.au/antonygreen/2009/10/comparing-the-greens-and-australian-democrats.html