Continuing on from Part 1 (which you’ll really need to read before chewing on this), today we’ll look at an interesting demographic play with young families and the ALP vote at the last election.
The standard narrative has it that Labor did well with young families – those Howard battlers out in the urban boonies with their rumpus room full of rugrats, flocked to that nice man Kevin Rudd etc etc – we’ve all heard the spiel.
But the evidence supporting this isn’t as compelling as the standard story would have us believe.
If we look at our new data and use the 0-4 year old age cohort over the period of 2006-2007 as our substitute for young families and analyse the way it plays out by electorate, we get a rather strange picture.
To start with, if we sort our 150 electorates on the basis of the proportion of 0-4 yr olds in 2007 from highest proportion to lowest proportion – in the 20 seats with the highest proportion of this cohort (from Lingiari at 8.7% through to Flynn with 7.4%), before the election Labor held 11 of these seats, after the election they held 16.
Tick one to the narrative. 15.3% of all seats changed hands to Labor at the last election, but 25% of these 20 seats did (Solomon, Leichhardt, Lindsay, Longman and Flynn) – Labor appeared to do well with young families.
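As a quick check of that arithmetic, here is a minimal Python sketch. The figures are the ones quoted above; the variable names are purely illustrative.

```python
# Figures quoted in the post; variable names are illustrative only.
top20_seats = 20
held_before = 11
top20_gains = ["Solomon", "Leichhardt", "Lindsay", "Longman", "Flynn"]
national_gain_rate = 0.153  # share of all 150 seats that changed hands to Labor

held_after = held_before + len(top20_gains)
top20_gain_rate = len(top20_gains) / top20_seats
ratio = top20_gain_rate / national_gain_rate

print(f"Labor seats in the top 20 after the election: {held_after}")
print(f"Top-20 gain rate: {top20_gain_rate:.0%}")
print(f"Relative to the national rate: {ratio:.2f}x")
```

On these numbers, the 20 seats with the most 0-4 year olds changed hands to Labor at a bit over one and a half times the national rate.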
Further, if we compare a stock with a stock – the proportion of 0-4yr olds by electorate against the ALP two party preferred vote and run a linear regression line through it (because it is barely different from our preferred LOESS here and is easier to get a handle on) we get:
As it was with the Coalition TPP relationship with the proportion of the 65+ cohort per electorate, so it is with the ALP TPP and the proportion of the population aged 0-4 yrs. The higher the proportion of rugrats in an electorate, on average, the higher the ALP TPP vote is at a statistically significant level – give another tick to the narrative.
But if we move on to a flow vs flow comparison, where we compare the change in the 0-4 year old population by electorate against the ALP two party preferred swing – we get something quite different (and again, we’ll run a linear regression line through this for the same reason as before):
This tells us that the higher the growth in the 0-4yr population over the 12 months preceding the election, the smaller the ALP TPP swing was on average.
This too is statistically significant.
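For anyone wanting to try this sort of flow vs flow fit themselves, here is a rough sketch of an ordinary least squares line in plain Python. The numbers in it are made-up placeholders, not the electorate data used for the chart, and the function name `fit_line` is just for illustration. It fits the line only and doesn't test significance.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up placeholder values, NOT the electorate data used in this post.
growth_0_4 = [-1.0, 0.0, 1.0, 2.0, 3.0]  # % change in 0-4 population, 2006-07
alp_swing = [6.5, 6.0, 5.5, 5.0, 4.5]    # ALP TPP swing, percentage points

slope, intercept = fit_line(growth_0_4, alp_swing)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A negative slope is the pattern described above: more growth in the cohort, smaller average swing.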
It also throws a huge spanner in the works regarding the accuracy of the narrative – not necessarily because young families didn’t vote for Labor (there is probably some ecological fallacy at play here), but because the impact of this demographic on the spatial distribution of the ALP swing – the only real thing that matters when it comes to targeting demographics to win seats – poses a problem for the orthodox story.
If we break this down even further and measure this relationship using metro and non-metro seats, something even freakier comes out:
In the non-metro regions, the relationship was completely and utterly non-existent, yet in the metropolitan seats it was powerful and tight and highly statistically significant.
In the cities, the higher the level of rugrat growth, the lower on average was the ALP swing, the complete opposite to the ALP relationship with the 65+ cohort.
We can bring these two demographics together – the growth in the 65+ age cohort and the growth in the 0-4 yr age cohort – via a regression or two to see how they play out in terms of explaining the variation in the ALP swing. But to do that we need to be sure these two variables don’t correlate (or it might stuff our regression results up) – and thankfully they don’t. We can see that by a simple scatter plot and linear regression line where I’ve given the correlation of the two variables.
There’s no relationship between the two (which is an interesting little aside in itself) so our regressions are safe from the evils of multicollinearity.
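The check itself is just a Pearson correlation between the two growth series. Here is a minimal sketch, assuming the two series are held as plain Python lists; the values shown are made-up placeholders chosen to be uncorrelated, not the real electorate figures.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up placeholder values, NOT the electorate data used in this post.
growth_65_plus = [1.0, 2.0, 3.0, 4.0]
growth_0_4 = [2.0, 1.0, 1.0, 2.0]

r = pearson_r(growth_65_plus, growth_0_4)
print(f"correlation between the two regressors: {r:.2f}")
```

A value near zero means the two variables can sit in the same regression without muddying each other's coefficients.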
First up, we’ll run a regression measuring the way that changes in the 65+ and 0-4 yr age cohorts over the 2006-7 period explain the variation in the ALP swing by electorate.

This suggests that for every 1% increase in the size of the 65+ age cohort over 2006/07 in an electorate, the average swing the ALP received in that electorate increased by about 0.5%.
It also suggests that for every 1% increase in the size of the 0-4yr age cohort during the same period in an electorate, the average swing the ALP received in that electorate declined by 0.37%. These two relationships are highly statistically significant and together explain about 13% of the variation in the ALP swing by electorate.
“13%,” I hear you say, “that ain’t very big!” Yet, for this sort of cross-sectional data analysis it actually is pretty big.
But if you’re one of those R-squared fetishists (shame on you! – go sit in the corner!) we actually can go deeper. Remembering that these relationships individually were only significant in the metropolitan seats, if we now run the same regression but this time using just the metro seats, we get:

Isolating the analysis to just the metro seats not only boosts the explanatory power of these two variables to nearly 25% (which is pretty extraordinary), but also suggests that for every 1% increase in the 0-4yr age cohort in a metro electorate, the average swing to the ALP decreased by over one half of a percent (0.57). It also slightly reduces the size of the impact that the growth in the 65+ age cohort had on the ALP swing, from a 0.48% increase in the ALP swing for every 1% increase in 65+ age cohort growth across all seats down to a 0.47% increase in metro seats.
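To get a feel for what those coefficients imply, here is a small sketch that applies them to a hypothetical electorate. The coefficients are the ones quoted in the text; the regression constants aren't quoted, so this only gives the shift relative to the baseline swing, and the example electorate (2% growth in 65+, 3% growth in 0-4) is invented for illustration.

```python
# Coefficients as quoted in the post: percentage points of ALP swing per
# 1% growth in each cohort. The regression constants are not quoted in the
# text, so this returns only the shift relative to the baseline swing.
ALL_SEATS = {"growth_65_plus": 0.48, "growth_0_4": -0.37}
METRO_SEATS = {"growth_65_plus": 0.47, "growth_0_4": -0.57}

def swing_shift(coefs, growth_65_plus, growth_0_4):
    """Shift in the expected ALP swing implied by the two growth rates."""
    return (coefs["growth_65_plus"] * growth_65_plus
            + coefs["growth_0_4"] * growth_0_4)

# Hypothetical electorate: 2% growth in the 65+ cohort, 3% growth in 0-4.
all_seats_shift = swing_shift(ALL_SEATS, 2.0, 3.0)
metro_shift = swing_shift(METRO_SEATS, 2.0, 3.0)

print(f"all seats model: {all_seats_shift:+.2f} points")
print(f"metro model:     {metro_shift:+.2f} points")
```

The same electorate takes a noticeably bigger hit under the metro model, because the 0-4 coefficient is so much larger there.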
What does it all mean?
Well, for starters we might have to start rethinking our political demographics.
The vote premium the Coalition receives from the 65+ age cohort is in decline and that decline will probably accelerate as baby boomers move into that age group replacing the pre-war generation that has always voted solidly conservative. We already knew that from polling results, with the Coalition demographic train wreck being an old favourite around these parts. Because we have independent data (the polling) confirming this data here – we can be pretty confident that ecological fallacy isn’t largely responsible for the 65+ age cohort relationship with the ALP swing at the last election when aggregated by electorate – it’s merely confirming what we already knew to some extent.
The 0-4yr data here though is a little different, not only because 0-4 yr olds can’t vote (although anyone that has spent time at a polling booth would surely wonder if that’s actually true, considering the behaviour of some people), not only because this age cohort is acting as a substitute in our analysis for the behaviour of young families – but because the broader demographic context that places young families in certain electorates could itself be responsible for what we can observe here with the data.
Even though the relationship between this age cohort growth and the ALP swing is stronger than with the 65+ age cohort, there could be any number of reasons for it.
There are a few worth thinking about that could explain a lot, starting with the power of the baby bonus. Did Howard perform better than expected with young families because of the cheques that rolled in once a new 0-4yr old was added to the population? Were large numbers of these families virgin mortgage holders who were swayed by the Coalition advertising on economic responsibility and influenced by the power of incumbency?
Or was something else, or some group of other things that correlated strongly with 0-4yr population growth responsible – if so, does anyone have suggestions on what they could be?
What could be problematic for the Coalition at the next election though is if the Rudd government uses the power of incumbency to reverse what happened in seats with large rugrat growth. If the Rudd focus on young families gains traction in the electorate, then the pattern of the 2007 election with this demographic could be reversed where the higher the rugrat growth, the higher the ALP swing – which would not only partially ameliorate the consequences of Rudd losing a chunk of some other demographics (although so far, we haven’t seen any evidence of that at all), but would be likely to deliver half a dozen seats in and of itself.
The Labor tracking polls would be telling them in reasonably accurate detail how they’re travelling with various cohorts – with the Rudd focus on Childcare and early education, and now with the first home owners/builders grant boost and increased family welfare outlays, as long as unemployment doesn’t go through the roof, Rudd is in a good position to nail this demographic and by the looks of the Labor policy angle, they know it and are setting out to do it. Keep your eyes peeled.
The Libs on the other hand face a bit of a problem – especially without the power of incumbency to help. In Part 1 where we were looking at the 65+ cohort, Scorpio in comments nailed it perfectly:
“I would think that the Libs are well aware of what you have identified here possum and this is why they have been behind the demonstrations of allied senior groups trying to regain that demographic.
Notice that these demo’s have taken place in metropolitan areas and have specifically been structured to gain maximum media coverage, especially TV, and to give the Opposition plentiful material to work into specifically directed questions in QT.
Also designed to get maximum coverage in evening news bulletins and current affairs type coverage. They are well aware of the potential long term loss to their voting base. I don’t know what they can do about the youth vote though. My lad, 17 & politically aware & active, tells me that amongst his fellow students that even the strong Liberal supporters can’t cop Turnbull.”

24 Comments
The parents of 0-4 yo were most likely first time voters, or at best, second timers. Therefore the scare tactics re unions and interest rates under Labor (hmmm – don’t hear those being thrown by the Libs anymore…do we?) may have had some traction, which, allied with the procreation bounties etc, may have led such tyros to vote for Rattus and Co.
Part of the late swing back was apparently due in no small part to that Coalition advertising, especially the advertising that started on the final full weekend of the campaign (as the polling started moving back to the Libs on Tuesday and Wednesday).
Maybe this was part of the demographic that shifted the polls from 56% down to 52 and 53% – it would certainly make intuitive sense.
Three things come to mind here as to why Howard had such a strong support from this demographic.
1. Baby bonus
2. First home owners grant (both for existing & new)
3. Welfare payments slanted towards middle income families.
New parents are much more likely to have more contact with their parents.
I doubt there is publicly available data to investigate a correlation though.
Two thoughts here Possum.
1. Don’t you need to take into account both stocks and flows here in looking at these effects in a regression model – i.e. both where the electorate started in terms of population shares, and where they ended up? I’d imagine that larger increases in the 0-4 (or any) population group are a higher possibility in an electorate which had a smaller share of said population group, e.g. Lingiari has 8.7% 0-4 year olds, it’s not going to grow much more (and in fact the absolute number declined by 1.4% between ‘06 and ‘07).
What I’m thinking here is that higher _growth_ in 0-4yo may be coming in electorates which were previously low in _proportions_ of 0-4yo, electorates which were the “stronghold” of the Coalition. If that’s the case, has the narrative changed?
2. Would it be possible to compare population change since 2004 (i.e. the year of the previous election) rather than 2006 – which is a much shorter timeframe for change? Or isn’t the data available?
Steve, I was thinking the exact same thing on your first point, but when I compared population vs population growth in the 0-4 cohort I got this:
http://blogs.crikey.com.au/pollytics/files/2008/12/scatter04.png
Which surprised me a little – I expected to see a slight inverse relationship, but there was no relationship at all.
On comparing against 2004, I had a look but couldn’t find the data. What made the 2006 data interesting is that when the 2006 data was taken, we still had the polls running around 50/50, so the changes in the populations we’re looking at were over the period when the polls undertook a large shift to Labor.
Also worth mentioning is that a lot of the places with low to negative 0-4 pop growth that had a relatively high Coalition vote share occurred in rural areas with relatively low overall 0-4 yr populations. That’s why I thought it would be worth breaking the lot down into metro and non-metro seats.
But that turned out to show a much stronger relationship in the metro seats overall for the ALP swing with both cohorts, which made me think that there was something more to this than merely random coincidence.
I think you need to normalise the TPP swing – the swing will be naturally higher in seats starting from a low base.
As an example, if a seat starts out on a TPP of 90/10 to the ALP, and gets a 1% swing to the ALP, that indicates 1 in 10 Coalition voters changed their preference – which is equivalent to a swing of 5% to the ALP in a seat that starts out at 50/50. A normalised figure can be calculated as the proportion of Coalition voters who swung to the ALP.
If you don’t do this, you might just be seeing that seats with a higher growth in the 0-4 cohort are those with a higher ALP vote to start with (which naturally translates into a lower absolute swing to the ALP, all else being equal).
(I’ve just read your other post and noticed that there indeed appears to be a positive correlation between ALP TPP and 0-4 cohort growth).
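The normalisation described in this comment can be sketched in a few lines of Python. The function name `normalised_swing` is made up for illustration; the two cases are the ones from the example above.

```python
def normalised_swing(alp_tpp_before, swing_to_alp):
    """Share of prior Coalition TPP voters who switched to the ALP.

    Both arguments are in percentage points of the two party preferred vote.
    """
    coalition_tpp_before = 100.0 - alp_tpp_before
    return swing_to_alp / coalition_tpp_before

# A 90/10 ALP seat with a 1% swing vs a 50/50 seat with a 5% swing.
safe_alp_seat = normalised_swing(90.0, 1.0)
marginal_seat = normalised_swing(50.0, 5.0)

print(f"safe ALP seat: {safe_alp_seat:.0%} of Coalition voters switched")
print(f"marginal seat: {marginal_seat:.0%} of Coalition voters switched")
```

Both work out to the same proportion, which is the point: the raw swings differ by a factor of five while the underlying shift among Coalition voters is identical.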
Caf – I had a look at that proportional application of swing issue when I started, but this election it was a smaller phenomenon than previous elections where for every 1% increase in the Coalition TPP at the 2004 election, there was an expected 0.04% (that’s one 25th of 1%) increase in the swing to the ALP this election, and only barely statistically significant at the 10% level.
Ahh – that’s actually interesting in and of itself, though the number isn’t really as small as it sounds. Nationally the swing was 5.44% to ALP off a Coalition base of 52.74%, so if the swing was uniform across Coalition voters we would expect 0.10% of swing for each 1% of Coalition TPP.
The actual figure you’ve calculated of 0.04% indicates that the kind of Coalition voters who live in electorates with lots of other Coalition voters swung proportionally less than the kind of Coalition voters who live among ALP voters.
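The benchmark in this comment is simple division, sketched here in Python with the figures quoted above (variable names are illustrative).

```python
# Figures quoted in the comments above; variable names are illustrative.
national_swing_to_alp = 5.44  # percentage points
coalition_tpp_base = 52.74    # Coalition TPP before the swing, in percent

# Swing expected per 1% of Coalition TPP if every Coalition voter was
# equally likely to switch.
uniform_rate = national_swing_to_alp / coalition_tpp_base
estimated_rate = 0.04         # the regression estimate quoted above

print(f"uniform benchmark: {uniform_rate:.3f} per 1% of Coalition TPP")
print(f"estimated:         {estimated_rate:.3f} per 1% of Coalition TPP")
```

The estimate sits at well under half the uniform benchmark, which is what drives the conclusion about Coalition voters in heavily Coalition electorates.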
This is extremely interesting stuff but I’d get a little less excited about what it means until undertaking some deeper analysis.
First, as you can see from the scatterplot, a very wide range of growth in the 0-4 cohort was consistent with the same ALP 2pp swing. It is still the case that in your regression the constant is more important than the 0-4 growth rate and 65+ growth rate combined…..
Second, it appears that a small number of electorates with very high growth rates of that cohort may be strongly influencing your estimates. What happens if you drop the four electorates with growth in excess of 6 per cent?
Third, the omitted variable bias here is going to be enormous. There are all sorts of variables that might influence both voting behaviour and the growth rate of the young cohort. Disposable income growth, wealth growth, and other economic variables come to mind.
You also have a state specific factor at play. Of the 21 divisions with the largest change in the growth of 0-4 year olds, 7(!) were in Western Australia. That is 33%! Given that the swing to the ALP was smaller in WA than in other states, I suspect that the effect you have observed may be being influenced by WA specific factors. It makes sense of course that such high growth rates will have been observed in some WA electorates – families with young children were moving there in large numbers because of the economic boom.
On the 65+ growth rate. That also warrants more analysis. Growth rates of more than 4% in a single year are rather large.
All this is important because correlation does not imply causation. Once you observe a pattern you have to know why the pattern arises. Throwing a couple of variables into a regression doesn’t tell you that. You can’t get your policy settings right unless you understand the why as well as the what.
Anyway, none of what I have said takes away from your broader message – I agree that the power of incumbency will be important for Labor. And I don’t think there is any doubt that the Liberals are a mess right now and need to give serious consideration to what they stand for as a party, particularly in light of the demographic changes that are working against them.
I was very surprised to read this, because when I did a colour-coded map of swings in Melbourne I got this. This led me to presume that Labor got its biggest swings in those areas on the fringes which young families were changing from semi-rural to outer suburban, consistent with “the narrative”. However, the demographics of family growth are evidently more complicated than that, because the two Melbourne electorates that went backwards on the 0-4 age cohort were Aston and Casey, right where I might have expected the opposite to happen.
Anyway, I plugged the Melbourne data alone into the trusty XLStats application and both the correlation and R-squared amounted to two-fifths of sweet Jack diddly shit. Then I put the Sydney data in and got a scatterplot you could ski down – correlation -0.66, R-squared 0.45. What do we make of that, I wonder? Sydney also didn’t have the patterns of 0-4 growth I would have expected: Lindsay, Werriwa, Mitchell and Berowra were in the negative.
Also, it seems to me that the most significant variable in explaining an electorate’s swing is usually the state it’s in, so I’ve looked at the overall 0-4 growth rate for each state. Head and shoulders above the rest on 3.6% is WA (not on my account, I hasten to add), and we all know what happened there in terms of relative swings. There probably aren’t enough seats to skew your results too much, but I still thought it might be of interest. Elsewhere: NSW 0.8%, Vic 2.0%, Qld 1.9%, SA 1.5%.
Hmmm interesting… I didn’t know that 0-4 kids had been given the vote.
More seriously tho’ David Richards you are a bit out of touch here. People are spawning much later than ever before especially in those aspirational outer metro seats like Lindsay. For example my bro in law has by my reckoning voted in 7 federal general elections and has just become a daddy for the first time.
Perhaps I was, wandering seabird. It depends on the demographics of the parents of 0-4, on which I have no data. It was just a possible explanation.
I’ll throw in a bit more on parental demographics: urban parents tend to be older than rural ones. I wonder if first-time parents made the difference in your figures? They’re about 30 and didn’t have an adult life before Howard, don’t remember the last recession, and are still pretty sure that balancing work and family life is going to be easy. It would be interesting to see which way it went for the much more tired and time-poor parents of older children (and children of now-retired people), who either didn’t receive a bonus at all or got a smaller one, and who could visualise the effect of WorkChoices on their family lives as their caring responsibilities grow heavier.
LO, with this type of electorate level data, the constant is always the big thing – we’re really trying to look at a couple of percentage points of variation around the mean swing with these things. If we knock out seats with growth rates in the 0-4 cohort over 6% nothing changes much at all. The coefficient on growth reduces by 0.04 and everything stays pretty much where it was. It makes virtually no difference at all in the metro only seats.
OVB is the ultimately unavoidable problem with any regression work using electorate level data describing vote share or a change in vote share. By aggregating human behaviour like this, we can never explain everything (and would never really bother trying to!). Most things (apart from State as a variable) only explain small amounts of variation in the swing.
I probably need to write a FAQ about the things I do here, including all this. The purpose here wasn’t to identify a causative phenomenon (whenever it is, I spell that out clearly, like the informal vote models we did a while back) but to highlight how some unknown thing or group of unknown things that correlate with that age growth pattern had a collective impact on the ALP swing by electorate at the last election.
Billbowe,
A fair whack of demographic assumptions that seem to be made about a number of seats aren’t really holding up well anymore. With just the 0-4 yr age growth, we might expect places like Lindsay to keep on being veritable baby making factories – but the young families that moved to Lindsay and produced armies of sproglets have since become a bit older. So while they still have a reasonably high proportion of 0-4yrs, the replacement rate of that proportion is tanking as these families slow down on the breeding but aren’t replaced by families that are at the beginning of their reproduction curve.
A lot of the seats we used to think of as being filled with “young families” don’t have families in them that are so young anymore. Same with the older Australians – there’s a few interesting patterns there as well that might make us need to take a second look at some assumptions about demographics at the next election. That relationship between 65+ growth and the ALP swing in metro seats is an interesting one because of a large number of 55-64 year olds in the country and how some of them are quite clustered in a number of seats.
If you haven’t played around with the relatively new Census CDATA online, it’s worth a few hours to get used to it. It’s a handy little tool and has convenient categories like state and commonwealth electoral divisions.
http://www.abs.gov.au/CDataOnline
Really? So including a WA dummy doesn’t do much to reduce the coefficient, or reduce the statistical significance of the coefficient?
On OVB – I wasn’t saying that everything can be controlled for – but you have census data available that allows quite a lot of things to be controlled for in the regression besides the two variables you have chosen. I’d like to see how the results stand up when you include additional educational and demographic variables…..
The relationship between 0-4yr growth and the ALP swing only holds in metro electorates, so if we stick a dummy on WA seats for the metro pool, we get this:
http://blogs.crikey.com.au/pollytics/files/2008/12/commentswa1.png
Controlling for WA has no bearing on the effect in the seats where the effect is actually at play (the metro seats).
I might have a squiz over the next few weeks at more census data in relation to this age cohort growth and the ALP swing – but with the census data by electorate, collinearity quickly becomes a bit of a pain in the arse most times. Have to be a bit careful with what you throw together in the same regression.
I certainly recognise that collinearity is a problem….but at the very least you should be able to use the census data that makes use of questions relating to income, education, debt, etc. Generally, when estimating these sorts of equations you start with the set of variables whose inclusion can be justified theoretically, and then examine whether the regression is contaminated by collinearity. If the collinearity is bad, you can always test the joint significance of the variables.
Another thing, in this sort of regression, I don’t think there is any problem with including stocks and flows in the same regression. For example, the stock of people with a university education (just using this as an example) may affect the swing if such people altered their view on the relative merits of the two parties. The flow could also matter because the change in the proportion with a university degree might reflect changes in economic conditions between the two elections.
Debt, mortgage stress, interest payments to disposable income – the whole spectrum of leverage metrics don’t tend to work very well at the electorate level because of the way these issues tend to be clustered within most electorates. Income measures behave similarly although work a little better, but you might as well use geographical distance from the capital city of each state.
I was just highlighting the need to be careful with stocks and flow comparisons. If we’re measuring the impact of an election, we need to use the swing rather than the stock, and then need to be careful that what we’re measuring as a variable is useful.
In an ideal world we could use panel data – but with more elections than censuses, and with the boundaries changing significantly in some places across the intervening years, it’s generally impossible without spending vast quantities of time reorganising old census data.