In my last post I displayed a plot showing a striking correlation between single-motherhood rates and out-of-school suspension rates across racial/ethnic groups, using national averages.

[Figure: national out-of-school suspension rates by single-motherhood rate, by racial/ethnic group]

I am well aware that aggregating linearly correlated variables will tend to produce (much) stronger correlations than you’d see with more granular data (e.g., state, county, family, individual, etc.). On the other hand, I am familiar enough with these statistics to know that you will see substantially weaker correlations here with other common predictors. Hispanics/latinos, for instance, tend to be worse off than blacks by many economic measures, and are rarely appreciably better off, and yet their discipline problems are much less severe (even, interestingly, less than whites in California controlling for median family income). Likewise, the distance between asians and non-hispanic whites tends to be modest on economic dimensions, but asian suspension rates are roughly half the non-hispanic white average.
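To make the aggregation point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (five hypothetical groups, the slope, the noise level); none of it is the suspension or ACS data. It only shows how a modest individual-level correlation can turn into a near-perfect one once you average within groups.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_groups=5, n_per_group=2000, slope=0.5, noise_sd=3.0, seed=0):
    """Return (individual-level r, group-mean-level r) for synthetic data."""
    rng = random.Random(seed)
    xs, ys, group_x, group_y = [], [], [], []
    for g in range(n_groups):
        # Hypothetical groups whose average x differs by one unit each.
        x_g = [rng.gauss(float(g), 1.0) for _ in range(n_per_group)]
        # Same linear relationship for everyone, plus a lot of individual noise.
        y_g = [slope * x + rng.gauss(0.0, noise_sd) for x in x_g]
        xs.extend(x_g)
        ys.extend(y_g)
        group_x.append(mean(x_g))
        group_y.append(mean(y_g))
    return pearson(xs, ys), pearson(group_x, group_y)


individual_r, group_r = simulate()
print(f"individual-level r: {individual_r:.2f}")
print(f"group-mean r:       {group_r:.2f}")
```

Averaging washes out the individual noise but leaves the between-group signal, which is why the group-level correlation comes out so much higher. The same mechanism would inflate the other predictors too, so the comparison between predictors at the same level of aggregation is still informative.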

For the benefit of others, I decided to generate some plots of predictors aggregated at a national level for comparison’s sake (note: I reversed the x-axis to keep the graphic relationship the same where necessary).


Economic measures

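[Figure: national suspension rates by per capita income]

[Figure: national suspension rates by median family income]

[Figure: national suspension rates by child poverty rate]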

Parent education levels

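[Figure: national suspension rates by share of parents with less than a high school education]

[Figure: national suspension rates by share of parents with a bachelor's degree or more]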

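NAEP test scores of (recent) children

[Figure: national suspension rates by NAEP math scores, grade 8]

[Figure: national suspension rates by NAEP reading scores, grade 8]

[Figure: NAEP math scores, grade 12]

[Figure: NAEP reading scores, grade 12]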


That the single-motherhood rate would be an appreciably stronger predictor of ethnic differences when aggregated nationally should not be surprising given the stronger correlations I found at the district level in my prior post, but it’s helpful to get a “10,000 foot view” of the national data to understand the relative magnitude of these differences and how much they are likely to explain.

There is measurement and estimation error in the local ACS estimates and some amount of noise and systematic variation in district-level school suspensions. There is also something to be said for spillover effects and central tendencies of groups. Our social norms are informed by our neighbors, our peer groups, our preferred popular culture, and the like, and much of this occurs within major ethnic groups for a variety of reasons. Thus when we do multivariate regressions on individual or more granular local data, these collective influences are likely to go unobserved to some significant degree. The residuals associated with race after controls may be wrongly ascribed to racism or even, to the extent these are truly cultural/social capital differences, to heritable differences themselves.
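One way to read the measurement-error point is as ordinary attenuation: noise in either variable pulls the observed correlation toward zero. The sketch below is a toy illustration under assumed parameters (unit-variance signal, unit-variance noise added to both variables); it does not model ACS margins of error or actual suspension counts.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1)
n = 5000

# "True" values: y depends linearly on x plus genuine unexplained variation.
true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]
true_y = [x + rng.gauss(0.0, 1.0) for x in true_x]

# What we observe: both variables measured with additional noise (assumed sd = 1).
noisy_x = [x + rng.gauss(0.0, 1.0) for x in true_x]
noisy_y = [y + rng.gauss(0.0, 1.0) for y in true_y]

clean_r = pearson(true_x, true_y)
noisy_r = pearson(noisy_x, noisy_y)
print(f"correlation of true values:     {clean_r:.2f}")
print(f"correlation of noisy estimates: {noisy_r:.2f}")
```

Small districts and small subgroups have noisier estimates, so granular correlations will understate the underlying relationship by more than state or national figures do.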

To this end, it’s worth pointing out that I also find fairly strong correlations between these data at the state level, both within and between major racial groups, for single-parent family rates (note: this state data is for single parents instead of single mothers, but the two are pretty well correlated).

That it predicts large differences in non-hispanic white suspension rates between states (in addition to between districts, as I demonstrated earlier) suggests that this probably isn’t well explained by racism or even, for that matter, classism. Also, as I pointed out in my last post, if you extend the white linear regression line between states, you end up with a prediction for blacks that is pretty similar to what we actually observe in their suspension rates.

[Figure: white state-level regression line extended to black single-parent rates, amongst states with large black populations]
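For readers who want to try the "extend the regression line" exercise themselves, the mechanics are just a least-squares fit on one group's state-level points followed by a prediction at the other group's value of the predictor. A minimal sketch follows; the helper names (`fit_line`, `extrapolate`) are mine, and the numbers are arbitrary placeholders picked so the arithmetic is easy to check, not real rates.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def extrapolate(intercept, slope, x_new):
    """Predict y at a value of x that may lie outside the fitted range."""
    return intercept + slope * x_new


# Placeholder points for the group the line is fitted on (NOT real data).
group_a_x = [1.0, 2.0, 3.0]
group_a_y = [3.0, 5.0, 7.0]
intercept, slope = fit_line(group_a_x, group_a_y)

# Placeholder predictor value and observed outcome for the other group.
predicted = extrapolate(intercept, slope, 10.0)
observed = 20.0
gap = observed - predicted
print(f"predicted {predicted:.1f}, observed {observed:.1f}, gap {gap:.1f}")
```

The usual caveat applies: this is extrapolation well beyond the range the line was fitted on, so a small gap is suggestive rather than conclusive.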

Now, to be clear, it’s certainly possible that these correlations are driven in large part by unobserved endogenous factors. I will not pretend that this question can be settled here. My point is mainly that this is an excellent predictor at multiple levels of analysis and that it’s not easily explained away as simply being a function of the economic success of the parents, cognitive ability, or the like.


A brief discussion of heritability and related questions

For what it’s worth, it’s clear that family structure itself is not entirely heritable and certainly not immutable given relatively rapid changes in family structure over the past few decades.

That said, there is considerable evidence from twin studies that heritability plays a significant role too (within cohorts, within the US).

It is likely that the heritability of family structure has increased in recent years due to increased variance in family structure, reduced moral suasion, increased legal freedoms, increased earnings potential of women, etc. (individuals are freer to do what they want, for good or ill, and those tendencies are significantly genetically influenced today). Nevertheless, if family structure itself plays some causal role here, the fact that most of the current variance might be explained by genetic factors doesn’t necessarily mean that change in family structure isn’t playing some significant role in society, especially with spillover effects at a neighborhood level (i.e., living in a community where less than 5% of children live in unstable households may be much different than living in one where more than 70% of them are).
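The distinction between variance within a cohort and change between cohorts is easy to lose, so here is a toy sketch of it. All parameters are assumptions for illustration: a trait is modeled as genetic component + individual environment + a cohort-wide shift, and "heritability" is simply the genetic share of within-cohort variance. This is not an estimate of anything about family structure.

```python
import random
from statistics import mean, pvariance


def cohort(shift, n=20000, g_sd=2.0, e_sd=1.0, seed=0):
    """Return (within-cohort heritability, cohort mean) for a synthetic cohort."""
    rng = random.Random(seed)
    genetic = [rng.gauss(0.0, g_sd) for _ in range(n)]
    # The cohort-wide shift moves everyone equally, so it adds no within-cohort variance.
    trait = [g + rng.gauss(0.0, e_sd) + shift for g in genetic]
    return pvariance(genetic) / pvariance(trait), mean(trait)


h2_early, mean_early = cohort(shift=0.0, seed=1)
h2_late, mean_late = cohort(shift=3.0, seed=2)
print(f"early cohort: h2 = {h2_early:.2f}, mean = {mean_early:.2f}")
print(f"late cohort:  h2 = {h2_late:.2f}, mean = {mean_late:.2f}")
```

Both synthetic cohorts show the same high within-cohort heritability even though the average moved a great deal between them, which is the sense in which a twin-study heritability estimate and a rapid society-wide change can coexist.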

There is some evidence suggesting that family structure plays a causal role in related child outcomes and that marriage patterns themselves are susceptible to the legal regime (probably unsurprisingly). For instance, Jonathan Gruber investigated the impact of the introduction of unilateral divorce laws at a state level.

There have also been other studies that suggest that marriage patterns are quite susceptible to policy changes.