Friday, July 17, 2015

Political Science: The Best & Brightest

Look at who is the latest political science major:

Yes people, that's Anders Breivik, newly matriculated at the University of Oslo to major in......

.....Political Science!!!

The future of the discipline is in good hands.

*****Bonus Snark********

Do you know what Norway gave him for killing 77 people?

21 years.

That's a little over 3 months per victim.

Wednesday, July 15, 2015

Who cares about getting an asymptotically correct standard error for a possibly massively inefficient coefficient estimate?

Well sure, Angrist and Pischke and their legion of devotees, but that barn has already been burned by my main man, Edward Leamer (see the section entitled "White-washing").

But now Cameron & Miller take up the cudgel to argue for OLS plus "cluster-robust" standard errors.

Let's review:

1. Under heteroskedasticity, OLS is unbiased but inefficient (it has a variance larger than that of the minimum variance unbiased estimator) and the conventional OLS standard errors are biased.

2. The robust standard errors crowd ignores the first issue to focus on the second, but does not produce such a great answer even to the problem of the OLS standard errors, because robust standard errors have only an asymptotic justification (i.e. the only property that can be shown for them is consistency).

3. Leamer has argued that the first problem is the more important one. Since some forms of heteroskedasticity make OLS extremely inefficient, we need to find a better estimator. These are often called FGLS (feasible generalized least squares) estimators, where the researcher estimates a model of the conditional variance along with the model for the conditional mean.

4. Here is where the robust standard errors folks raise the bogeyman of bias in their argument against looking for a better estimator.
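For anyone who wants points 1 through 3 in symbols, here is a sketch using standard textbook notation. The symbols are my assumptions, not anything from the papers discussed: the model is y = Xβ + ε with E[ε | X] = 0 and Var(ε | X) = Ω, and e_i is the OLS residual for observation i.

```latex
% OLS is unbiased, but its true variance is the "sandwich", not \sigma^2 (X'X)^{-1}:
\operatorname{Var}(\hat\beta_{OLS} \mid X) = (X'X)^{-1} X' \Omega X \, (X'X)^{-1}

% GLS uses \Omega directly and is the efficient linear unbiased estimator:
\operatorname{Var}(\hat\beta_{GLS} \mid X) = (X' \Omega^{-1} X)^{-1}
\;\preceq\; \operatorname{Var}(\hat\beta_{OLS} \mid X)

% The heteroskedasticity-robust (White, HC0) estimator plugs squared residuals
% into the middle of the OLS sandwich; it repairs the standard error only:
\widehat{\operatorname{Var}}(\hat\beta_{OLS})
  = (X'X)^{-1} \Big( \sum_{i=1}^{n} e_i^2 \, x_i x_i' \Big) (X'X)^{-1}
```

The third line leaves the point estimate untouched, which is the complaint in point 2: the inefficiency in the first line is still there.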

Here's the money quote from Cameron & Miller:

"One way to control for clustered errors in a linear regression model is to additionally specify a model for the within-cluster error correlation, consistently estimate the parameters of this error correlation model, and then estimate the original model by feasible generalized least squares (FGLS) rather than ordinary least squares (OLS). ... If all goes well this provides valid statistical inference, as well as estimates of the parameters of the original regression model that are more efficient than OLS. However, these desirable properties hold only under the very strong assumption that the model for within-cluster error correlation is correctly specified."

Look people, we have two enemies when we try to get a point estimate of an unknown parameter: variance and bias.

Suppose you don't have the exactly correct functional form of the conditional variance but you do FGLS anyway. Say you create 10% bias by doing so. It is still the case that the reduction in the variance may well be sufficient to accept that increased bias and use the mis-specified FGLS estimator instead of the least squares estimator.
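The trade-off can be written down with the usual mean squared error decomposition. This is a sketch that assumes "10% bias" means a bias equal to 10% of the true coefficient β, which is my reading rather than something stated precisely above.

```latex
% Mean squared error splits into variance plus squared bias:
\operatorname{MSE}(\hat\beta)
  = \mathbb{E}\big[(\hat\beta - \beta)^2\big]
  = \operatorname{Var}(\hat\beta) + \operatorname{Bias}(\hat\beta)^2

% OLS is unbiased, so its MSE is all variance. A mis-specified FGLS estimator
% with bias 0.10\,\beta still wins on MSE whenever
\operatorname{Var}(\hat\beta_{FGLS}) + (0.10\,\beta)^2
  < \operatorname{Var}(\hat\beta_{OLS})
```

So the bias only has to be paid for by a variance reduction of at least 0.01 β².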

In my own research work on GARCH models with Rodolfo Cermeno, we show that even incorrectly specifying the conditional variance often produces coefficient estimates superior on mean squared error grounds to OLS.

This happens because of the extreme inefficiency of OLS in the face of some forms of heteroskedasticity and because small mis-specifications of the conditional variance model do not seem to lead to large biases in the estimates of the conditional mean parameters.
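A small Monte Carlo makes the point concrete. This is a toy sketch under assumptions of my own choosing, not the GARCH setup from the paper with Cermeno: one regressor, a true error standard deviation of x squared, and an FGLS step that deliberately fits the wrong variance function (log variance linear in x, when the truth is linear in log x). The names `wls` and `simulate` are hypothetical helpers, and only the Python standard library is used.

```python
import math
import random


def wls(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return ybar - b * xbar, b


def simulate(reps=500, n=100, a=1.0, b=2.0, seed=0):
    """Return (MSE of OLS slope, MSE of mis-specified FGLS slope)."""
    rng = random.Random(seed)
    ones = [1.0] * n
    sq_err_ols = 0.0
    sq_err_fgls = 0.0
    for _ in range(reps):
        x = [rng.uniform(1.0, 10.0) for _ in range(n)]
        # Assumed truth: error standard deviation is x**2 (variance x**4).
        y = [a + b * xi + rng.gauss(0.0, xi ** 2) for xi in x]

        # Step 1: OLS is WLS with equal weights.
        a_ols, b_ols = wls(x, y, ones)
        resid = [yi - a_ols - b_ols * xi for xi, yi in zip(x, y)]

        # Step 2: fit a deliberately WRONG variance model,
        # log(variance) = c0 + c1*x, although the truth is 4*log(x).
        log_r2 = [math.log(max(r * r, 1e-12)) for r in resid]
        c0, c1 = wls(x, log_r2, ones)

        # Step 3: reweight by the inverse of the fitted variance.
        w = [math.exp(-(c0 + c1 * xi)) for xi in x]
        _, b_fgls = wls(x, y, w)

        sq_err_ols += (b_ols - b) ** 2
        sq_err_fgls += (b_fgls - b) ** 2
    return sq_err_ols / reps, sq_err_fgls / reps


if __name__ == "__main__":
    mse_ols, mse_fgls = simulate()
    print(f"slope MSE, OLS:  {mse_ols:.3f}")
    print(f"slope MSE, FGLS: {mse_fgls:.3f}")
```

In this design the high-x observations are extremely noisy, so even the wrong weights discount them heavily and the FGLS slope should come out with a much smaller mean squared error than OLS. Other designs may be less lopsided.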

Furthermore, you actually can perform some statistical tests to see how well your chosen model of the conditional variance is working. Simply take the standardized FGLS residuals, test them for general forms of heteroskedasticity, and see what happens. If we fail to reject the null of no heteroskedasticity at, say, the 0.25 level, we can be pretty confident in our functional form.
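One way to run that check is sketched below. I am assuming a Koenker-style (studentized) Breusch-Pagan test with a single regressor, which is just one of many possible "general" tests: regress the squared standardized residuals on x, compute n times R-squared, and compare it with a chi-square distribution with one degree of freedom. The function name `bp_pvalue` and the simulated errors standing in for real FGLS residuals are hypothetical.

```python
import math
import random


def bp_pvalue(x, u):
    """p-value of a Koenker-style Breusch-Pagan test with one regressor.

    Regresses u**2 on x, forms LM = n * R^2, and uses the chi-square(1)
    tail probability, which equals erfc(sqrt(LM / 2)).
    """
    n = len(x)
    u2 = [ui * ui for ui in u]
    xbar = sum(x) / n
    ubar = sum(u2) / n
    sxy = sum((xi - xbar) * (vi - ubar) for xi, vi in zip(x, u2))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((vi - ubar) ** 2 for vi in u2)
    if sxx == 0.0 or syy == 0.0:
        return 1.0  # no variation to explain, so nothing to reject
    r2 = sxy * sxy / (sxx * syy)
    lm = n * r2
    return math.erfc(math.sqrt(lm / 2.0))


def demo(n=500, seed=1):
    """Return (p-value for raw errors, p-value for standardized errors)."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 10.0) for _ in range(n)]
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    raw = [xi * zi for xi, zi in zip(x, z)]      # error sd proportional to x
    standardized = [ri / xi for ri, xi in zip(raw, x)]  # divide by the sd
    return bp_pvalue(x, raw), bp_pvalue(x, standardized)


if __name__ == "__main__":
    p_raw, p_std = demo()
    print(f"p-value, raw errors:          {p_raw:.4g}")
    print(f"p-value, standardized errors: {p_std:.4g}")
    print("variance model looks adequate at 0.25:", p_std > 0.25)
```

The raw errors should be rejected decisively. For the standardized errors the p-value is just a draw from the null distribution, so a single run can land on either side of 0.25 even when the variance model is exactly right.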

Finally, since all we know about robust standard errors is that they are consistent, in a finite sample the OLS + robust errors approach can give us a very inefficient parameter estimate and a biased standard error for that parameter estimate.

This problem is even greater in the case of clustered residuals because the requirement becomes that the number of clusters goes to infinity, not just the number of observations!

Double finally, let me note (and Cameron & Miller do a good job of explaining this) that FGLS and clustered standard errors are not mutually exclusive. You can do both. My recent paper with Dan Hicks and Weici Yuan is an example.

Mrs. Angus has just informed me that this piece should be titled, "The Nerdiest Rant Ever".

But people, variance is just as big a problem as bias, and consistency alone is a weak reed on which to base your estimation strategy.

Mrs. Angus has just informed me that I've just proven her point with the previous sentence.

Monday, July 13, 2015

Fair is Fair, But It Depends on Where

Fair Is Not Fair Everywhere 
Marie Schäfer, Daniel Haun & Michael Tomasello
Psychological Science, forthcoming

Abstract: Distributing the spoils of a joint enterprise on the basis of work contribution or relative productivity seems natural to the modern Western mind. But such notions of merit-based distributive justice may be culturally constructed norms that vary with the social and economic structure of a group. In the present research, we showed that children from three different cultures have very different ideas about distributive justice. Whereas children from a modern Western society distributed the spoils of a joint enterprise precisely in proportion to productivity, children from a gerontocratic pastoralist society in Africa did not take merit into account at all. Children from a partially hunter-gatherer, egalitarian African culture distributed the spoils more equally than did the other two cultures, with merit playing only a limited role. This pattern of results suggests that some basic notions of distributive justice are not universal intuitions of the human species but rather culturally constructed behavioral norms.