In this demonstration, we examine the consequences of heteroskedasticity, look at ways to detect it, and see how to correct for it using regression with robust standard errors and weighted least squares regression.
As mentioned previously, heteroskedasticity occurs when the variance is not the same for all observations in a data set. Conversely, when the variance is equal across all observations, we call that homoskedasticity. Why should we care about heteroskedasticity? In the presence of heteroskedasticity, there are two main consequences for the least squares estimators: (1) ordinary least squares no longer produces the best estimators, and (2) the standard errors computed using least squares can be incorrect and misleading.
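In symbols, if $e_i$ denotes the regression error for observation $i$, homoskedasticity means $\mathrm{var}(e_i) = \sigma^2$ for every $i$, while heteroskedasticity means $\mathrm{var}(e_i) = \sigma_i^2$, a variance that can differ from observation to observation.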
Most real-world data will probably be heteroskedastic. However, one can still use ordinary least squares without correcting for heteroskedasticity: if the sample size is large enough, the variance of the least squares estimator may still be small enough to obtain precise estimates.
If there is an evident pattern in the residual plot, then heteroskedasticity is present; here, there seems to be no evident pattern. A more formal, mathematical way of detecting heteroskedasticity is what is known as the Breusch-Pagan test. A more general form of the variance function is $\mathrm{var}(e_i) = \sigma_i^2 = h(\alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS})$, where $z_{i2}, \ldots, z_{iS}$ are variables suspected of influencing the error variance. Recall that we are testing the following null and alternative hypotheses: $H_0\colon \alpha_2 = \alpha_3 = \cdots = \alpha_S = 0$ (homoskedasticity) against $H_1\colon$ at least one of $\alpha_2, \ldots, \alpha_S$ is nonzero (heteroskedasticity).
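As an illustration of the graphical check, one can plot the residuals from a fitted model against its fitted values; the data set and model below are placeholders, not the demonstration's own data.

```r
# Illustrative only: the built-in 'cars' data set and this model are stand-ins,
# not the data used in the demonstration
fit <- lm(dist ~ speed, data = cars)

# Plot the residuals against the fitted values; a fan or funnel shape
# (spread that grows or shrinks with the fitted values) suggests heteroskedasticity
plot(fitted(fit), resid(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs fitted values")
abline(h = 0, lty = 2)
```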
From the above equation, and because the errors have mean zero, we can write $\mathrm{var}(e_i) = E(e_i^2)$. Taking $h$ to be linear, we can then rewrite the above equation as $e_i^2 = \alpha_1 + \alpha_2 z_{i2} + \cdots + \alpha_S z_{iS} + v_i$. That is, we regress the squared least squares residuals on the $z$ variables and test whether $\alpha_2, \ldots, \alpha_S$ are jointly zero; the test statistic is $N \times R^2$ from this auxiliary regression, which has a chi-squared distribution with $S - 1$ degrees of freedom when the null hypothesis is true. There are two ways we can conduct the Breusch-Pagan test in R: the easy way and the hard way. The hard way carries out the auxiliary regression by hand; since the computed test statistic (roughly 7) exceeds the chi-squared critical value, we reject the null hypothesis of homoskedasticity. The easy way involves using the lmtest package and calling the bptest function on our fitted model. This is how we do it (shout out to Montell Jordan):
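A minimal sketch of both approaches, reusing the placeholder model from the residual-plot example above (the variable names and degrees of freedom reflect that stand-in, not the demonstration's actual data):

```r
# The hard way: regress the squared residuals on the variable(s) thought to drive
# the variance, then compare N * R^2 to a chi-squared distribution with S - 1 df
aux <- lm(resid(fit)^2 ~ speed, data = cars)
bp_stat <- nobs(aux) * summary(aux)$r.squared
p_value <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
c(statistic = bp_stat, p.value = p_value)

# The easy way: lmtest::bptest() runs the same auxiliary regression for us
library(lmtest)
bptest(fit)
```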
Since the p-value reported by bptest is below 0.05, we again reject the null hypothesis and conclude that heteroskedasticity is present. Recall that the two main consequences of heteroskedasticity are (1) ordinary least squares no longer produces the best estimators and (2) standard errors computed using least squares can be incorrect and misleading. We can address the second consequence by using heteroskedasticity-consistent standard errors, or simply robust standard errors. As for the Breusch-Pagan test itself, it tests whether the variance of the errors from a regression depends on the values of the independent variables; the test can be performed using the fitted values of the model, the predictors in the model, or a subset of the independent variables.
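A sketch of computing robust standard errors in R, assuming the sandwich and lmtest packages and the same placeholder model as above:

```r
library(lmtest)
library(sandwich)

# Recompute the coefficient table with a heteroskedasticity-consistent (robust)
# covariance matrix; the coefficient estimates are unchanged, only the standard
# errors, t statistics and p-values are corrected
coeftest(fit, vcov. = vcovHC(fit, type = "HC1"))
```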
It also includes options to perform multiple tests and p-value adjustments. The test can be carried out as a chi-squared test for heteroskedasticity under the assumption that the errors are independent and identically distributed (i.i.d.), or as an F test under the same i.i.d. assumption. Homoskedasticity, for its part, is needed to ensure that the estimates are accurate, that the prediction limits for the dependent variable are valid, and that confidence intervals and p-values for the parameters are valid.
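The demonstration's opening also mentions weighted least squares as a correction. A hedged sketch, assuming (purely for illustration) that the error variance grows with the square of the regressor, so each observation is weighted by the inverse of that assumed variance:

```r
# Weighted least squares: observations believed to have larger error variance get
# smaller weight. The assumed variance structure (proportional to speed^2) is a
# placeholder for illustration, not the demonstration's estimated variance function.
wls_fit <- lm(dist ~ speed, data = cars, weights = 1 / speed^2)
summary(wls_fit)
```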
Unconditional heteroskedasticity is predictable and can relate to variables that are cyclical by nature. This can include higher retail sales reported during the traditional holiday shopping period or the increase in air conditioner repair calls during warmer months.
If the shifts are not traditionally seasonal, changes in the variance can still be tied directly to the occurrence of particular events or to predictive markers.
An example is an increase in smartphone sales with the release of a new model: the activity is cyclical around the event but not necessarily determined by the season. Heteroskedasticity can also arise where the data approach a boundary, since the variance must necessarily be smaller when the boundary restricts the range of the data. Conditional heteroskedasticity, by contrast, is not predictable by nature.
There is no telltale sign that leads analysts to believe data will become more or less scattered at any point in time. Often, financial products are considered subject to conditional heteroskedasticity as not all changes can be attributed to specific events or seasonal changes. A common application of conditional heteroskedasticity is to stock markets, where the volatility today is strongly related to volatility yesterday.
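As a rough illustration of that dependence, one could simulate a return series whose variance today is driven by yesterday's shock; the recursion and parameter values below are assumptions chosen for illustration, not taken from the text.

```r
# Purely illustrative simulation: today's variance depends on yesterday's squared
# shock (an ARCH(1)-style recursion); the parameter values are arbitrary
set.seed(1)
n <- 500
returns <- numeric(n)
sigma2  <- numeric(n)
sigma2[1]  <- 0.10
returns[1] <- rnorm(1, sd = sqrt(sigma2[1]))
for (t in 2:n) {
  sigma2[t]  <- 0.05 + 0.85 * returns[t - 1]^2
  returns[t] <- rnorm(1, sd = sqrt(sigma2[t]))
}
plot(returns, type = "l",
     main = "Simulated returns: volatility today depends on volatility yesterday")
```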
Such a model explains periods of persistently high volatility and persistently low volatility. Heteroskedasticity is an important concept in regression modeling, and in the investment world, regression models are used to explain the performance of securities and investment portfolios.
The most well-known of these is the Capital Asset Pricing Model (CAPM), which explains the performance of a stock in terms of its volatility relative to the market as a whole.
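In regression form, this relationship is commonly written as $R_i - R_f = \alpha_i + \beta_i (R_m - R_f) + e_i$, where $R_i$ is the stock's return, $R_f$ the risk-free rate, $R_m$ the market return, and $\beta_i$ the stock's sensitivity (volatility) relative to the market; this is the standard textbook formulation, stated here for reference.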
Extensions of this model have added other predictor variables such as size, momentum, quality, and style (value versus growth). These predictor variables have been added because they explain or account for variance in the dependent variable, portfolio performance, that is not explained by CAPM. For example, the developers of the CAPM were aware that their model failed to explain an interesting anomaly: high-quality stocks, which were less volatile than low-quality stocks, tended to perform better than the CAPM predicted.
CAPM says that higher-risk stocks should outperform lower-risk stocks. In other words, high-volatility stocks should beat lower-volatility stocks. But high-quality stocks, which are less volatile, tended to perform better than predicted by CAPM. Later, other researchers extended the CAPM model (which had already been extended to include other predictor variables such as size, style, and momentum) to include quality as an additional predictor variable, also known as a "factor."
These models, known as multi-factor models, form the basis of factor investing and smart beta.