Specification errors of an econometric model refer to the different errors that can be made when selecting and treating a set of independent variables to explain a dependent variable.
When a model is constructed, it has to satisfy the correct specification hypothesis. This hypothesis states that the explanatory variables selected for the model are the ones able to explain the dependent variable. It is therefore assumed that no omitted independent variable (x) can explain the dependent variable (y), so that the variables chosen are the ones that allow the correct model to be approximated.
Model Specification Errors
Errors in the specification of a model can be sorted into three broad groups:
Group 1: The functional form is not specified correctly
- Omission of relevant variables: Imagine that we want to explain the return on the shares of company Y. For this we select PER (price-to-earnings ratio), stock market capitalization and book value as independent variables. If the free float, which we left out of the model, helps explain the return and is correlated with any of the included variables, the error term of our model will be correlated with those variables. This would cause the parameters estimated by the model to be biased and inconsistent. Therefore, the predictions and the tests performed on the model would not be valid.
- Variables to be transformed: The regression model assumes that the dependent variable is linearly related to the independent variables. However, in many cases the relationship between them is not linear. If the necessary transformation of the independent variable is not applied, the model will fit the data poorly. Examples of such transformations include taking logarithms, taking the square root or squaring the variable.
- Poor collection of the sample data: The data on the independent variables must be consistent over time, that is, there can be no structural changes in the independent variables. Imagine we want to explain the variation in GDP in country X using consumption and investment as independent variables. Suppose an oil field is discovered in that country on state lands and the government decides to abolish taxes. This could cause a change in the country's consumption habits that persists indefinitely from that date on. In this case we should collect two different time series and estimate two models, one before the change and one after. If we pooled the data into a single sample and estimated one model, the model would be poorly specified and the hypothesis tests and predictions would be incorrect.
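The omitted-variable case from the first item of this group can be illustrated with a small simulation. This is only a sketch on made-up data, not an estimate for any real stock: the variable names (`x1`, `x2`), the true coefficients (2.0 and 3.0) and the 0.8 link between the two regressors are all assumptions chosen for the example. It uses only the Python standard library and the textbook covariance formulas for OLS with one and two regressors.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def slope_short(x1, y):
    """OLS slope on x1 when x1 is the only regressor (plus a constant)."""
    return cov(x1, y) / cov(x1, x1)


def slope_full(x1, x2, y):
    """OLS slope on x1 when x2 is also in the model (plus a constant)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    return (s22 * s1y - s12 * s2y) / (s11 * s22 - s12 ** 2)


def omitted_variable_demo(n=20_000, seed=0):
    """Return (slope with x2 included, slope with x2 omitted)."""
    rng = random.Random(seed)
    # x1 is the included regressor; x2 is relevant and correlated with x1.
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [0.8 * a + rng.gauss(0, 1) for a in x1]
    # Assumed true model: y = 1 + 2*x1 + 3*x2 + noise.
    y = [1.0 + 2.0 * a + 3.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]
    return slope_full(x1, x2, y), slope_short(x1, y)


if __name__ == "__main__":
    full, short = omitted_variable_demo()
    print(f"slope on x1 with x2 included: {full:.3f}")
    print(f"slope on x1 with x2 omitted:  {short:.3f}")
```

Under these assumptions the slope from the full model should land near the true value of 2.0, while the slope from the short model drifts toward 2.0 + 3.0 × 0.8 = 4.4, because the omitted variable's effect is absorbed by the regressor it is correlated with. Increasing `n` does not remove the gap, which is what inconsistency means here.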
Group 2: The independent variables are correlated with the error term in time series
- Use of the lagged dependent variable as an independent variable: To use a lagged variable is to use the data of that same variable measured one period earlier. Suppose we are using the previous model, with GDP as the dependent variable. Let’s add to the model, in addition to consumption and investment, the GDP of the previous year (GDP t-1). If the errors are serially correlated, the GDP of the previous year is correlated with the error term, and the estimated coefficients would be biased and inconsistent. This would again invalidate all hypothesis tests, predictions, etc.
- Predicting the past: When we measure an independent variable, we always have to take it from a period before the one we want to estimate. Suppose that our dependent variable is the return of stock X and our independent variable is PER. Let’s also assume that we are taking the end-of-February data for both. If we use this in our model, we will conclude that the stock with the highest PER at the end of February was the one with the highest returns at the end of February. The correct specification of the model uses data from the beginning of the period to predict the subsequent data, and not the other way around as in the previous case. This error is called predicting the past.
- Measuring the independent variable with error: Suppose our dependent variable is the return of a stock and one of our independent variables is the nominal interest rate. Recall that the nominal interest rate is the real interest rate plus inflation. As the inflation component of the nominal interest rate cannot be observed in advance, we would be measuring the variable with error. To measure the interest rate correctly, we would have to use the expected interest rate, which takes into account expected inflation rather than current inflation.
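The effect of measuring an independent variable with error can be sketched with simulated data. The example assumes the simplest textbook case, where the observed regressor equals the true one plus independent noise. The names `x_true` and `x_obs`, the true slope of 2.0 and the unit noise variance are all hypothetical choices for illustration, not values taken from interest rate data.

```python
import random


def ols_slope(x, y):
    """OLS slope of y on x in a regression that includes a constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def measurement_error_demo(n=20_000, seed=1):
    """Return (slope using the true regressor, slope using the noisy one)."""
    rng = random.Random(seed)
    # x_true plays the role of the variable we would like to observe.
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    # x_obs is what we actually record: the true value plus measurement noise.
    x_obs = [x + rng.gauss(0, 1) for x in x_true]
    # Assumed true model: y = 1 + 2*x_true + noise.
    y = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_obs, y)


if __name__ == "__main__":
    clean, noisy = measurement_error_demo()
    print(f"slope with the true regressor:  {clean:.3f}")
    print(f"slope with the noisy regressor: {noisy:.3f}")
```

With equal variances for the true regressor and the noise, the slope on the noisy regressor should shrink toward roughly half of the true 2.0. The measurement error becomes part of the model's error term and is correlated with the observed regressor, which is why this case belongs in this group.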