Forecasting – Multiple Regression

File Name: Forecasting – Multiple Regression

Location: Modeling Toolkit | Forecasting | Multiple Regression

Brief Description: Illustrates how to run a multiple or multivariate regression analysis

Requirements: Modeling Toolkit, Risk Simulator

This example shows how to run a multiple regression in Risk Simulator. The raw data are in the Cross-Sectional Data worksheet, which contains cross-sectional data for all 50 U.S. states: the number of aggravated assaults per year (in thousands), the number of bachelor’s degrees awarded per year, police expenditure per capita, population size (in millions), population density (persons per square mile), and the unemployment rate. The idea is to use multiple regression analysis to test whether the number of aggravated assaults per year is related to these explanatory variables.

Multiple Regression Analysis

To run this model, simply:

In the Cross-Sectional Data worksheet, select the area C5:H55.
Select Risk Simulator | Forecasting | Multiple Regression.
Choose Aggravated Assault as the dependent variable in the regression and click OK (Figure 87.1).
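For readers who want to see what such a fit involves outside the spreadsheet, the following is a minimal sketch of ordinary least squares solved through the normal equations. It is not Risk Simulator's implementation. The function names solve and ols are hypothetical, and the data are synthetic stand-ins generated for illustration, not the values in the worksheet. Only the Python standard library is used so the sketch runs anywhere; in practice a numerical library would normally do this work.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copies a)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Return least-squares coefficients [intercept, b1, ..., bk]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


random.seed(0)
# Synthetic data for 50 observations and two regressors (an assumption for
# illustration; the worksheet has five explanatory variables).
X = [[random.uniform(0, 10), random.uniform(0, 5)] for _ in range(50)]
y = [2.0 + 3.0 * x1 - 1.5 * x2 + random.gauss(0, 1) for x1, x2 in X]

coef = ols(X, y)
print("intercept and slopes:", [round(c, 3) for c in coef])
```

The estimated coefficients should land near the values used to generate the data (2.0, 3.0, and -1.5), with the gap reflecting the added noise.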

Figure 87.1: Running a regression

Note that more advanced regressions, such as regressions with lagged regressors, stepwise regression, and nonlinear regression, can also be run in Risk Simulator. For details on running these regressions and interpreting their results, refer to Modeling Risk, Third Edition (Thomson–Shore 2015) by Dr. Johnathan Mun.

Results Summary

Refer to the Report worksheet for the regression output and for notes on interpreting specific statistical results. The report includes the multiple regression and analysis of variance output, including the coefficients of determination; hypothesis test results (the single-variable t-test and the multiple-variable F-test); the computed coefficient for each regressor; a fitted chart; and more. Before running a regression analysis, also see the Forecasting – Data Diagnostics chapter for analytical and econometric diagnostics that check for multicollinearity, autocorrelation, micronumerosity, distributive lags, and other technical regression issues.
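The statistics named in the report can be illustrated with a short sketch that computes R-squared, adjusted R-squared, the t-statistic for each coefficient, and the overall F-statistic using the standard textbook formulas. This is an assumption-laden illustration, not the code behind the Report worksheet: solve and regression_summary are hypothetical names, and the data are synthetic. Turning the t- and F-statistics into p-values requires the t and F distributions, which the sketch leaves out.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def regression_summary(X, y):
    """Fit y on X (with intercept) and return the usual summary statistics."""
    rows = [[1.0] + list(r) for r in X]
    n, p = len(rows), len(rows[0])  # p counts the intercept
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)

    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    sigma2 = sse / (n - p)                                  # residual variance

    # Diagonal of (X'X)^-1, obtained by solving against each unit vector.
    diag = [solve(xtx, [1.0 if i == j else 0.0 for i in range(p)])[j]
            for j in range(p)]
    std_err = [math.sqrt(sigma2 * d) for d in diag]
    t_stats = [b / s for b, s in zip(beta, std_err)]
    f_stat = ((sst - sse) / (p - 1)) / sigma2
    return {"beta": beta, "r2": r2, "adj_r2": adj_r2,
            "t_stats": t_stats, "f_stat": f_stat}


random.seed(1)
# Synthetic data: x1 drives y, x2 is unrelated noise (illustrative assumption).
X = [[random.uniform(0, 10), random.uniform(0, 10)] for _ in range(50)]
y = [5.0 + 2.0 * x1 + random.gauss(0, 1) for x1, _ in X]

summary = regression_summary(X, y)
for name, value in summary.items():
    print(name, value)
```

In this synthetic setup the t-statistic for the first regressor should be large in absolute value while the one for the unrelated regressor should be small, which mirrors how significant and insignificant variables show up in a regression report.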

The dataset was intentionally set up so that some of the independent variables are statistically significant while others are not (as the p-values computed for the t-tests in the report show), and so that some of the variables are nonlinearly related to the dependent variable. Therefore, try several variations of the model to find the best fit. Finally, remember to run the dataset through the data diagnostics tool in Risk Simulator; the Forecasting – Data Diagnostics chapter explains how.
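One way to try a variation of the model is to transform a regressor and compare the fit. The sketch below is a simplified one-regressor illustration, not the procedure used in Risk Simulator: it generates synthetic data in which y depends on the natural logarithm of x, then compares the R-squared from regressing y on x with the R-squared from regressing y on ln(x). The helper name r_squared and all data values are assumptions made for the example.

```python
import math
import random


def r_squared(x, y):
    """R-squared of a simple regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


random.seed(2)
# Synthetic data with a logarithmic relationship (illustrative assumption).
x = [random.uniform(1, 100) for _ in range(50)]
y = [1.0 + 4.0 * math.log(v) + random.gauss(0, 0.5) for v in x]

r2_linear = r_squared(x, y)                      # y regressed on x
r2_log = r_squared([math.log(v) for v in x], y)  # y regressed on ln(x)

print("R-squared, linear regressor:", round(r2_linear, 3))
print("R-squared, log regressor:   ", round(r2_log, 3))
```

Because the data were generated from ln(x), the transformed specification should fit better. With several regressors, adjusted R-squared is the fairer basis for comparing specifications because it penalizes added variables.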