The purpose of this assignment is to learn how to run and interpret regression analysis and to predict outcomes using regression. A second goal is to learn how to map standardized residuals and connect statistics to a spatial output. The assignment is split into two parts. The first part looks at the relationship between the percentage of kids who receive free lunches and the crime rate per 100,000 in a community. The second part looks at the responses to 911 calls in Portland, OR.
Part One:
For part one, a news station claimed that as the number of kids who receive a free lunch increases, the crime rate also goes up. The goal for part one is to determine whether the news station's claim is correct, using the data provided and SPSS. The output of the SPSS regression analysis for the crime data is below in figure 1. Using the outputs for the regression analysis, a lot of information can be derived
| figure 1. The results from the SPSS regression analysis for the crime data provided |
Part Two:
Introduction:
For the second part of this assignment, 911 calls are compared to other variables in Portland, OR, using the data provided along with SPSS and ArcMap. First, three variables are compared to 911 calls individually. The variable with the highest r^2 value then has its residuals mapped. Finally, multiple variables are compared to 911 calls at once, using multiple regression analysis and multiple regression analysis with a stepwise approach.
Methods:
The first step of the second part of this assignment is to run an individual regression analysis on each selected variable against 911 calls. In every case, the dependent variable is the number of 911 calls and the independent variable is the one selected for comparison. This can be done by opening the data in SPSS. Next, go to the Analyze tab and select Regression > Linear. Then set the dependent variable to calls and the independent variable to the variable that calls is being compared to. Once this is processed, SPSS gives an output that describes the regression between calls and the selected variable.
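To make clear what the SPSS output is reporting, here is a minimal sketch of a simple linear regression in plain Python. The tract values are made up for illustration and are not the Portland data, and the function and variable names are my own. It only computes the slope, the intercept, and r^2, which is a small part of what SPSS reports.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope is the co-variation of x and y divided by the variation of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # r^2 is the share of the variation in y that the fitted line explains.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical values for five census tracts (NOT the assignment data).
low_educ = [10, 20, 30, 40, 50]   # independent variable
calls = [3, 6, 6, 10, 11]         # dependent variable: 911 calls

slope, intercept, r2 = simple_linear_regression(low_educ, calls)
print(f"calls = {intercept:.2f} + {slope:.2f} * low_educ, r^2 = {r2:.3f}")
```

The dependent variable always goes in as `y`, which matches setting calls as the dependent variable in the SPSS dialog.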
Next, to make a choropleth map of the number of 911 calls per census tract, open ArcMap and add the Portland census tracts layer. From here, a simple symbology change produces a choropleth map of the 911 calls. To map the residuals, open the toolbox and navigate to the spatial statistics tools. Next, select Modeling Spatial Relationships and choose Ordinary Least Squares. Once this tool is open, select the census tracts as the input, set the unique ID field to UniqID, set the dependent variable to calls, and set the explanatory variable to the chosen independent variable. This results in a new layer that shows the residuals for each census tract in terms of calls and the variable chosen.
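The idea behind the mapped values can be sketched in a few lines of Python. This is an illustration under my own assumptions, using made-up tract values: each residual (observed calls minus predicted calls) is divided by the standard error of the estimate. The exact calculation ArcMap uses for its standardized residuals may differ from this simplified version.

```python
from math import sqrt
from statistics import mean

def standardized_residuals(x, y):
    """Residuals of a simple OLS fit, scaled by the standard error of the estimate."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    # Residual = observed value minus the value the regression line predicts.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Standard error of the estimate, with n - 2 degrees of freedom.
    std_error = sqrt(sum(r ** 2 for r in residuals) / (len(x) - 2))
    return [r / std_error for r in residuals]

# Hypothetical values for five census tracts (NOT the assignment data).
low_educ = [10, 20, 30, 40, 50]
calls = [3, 6, 6, 10, 11]

z = standardized_residuals(low_educ, calls)
for tract_number, value in enumerate(z, start=1):
    print(f"tract {tract_number}: standardized residual = {value:.2f}")
```

A positive value means the tract had more calls than the line predicts, and a negative value means it had fewer.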
The last steps for this assignment deal with multiple regression analysis. The process is the same as for an individual regression analysis, except that multiple independent variables are chosen instead of one. The result shows the regression for all the independent variables together. However, a stepwise approach is needed to keep only the variables that work well with the data. This can be done by selecting Stepwise under the Method drop-down in the Linear Regression window. This gives an output with only the variables that help increase the r^2.
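The stepwise idea can be sketched as a forward selection loop in Python. This is a simplification under my own assumptions and not what SPSS does internally: SPSS enters and removes variables based on significance tests, while this sketch simply adds the variable that raises r^2 the most and stops when the gain falls below a hypothetical `min_gain` threshold. The data and variable names are made up.

```python
def ols_r_squared(columns, y):
    """r^2 of an OLS fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    y_bar = sum(y) / n
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def forward_stepwise(predictors, y, min_gain=0.01):
    """Add one variable at a time, keeping it only if r^2 rises by min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {name: ols_r_squared([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        name = max(scores, key=scores.get)
        if scores[name] - best_r2 < min_gain:
            break  # no remaining variable improves the fit enough
        selected.append(name)
        remaining.remove(name)
        best_r2 = scores[name]
    return selected, best_r2

# Hypothetical values for six census tracts (NOT the assignment data).
predictors = {
    "low_educ": [1, 2, 3, 4, 5, 6],
    "renters": [2, 1, 4, 3, 6, 5],
    "random_noise": [5, 3, 6, 2, 4, 1],
}
calls = [4, 5, 10, 11, 16, 17]  # built as 2 * low_educ + renters

selected, best_r2 = forward_stepwise(predictors, calls)
print(selected, round(best_r2, 3))
```

In this made-up data the noise variable is left out because it adds almost nothing to r^2, which is the same reason a stepwise output drops variables.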
Results:
For the first step of part two, variables within census tracts were compared to the number of 911 calls in each census tract. The variables selected to try to explain the number of 911 calls per census tract are the number of people with no high school degree, the unemployment rate, and the population density. The first individual regression analysis uses the number of people with no high school degree to explain 911 calls. The results of this regression analysis can be seen below in figure 2.
| figure 2. Regression analysis output for 911 calls and low education population |
The second variable selected to try to explain the number of 911 calls is the unemployment rate. The regression analysis output can be seen below in figure 3. From this output, an equation for a best
| figure 3. Regression analysis output for 911 calls and unemployment rates. |
The third variable selected to try to explain the number of 911 calls is the population density. The regression output can be seen below in figure 4. Looking at this output an equation for a best fit
| figure 4. Regression analysis output for 911 calls and population density. |
The second step of part two maps the number of 911 calls per census tract and the residuals for the variable with the highest r^2 value, the low education variable. The first map simply shows the total number of 911 calls per census tract in Portland. This map can be seen below in figure 5. From this, it is easy to
| figure 5. Number of 911 calls per census tract in Portland OR |
| figure 6. Map of residuals from regression analysis for 911 calls and amount of uneducated people |
The third step of part two deals with multiple regression. For this part, a multiple regression analysis is performed on the data. The number of 911 calls remains the dependent variable, but multiple independent variables are used as the input. The output for the multiple regression analysis can be seen below in figure 7 and figure 8. These two figures show the multiple regression analysis for this
| figure 7. Output for multiple regression analysis part 1 |
| figure 8. Output for multiple regression analysis part 2 |
| figure 8. Output for step wise regression part 1 |
| figure 9. Output for step wise regression part 2 |
| figure 10. output for step wise regression part 3 |
| figure 11. Output for step wise regression part 4 |
equation the most. The three variables that help drive the equation the most were Renters, LowEduc, and Jobs. With these three variables together, the r^2 value is .771. This is a fairly high r^2 value and means that these three variables together explain 77.1% of the variation in 911 calls. The equation for this output is 911 calls = Renters*.024 + LowEduc*.103 + Jobs*.004. All of these variables have a positive slope, so they all have a positive relationship with the number of 911 calls. Looking at the beta values, the variables can be ranked from most influential to least influential: LowEduc, Jobs, Renters. Using this data, the residuals can be mapped. This can be seen below in figure 12. This map shows areas in the
| figure 12. Mapped residuals for Portland OR |
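As a rough illustration of how the stepwise equation reported above can be used to predict calls and find a residual, here is a short Python sketch. The coefficients come from the equation in the text, which lists no intercept, so none is used here. The tract values and the observed call count are hypothetical, and the function names are my own.

```python
# Coefficients as written in the stepwise equation in the text.
# The equation in the text shows no intercept, so none is included.
COEFFICIENTS = {"Renters": 0.024, "LowEduc": 0.103, "Jobs": 0.004}

def predict_calls(tract):
    """Predicted 911 calls for one tract from the stepwise equation."""
    return sum(COEFFICIENTS[name] * tract[name] for name in COEFFICIENTS)

def residual(observed_calls, tract):
    """Observed minus predicted; positive means more calls than the model predicts."""
    return observed_calls - predict_calls(tract)

# Hypothetical tract (NOT taken from the Portland data).
tract = {"Renters": 500, "LowEduc": 200, "Jobs": 1000}
observed = 45

predicted = predict_calls(tract)
print(f"predicted calls: {predicted:.1f}")
print(f"residual: {residual(observed, tract):.1f}")
```

A tract with a large positive residual is one where the model underpredicts 911 calls, which is the pattern the residual map is used to find.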
Conclusion:
This part of the assignment helps explain regression, residuals, and multiple regression with Portland, OR as the example. The goal was to identify locations where a new hospital should be built. Based on the regression analysis and the map of the residuals, the new hospital should be located right in the middle of these census tracts. The model underpredicts the number of 911 calls in this area, meaning there are more 911 calls here than the model says there should be. The census tracts in the middle show the greatest underprediction, so the hospital should go there rather than in an area the model overpredicts, such as the outer census tracts.


