In this exercise, you will run a simultaneous multiple regression analysis to predict the women’s level of depression (scores on the CESD depression scale, cesd) based on several demographic characteristics, socioeconomic characteristics, health status, and self-reported incidence of abuse in the prior year. The list of predictors is as follows: the two race/ethnicity dummy variables you created in Exercise B2; age; educatn (educational attainment); worknow (a dummy variable for currently employed); nabuse (number of different types of abuse experienced in the past year, including verbal abuse, efforts to control, threats of harm, and physical abuse); and poorhlth (a dummy-coded variable indicating self-reported poor health at the time of the interview). Bring up the regression dialog box by selecting Analyze ➞ Regression ➞ Linear. Insert the variable cesd in the box labeled Dependent. Insert the seven predictor variables just listed into the box for Independent(s). Make sure that Method is set to “Enter,” the command for entering all predictors simultaneously. Click the Statistics pushbutton and then select the following options: Estimates (under “Regression Coefficients”); Model Fit; Descriptives; and Collinearity Diagnostics. Then click Continue, and then OK to run the analysis. Answer the following questions: (a) How large is the sample on which the regression analysis was run? (b) Interpret the mean value for poor health self-rating. (c) Which predictor has the highest zero-order correlation with cesd? (d) What were the values of R² and adjusted R²? (e) Which predictors in the analysis were significantly predictive of the women’s depression scores, once other predictors were included? Which were not significantly predictive? (f) For this sample of women, which predictor variable appeared to be the most powerful in predicting depression? (g) Did any of the tolerance levels suggest a problem with multicollinearity?
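For readers who want to see what lies behind the R², adjusted R², and tolerance values in the output, the following is a minimal Python sketch using only the standard library. It is not how SPSS computes these statistics internally. It assumes complete cases (no missing values) and no constant predictor, and the function names and the tiny dataset are invented purely for illustration; they are not values from Polit2SetC.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 from a least-squares fit of y on the predictor columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, k):
    """Adjust R^2 for sample size n and number of predictors k."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def tolerances(predictors):
    """Tolerance of predictor j = 1 - R^2 of predictor j regressed on all the others."""
    return [
        1.0 - r_squared(p, predictors[:j] + predictors[j + 1:])
        for j, p in enumerate(predictors)
    ]


# Invented illustration data: two predictors and an outcome for six cases.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [5.0, 9.0, 8.0, 13.0, 12.0, 17.0]

r2 = r_squared(y, [x1, x2])
adj_r2 = adjusted_r_squared(r2, n=len(y), k=2)
tol = tolerances([x1, x2])
```

Because adjusted R² penalizes each added predictor, it is never larger than R². A tolerance near 0 means that a predictor is almost fully explained by the other predictors, which is the situation question (g) asks you to check for.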
Now you can create the new dummy-coded variables. Select Transform ➞ Recode ➞ Into Different Variables. Find racethn in the variable list and move it into the slot for “Numeric Variable ➞ Output Variable.” On the right, under Output Variable, type in a name for your first new variable (e.g., black, afamer). In the slot for Label, you can type in a longer label, such as “African American,” if you so choose. Then click Change, which will confirm the new variable as the output variable. Next, click the “Old and New Values” button, which will bring up a new dialog box. Under “Old Value,” enter 1, which is the code for African Americans in the original (“old”) variable racethn. Then, on the right under New Value, click “Copy old value,” and then click the Add button. Women who were coded 1 on racethn will also be coded 1 on the new African-American variable. Next, under Old Value, click Range and enter 2 and then “through” 3 (the two codes for Hispanic and White/other women on racethn). On the right under New Value, enter 0, then click Add, which will code all non–African-American women as 0. Finally, under Old Value, click “System- or user-missing,” and then under New Value click “System-missing,” then click Add. Women who have missing data for racethn will now have a missing values code for the new variable. Click Continue to go back to the original dialog box, then click OK to run the command. If you look at the last variable in your file in Data View, you should see the new variable with values of .00 and 1.00 (which can be changed to 0 and 1 by going into Variable View and changing the number of decimals to 0). Follow the analogous procedure for the next new race/ethnic group, but remember to change which original values are coded 1 and which are coded 0. Now, run frequencies on your new variables, and compare the results to those from Exercise B1, making sure that your new variables accurately reflect the original racial/ethnic distribution.
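The recoding logic described above can be sketched outside SPSS in a few lines of Python. This is only an illustration of the rule (target code becomes 1, other valid codes become 0, missing stays missing). The function name dummy_code, the use of None for missing values, the assumption that code 2 stands for Hispanic, and the short list of codes are all hypothetical and are not taken from the dataset.

```python
def dummy_code(values, target):
    """Return 1 where the value equals target, 0 for other valid codes, None if missing."""
    return [None if v is None else int(v == target) for v in values]


# Invented racethn codes for five women; None marks a missing value.
racethn = [1, 2, 3, None, 1]

afamer = dummy_code(racethn, target=1)    # 1 = African American
hispanic = dummy_code(racethn, target=2)  # assumes 2 = Hispanic

# Check analogous to comparing frequencies with Exercise B1:
# the number of 1s on each dummy should match the count of the original code.
assert afamer.count(1) == racethn.count(1)
assert hispanic.count(1) == racethn.count(2)
```

The third group needs no variable of its own: a woman coded 0 on both dummy variables must belong to the omitted reference category.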
For these exercises, you will be using the SPSS dataset Polit2SetC to do multiple regression analyses to predict level of depression in the sample of low-income urban women. You will need to begin by dummy coding the variable race/ethnicity. First, create a frequency distribution for racethn (Analyze ➞ Descriptive Statistics ➞ Frequencies), then answer these questions: (a) What percentages of women in this sample were African American, Hispanic, and White or other? (b) Are there any women whose information for racethn is missing? (c) How many new variables need to be created so that race/ethnicity can be used in regression analyses? (d) Which category do you think should be omitted?
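As a rough illustration of what the Frequencies procedure reports, the sketch below tallies counts, valid percentages (based on nonmissing cases only), and the number of missing cases. The function name frequency_table, the use of None for missing values, and the list of codes are made up for this example and do not come from Polit2SetC.

```python
from collections import Counter


def frequency_table(values):
    """Return ({code: (count, valid_percent)}, n_missing), treating None as missing."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    table = {
        code: (n, 100.0 * n / len(valid))
        for code, n in sorted(counts.items())
    }
    return table, len(values) - len(valid)


# Invented racethn codes for eight women; None marks a missing value.
racethn = [1, 1, 2, 3, 1, 2, None, 1]
table, n_missing = frequency_table(racethn)

for code, (n, pct) in table.items():
    print(f"code {code}: n = {n}, valid percent = {pct:.1f}")
print(f"missing: {n_missing}")
```

With three categories, two dummy variables carry all of the group information, because membership in the third group is implied when both dummies equal 0.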