Correlation and regression analysis (also called "least squares" analysis) helps us examine relationships among interval or ratio variables. As you will see, results of these two tests tell us slightly different things about the relationship between two variables. In this section, we will explore techniques for doing correlation and bivariate regression.
Correlation
How does education influence the types of occupations that people enter? One way to think about occupations is in terms of “occupational prestige.” Your data set includes a variable, PRESTG80, which assigns a prestige score to each respondent’s occupation; higher numbers indicate greater prestige.
Let’s hypothesize that as education increases, the level of prestige of one’s occupation also increases. To test this hypothesis, click on "Analyze," "Correlate," and "Bivariate." Click on EDUC, and then click the arrow to move it into the box. Do the same with PRESTG80.
The most widely used bivariate test is the Pearson correlation. It is intended to be used when both variables are measured at either the interval or ratio level, and each variable is normally distributed. In practice, however, we sometimes violate these assumptions. If we produce histograms of EDUC and PRESTG80, we will notice that neither is actually normally distributed. Furthermore, if we noted that PRESTG80 is really an ordinal measure, not an interval one, we would be correct. Nevertheless, most analysts would use the Pearson correlation because the variables are close to being normally distributed, the ordinal variable has many ranks, and the Pearson correlation is the one they are used to. SPSS includes another correlation test, Spearman’s rho, that is designed to analyze variables that are not normally distributed or that are ranked, as in PRESTG80. We will conduct both tests to see if our hypothesis is supported, and also to see how much the results differ depending on the test used – in other words, whether those who use the Pearson correlation on these types of variables are seriously off base.
In the dialog box, the box next to Pearson is already checked, as this is the default. To run both tests, also check the box next to Spearman, and then click "OK."
The correlation coefficient may range from –1 to 1, where –1 or 1 indicates a “perfect” relationship. The further the coefficient is from 0, regardless of whether it is positive or negative, the stronger the relationship between the two variables. Thus, a coefficient of .453 is exactly as strong as a coefficient of -.453. Positive coefficients tell us there is a direct relationship: when one variable increases, the other increases. Negative coefficients tell us that there is an inverse relationship: when one variable increases, the other one decreases.
Notice that the Pearson coefficient for the relationship between education and occupational prestige is .520, and it is positive. This tells us that, just as we predicted, as education increases, occupational prestige increases. But should we consider the relationship strong? At .520, the coefficient is only about half as large as is possible. It should not surprise us, however, that the relationship is not “perfect” (a coefficient of 1). Education appears to be an important predictor of occupational prestige, but no doubt you can think of other reasons why people might enter a particular occupation. For example, someone with a college degree may decide that they really want to be a cheese-maker, which has an occupational prestige score of only 29, while a high-school dropout may one day become the owner of a bowling alley, which has a prestige score of 44. Given the variety of factors that may influence one’s occupational choice, a coefficient of .520 suggests that the relationship between education and occupational prestige is actually quite strong.
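To make the meaning of the coefficient concrete, here is a minimal sketch in Python (not SPSS) that computes a Pearson coefficient by hand. The function name `pearson_r` and the eight data points are assumptions made up for illustration; they are not drawn from the data set used in this section.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only; not taken from the data set
educ = [8, 10, 12, 12, 14, 16, 16, 18]
prestige = [29, 44, 36, 42, 40, 51, 60, 66]

r_direct = pearson_r(educ, prestige)
# Negating one variable reverses the direction but leaves the strength unchanged
r_inverse = pearson_r(educ, [-p for p in prestige])

print(round(r_direct, 3), round(r_inverse, 3))
```

The two printed values have the same magnitude and opposite signs, which is the sense in which .453 and -.453 are equally strong.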
The correlation matrix also gives a significance level, labeled Sig. (2-tailed). Loosely speaking, this is the probability of being wrong if we assume that the relationship we find in our sample reflects a relationship between education and occupational prestige in the total population from which the sample was drawn. More precisely, it is the probability of obtaining a coefficient at least this far from 0 in a sample if there were no relationship at all in the population. The probability value is .000 (remember that the value is rounded to three digits, so this means the probability is below .0005, not that it is zero), which is well below the conventional threshold of .05. Thus, our hypothesis is supported. There is a relationship (the coefficient is not 0), it is in the predicted direction (positive), and we can generalize the results to the population (p < .05).
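One way to build intuition for the significance value is a permutation test, sketched below in Python. This is a swapped-in technique, not the calculation SPSS performs (SPSS derives its value from a theoretical distribution rather than by shuffling). The function names, the number of shuffles, and the data are all assumptions for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-tailed p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real link between x and y
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly 0
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical values for illustration only; not taken from the data set
educ = [8, 10, 12, 12, 14, 16, 16, 18]
prestige = [29, 44, 36, 42, 40, 51, 60, 66]

p_value = permutation_p_value(educ, prestige)
print(p_value < 0.05)
```

Note that the estimate can be very small but never exactly zero, which mirrors why a displayed value of .000 should be read as a rounded figure.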
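Recall that we had some concerns about using the Pearson coefficient, given that PRESTG80 is measured as an ordinal variable. Now look at the Spearman’s rho results. Notice that the coefficient, .523, is nearly identical to the coefficient obtained using the Pearson correlation (.520). At least for these two variables, using the Pearson correlation does not lead us seriously astray.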
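The close agreement is easier to understand once we see that Spearman’s rho is simply the Pearson correlation computed on ranks rather than on raw values. The following Python sketch illustrates this under stated assumptions: the helper names `average_ranks` and `spearman_rho` are hypothetical, tied values receive the average of their ranks, and the data are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical values for illustration only; not taken from the data set
educ = [8, 10, 12, 12, 14, 16, 16, 18]
prestige = [29, 44, 36, 42, 40, 51, 60, 66]

print(round(pearson_r(educ, prestige), 3), round(spearman_rho(educ, prestige), 3))
```

When the relationship between two variables is roughly linear and there are no extreme outliers, ranking changes little, so the two coefficients tend to land close together.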