In this dataset you should find an ESRI-readable coverage of
Comments
This is the ‘stripped down’ version of the lab. All that this document includes is the questions you are supposed to answer for the lab. This document does not include “Betty Crocker” instructions as to how to do the lab (i.e. the data manipulations, GIS commands, and JMP analyses). You will learn more if you figure out how to do this on your own or via collaboration with other students in the class. “Betty Crocker” explanations have been produced by students over the years and are available via the course web site. You will be involved in producing a “Betty Crocker” set of instructions for one of these labs. These are made for your use but they are an “as is” product. The seventeen questions start on the next page. Good luck.
1) Were the home locations of the respondents to this survey spatially random?
(you don’t have to do the analysis on this question, just answer it, & explain how you would have done the analysis if you had to)
2) Were the home locations of the respondents to this survey random with respect to population density? (you do have to do the analysis for this one)
3) What level of spatial aggregation (tracts or block-groups) is more appropriate for answering question #2? Explain.
4) How would you test the following?: a) Is the age distribution of the respondents to this survey significantly different than the age distribution of the population of
5) What kinds of problems do you run into when trying to answer the questions posed in #4?
6) Are age and income independent among the respondents to this survey? Would you expect them to be? Is the test you used parametric or non-parametric?
Simple Demographic Comparisons (7-11)
7) Based on the responses to the question ‘Abortion should remain legal as defined in “Roe v. Wade”?’, are Democrats significantly more ‘Pro-Choice’ than Republicans?
8) In a similar vein, are women more ‘Pro-Choice’ than men (according to this survey)?
9) In a separate survey I found that women were more ‘Pro-Choice’ than men and that Catholic women were significantly ‘More, more Pro-Choice’ than Catholic men. Is this true of the respondents to this survey? How did you test that? If you did find the gap between Catholic men and women significantly greater than the gap between men and women in general, what would a statistician call such a phenomenon? If it were true, how would you explain it?
10) Are Republicans different from non-Republicans in their responses to any of the questions about immigration?
11) Is there any relationship between ‘Religiosity’ and responses to the question: ‘The earth has a finite supply of natural resources such as water, arable land, etc. which imposes a limit on the number of people which can sustainably live on it.’?
Factor Analysis (12-14).
Factor analysis is a data reduction technique that allows you to ‘compress’ your analysis. As you can imagine we could ask hundreds, if not thousands of questions of this dataset (e.g. are men different than women on questions 1-50, are Catholics different than Protestants on questions 1-50, there’s a hundred right there). However, as you can imagine, people will have similar responses to many of the questions. Factor analysis allows you to capture the co-variance that usually exists between questions. For example, there are 5 questions about immigration or immigration policy; an anti-immigration person will most likely respond in a similar manner to all the questions; consequently, only one question might be necessary to ‘capture’ such a response. Factor analysis is a means of ‘capturing’ this co-variance between questions and ‘reducing’ a many-question survey to a few factors. Labeling or ‘Naming’ these factors is one of the ‘arts’ of statisticians. It is now your turn to practice this ‘art’.
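The ‘compression’ described above can be illustrated with a minimal sketch. It is not the procedure used in class or in JMP: it assumes made-up responses to five hypothetical questions driven by two hypothetical underlying attitudes, and it uses a plain principal-component style extraction from the correlation matrix with no rotation. A statistics package may use a different extraction method and a rotation step, so its numbers will not match this sketch.

```python
import numpy as np


def principal_factor_loadings(responses, n_factors):
    """Return loadings and eigenvalues from the correlation matrix.

    This is a principal-component style extraction with no rotation,
    used here only to illustrate the idea of shared co-variance.
    """
    corr = np.corrcoef(responses, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]    # largest first
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    return loadings, eigvals[order]


# Hypothetical data: 500 respondents, two made-up latent attitudes.
rng = np.random.default_rng(0)
n = 500
immigration = rng.normal(size=n)
government = rng.normal(size=n)


def noise():
    """Question-specific variation unrelated to either attitude."""
    return rng.normal(scale=0.5, size=n)


# Three questions track the first attitude, two track the second.
responses = np.column_stack([
    immigration + noise(),
    immigration + noise(),
    immigration + noise(),
    government + noise(),
    government + noise(),
])

loadings, eigvals = principal_factor_loadings(responses, n_factors=2)

# Flag questions whose loading on a factor is 0.40 or more in size.
high = np.abs(loadings) >= 0.40

for factor in range(loadings.shape[1]):
    questions = [q + 1 for q in range(loadings.shape[0]) if high[q, factor]]
    print(f"Factor {factor + 1}: questions {questions}")
```

In this made-up example the five questions reduce to two factors because the questions within each group co-vary strongly. Naming the factors is then a matter of reading the questions that load heavily on each one.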
12) Run a factor analysis on the responses to the questions appropriate to such an analysis (we’ll decide these in class). Be sure to save each respondent’s factor scores on each of Factors 1-5. For each of Factors 1-5, list the questions with a factor contribution score of 0.40 or more and study the questions that contributed to that factor. As a result of this study, provide a name for each of the first five factors. (FYI, potential ‘names’ for factors when I analyzed this survey were: “Faith in Government”, “Belief in Adam Smith’s ‘invisible hand’”, and “Keep those Mexicans out of
13) Do all the statistical tests necessary to fill out the table below. Put an asterisk (*) in the cells that indicate any significant differences in factor scores between demographic variables. For each asterisk, provide a detailed description of the nature of the significant differences and some guess as to an explanation for the differences. Your ‘guess’ is referred to as ‘theory’ in academia. If you are really fired up about this exercise, find references to support your theory.
Significant Factor Score Differences (*) | Sex | Pol. Party | Religion | Religiosity | Income | Education | Race/Ethnicity |
Factor 1: | | | | | | | |
Factor 2: | | | | | | | |
Factor 3: | | | | | | | |
Factor 4: | | | | | | | |
Factor 5: | | | | | | | |
14) Did filling out the table and answering the questions of #13 make you appreciate factor analysis? (Explain. If your answer is ‘No’ stop by my office for a spanking :) ).
Spatial Analysis: Where’s the Geography?
So far, the analyses performed could have been done in a sociology department. The only ‘geographic’ analyses were the questions about randomness of the survey respondents with respect to population density and space. True spatial analysis of surveys can shed light on interesting questions about the location of the respondent’s home or workplace relative to questions in the survey. For example: Does the distance of a respondent’s home and/or work location influence their likelihood to support or use a light rail public transportation system? Does the population density of their home location or home city covary with attitudes about population growth and policy? Does the Hispanic proportion of their home neighborhood have any influence on their attitudes about
14) Test for any significant differences/variation in each of the factor scores (1-5) with respect to the population density and percent non-white of the respondents’ home locations. If you find any significant differences, provide an explanation.
15) Another test you could do is to test for increased variance in response based on a geographic attribute. For example, suppose that people’s responses to the 5 immigration questions became increasingly extreme (i.e. more 1’s (strongly agree) and 5’s (strongly disagree)) but the mean remained the same as the Hispanic proportion of the population in the respondent’s home location increased. What kind of statistical test would you use to look for that, and if it proved significant, what would the explanation be?
General Questions
16) Describe 10 specific problems related to this little research project. Things to consider: the sampling frame was registered voters whereas the census data covered the total population, non-response bias, etc.
17) Are these problems significant enough to invalidate any or all of the findings from an analysis of this data?