Part 1: Preliminary Report 1 DUE WEEK 3 – No Grade Given – Pass/Fail – OPTIONAL – NO LATE ASSIGNMENTS ACCEPTED
(i) Describe the household data using the appropriate descriptive statistics technique for each variable. Hint: Calculate the mean, median, mode, proportion, range, and standard deviation, etc. as appropriate.
Table 1 contains the descriptive statistics for all the quantitative variables present in the data.
Descriptive statistics |
Statistic | Annual Household Income ($000) | Home Price ($000) | No. Bedrooms | Household Size |
Count | 100 | 100 | 100 | 100 |
Mean | 131.313 | 289.991 | 3.18 | 3.63 |
Variance | 655.021 | 1,261.221 | 0.77 | 1.81 |
Standard Deviation | 25.593 | 35.514 | 0.88 | 1.35 |
Minimum | 72.4 | 210.3 | 2.00 | 1.00 |
Maximum | 261.75 | 381.5 | 5.67 | 7.00 |
Range | 189.35 | 171.2 | 3.67 | 6.00 |
1st quartile | 118.813 | 271.300 | 2.50 | 3.00 |
Median | 131.350 | 287.300 | 3.00 | 3.50 |
3rd quartile | 141.675 | 306.900 | 4.00 | 4.25 |
Interquartile Range | 22.863 | 35.600 | 1.50 | 1.25 |
Mode | 138.100 | 301.600 | 2.50 | 3.00 |
Table 1
Income and home price are recorded in thousands of dollars. The mean annual household income is $131,313, the mean home price is $289,991, the mean number of bedrooms is 3.18, and the mean household size is 3.63.
The median annual household income is $131,350, the median home price is $287,300, the median number of bedrooms is 3.00, and the median household size is 3.50.
The standard deviation of annual household income is $25,593, the standard deviation of home price is $35,514, the standard deviation of the number of bedrooms is 0.88, and the standard deviation of household size is 1.35.
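The entries in Table 1 are linked by simple identities: the standard deviation is the square root of the variance, the range is maximum minus minimum, and the interquartile range is the third quartile minus the first. The following is a minimal sketch that checks those identities using only the values printed in Table 1. The dictionary keys, the `check` helper, and the 0.01 rounding tolerance are illustrative choices, not part of the original report.

```python
import math

# Values copied from Table 1 (income and home price as tabulated).
table1 = {
    "Annual Household Income": {"var": 655.021, "sd": 25.593, "min": 72.4, "max": 261.75,
                                "range": 189.35, "q1": 118.813, "q3": 141.675, "iqr": 22.863},
    "Home Price": {"var": 1261.221, "sd": 35.514, "min": 210.3, "max": 381.5,
                   "range": 171.2, "q1": 271.300, "q3": 306.900, "iqr": 35.600},
    "No. Bedrooms": {"var": 0.77, "sd": 0.88, "min": 2.00, "max": 5.67,
                     "range": 3.67, "q1": 2.50, "q3": 4.00, "iqr": 1.50},
    "Household Size": {"var": 1.81, "sd": 1.35, "min": 1.00, "max": 7.00,
                       "range": 6.00, "q1": 3.00, "q3": 4.25, "iqr": 1.25},
}

def check(stats, tol=0.01):
    """Return which identities hold, up to the rounding tolerance `tol`."""
    return {
        "sd = sqrt(variance)": abs(math.sqrt(stats["var"]) - stats["sd"]) <= tol,
        "range = max - min": abs((stats["max"] - stats["min"]) - stats["range"]) <= tol,
        "iqr = q3 - q1": abs((stats["q3"] - stats["q1"]) - stats["iqr"]) <= tol,
    }

results = {name: check(stats) for name, stats in table1.items()}
for name, outcome in results.items():
    print(name, outcome)
```

All three identities hold for every variable within rounding, so the table is internally consistent.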
Table 2 contains the frequency and relative frequency distributions of all the qualitative variables present in the data.
Race | Frequency | Relative frequency |
Black | 29 | 29.0% |
Hispanic | 21 | 21.0% |
White | 50 | 50.0% |
Grand Total | 100 | 100.0% |
Had Heart Disease? | Frequency | Relative frequency |
No | 42 | 42.0% |
Yes | 58 | 58.0% |
Grand Total | 100 | 100.0% |
Table 2
Of the 100 households in the sample, 29% are Black, 21% are Hispanic, and 50% are White.
Of the same 100 households, 58% reported a history of heart disease and 42% did not.
(ii) Construct a frequency table and histogram of household income in increments of $50,000 (for example, $0–$50,000, $50,001–$100,000, $100,001-$150,000, and so on).
Table 3 contains the frequency table of household income in increments of $50,000.
Frequency Distribution – Annual Household Income ($000) |
Lower | | Upper | Midpoint | Width | Frequency | Percent |
0 | < | 50 | 25 | 50 | 0 | 0.0 |
50 | < | 100 | 75 | 50 | 10 | 10.0 |
100 | < | 150 | 125 | 50 | 76 | 76.0 |
150 | < | 200 | 175 | 50 | 12 | 12.0 |
200 | < | 250 | 225 | 50 | 1 | 1.0 |
250 | < | 300 | 275 | 50 | 1 | 1.0 |
Total | | | | | 100 | 100.0 |
Table 3
Figure 1 contains the frequency histogram of household income in increments of $50,000.
Figure 1
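As a rough stand-in for the chart, the frequencies in Table 3 can be drawn as a text histogram. This is a minimal sketch that uses only the class limits and counts from Table 3. The `text_histogram` helper and the one-`#`-per-household scale are illustrative choices, not the method used to produce Figure 1.

```python
# Class limits (in $000) and frequencies copied from Table 3.
bins = [
    (0, 50, 0),
    (50, 100, 10),
    (100, 150, 76),
    (150, 200, 12),
    (200, 250, 1),
    (250, 300, 1),
]

def text_histogram(bins):
    """Build one text line per income class, one '#' per household."""
    total = sum(freq for _, _, freq in bins)
    lines = []
    for lower, upper, freq in bins:
        percent = 100.0 * freq / total
        label = f"{lower:>3} to <{upper:<3}"
        lines.append(f"{label} | {'#' * freq} {freq} ({percent:.1f}%)")
    return lines

histogram_lines = text_histogram(bins)
print("\n".join(histogram_lines))
```

The bars show that most households fall in the $100,000 to $150,000 class, with a thin right tail above $200,000.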
Part 2: Preliminary Report 2 DUE WEEK 5 – No Grade Given – Pass/Fail – OPTIONAL – NO LATE ASSIGNMENTS ACCEPTED
You have already computed sample statistics for the data.
(i) Test the hypothesis that the average household income in the township is greater than $100,000. Hint: Start by stating the null and alternative hypotheses.
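One way to set up this test is as a one-sided, one-sample test of H0: mean income ≤ $100,000 against H1: mean income > $100,000, using the summary statistics in Table 1 (n = 100, mean 131.313, standard deviation 25.593, in thousands of dollars). The following is a minimal sketch under stated assumptions. The 0.05 significance level is assumed because the source does not state one, and the standard normal distribution is used as a large-sample approximation to the t distribution with 99 degrees of freedom.

```python
import math
from statistics import NormalDist

# Summary statistics from Table 1, in thousands of dollars.
n = 100
sample_mean = 131.313
sample_sd = 25.593
mu0 = 100.0    # hypothesized mean under H0 ($100,000)
alpha = 0.05   # assumed significance level; not stated in the source

# Test statistic: distance of the sample mean from mu0 in standard errors.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / standard_error

# Large-sample normal approximation for the one-sided critical value and p-value.
critical_value = NormalDist().inv_cdf(1 - alpha)
p_value_approx = 1.0 - NormalDist().cdf(t_stat)
reject_null = t_stat > critical_value

print(f"t = {t_stat:.2f}, critical value = {critical_value:.3f}, reject H0: {reject_null}")
```

The test statistic is roughly 12 standard errors above the hypothesized mean, far beyond the critical value, so under these assumptions the null hypothesis is rejected in favor of an average household income greater than $100,000.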
(ii) Construct a 95% CI for the proportion of households with a family history of heart disease, separately for each race. Would you say the proportion of households with a history of heart disease differs by race?
Table 4 contains the counts and row percentages of heart disease history by race.
Race | Had Heart Disease: No | Had Heart Disease: Yes | Grand Total |
Black | 9 (31%) | 20 (69%) | 29 |
Hispanic | 7 (33%) | 14 (67%) | 21 |
White | 26 (52%) | 24 (48%) | 50 |
Grand Total | 42 (42%) | 58 (58%) | 100 |
Table 4
Race | Sample Proportion | Sample Size | 95% CI Lower Limit | 95% CI Upper Limit |
Black | 0.69 | 29 | 0.52 | 0.86 |
Hispanic | 0.67 | 21 | 0.47 | 0.87 |
White | 0.48 | 50 | 0.34 | 0.62 |
Table 5
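Table 5 contains the 95% CI for the proportion of households with a family history of heart disease, separately for each race. The three intervals overlap substantially (Black 0.52 to 0.86, Hispanic 0.47 to 0.87, White 0.34 to 0.62), so the data do not provide clear evidence that the proportion differs by race, even though the sample proportion for White households (0.48) is lower than for Black (0.69) and Hispanic (0.67) households.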
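The limits in Table 5 are consistent with the normal-approximation (Wald) interval, p ± z·sqrt(p(1 − p)/n). The source does not name the method, so treating it as a Wald interval is an assumption. The following is a minimal sketch that recomputes the limits from the counts in Table 4. The `wald_interval` helper is illustrative.

```python
import math
from statistics import NormalDist

# (households with heart disease history, group size) from Table 4.
counts = {"Black": (20, 29), "Hispanic": (14, 21), "White": (24, 50)}

def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

intervals = {race: wald_interval(yes, n) for race, (yes, n) in counts.items()}
for race, (lower, upper) in intervals.items():
    print(f"{race}: {lower:.2f} to {upper:.2f}")
```

With groups as small as 21 and 29 households, the normal approximation is only rough. Overlapping intervals are also an informal comparison, not a formal test of equal proportions.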
Part 3: Final Report DUE WEEK 7
You believe that the income distribution differs by race.
(i) Formulate the null and alternative hypotheses you would use to test this claim.
(ii) Conduct the test and appropriately accept or reject your null hypothesis using ANOVA.
Table 6 contains the results of one-way ANOVA performed on annual household incomes by the variable race.
Anova: Single Factor |
SUMMARY |
Groups | Count | Sum | Average | Variance |
Black | 29 | 3794.75 | 130.853 | 803.700 |
Hispanic | 21 | 2782.35 | 132.493 | 523.546 |
White | 50 | 6554.2 | 131.084 | 649.685 |
ANOVA |
Source of Variation | SS | df | MS | F | P-value | F crit |
Between Groups | 37.98 | 2 | 18.99 | 0.028 | 0.9720 | 3.090 |
Within Groups | 64809.09 | 97 | 668.13 |
Total | 64847.07 | 99 |
Table 6
(iii) Is there any statistically significant difference in the average household income based on race?
Table 6 reports F(2, 97) = 0.028 with p = 0.972. Because the p-value is far above 0.05 and F is well below the critical value of 3.090, we fail to reject the null hypothesis: the data provide insufficient evidence of a statistically significant difference in average household income by race.
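The ANOVA rows in Table 6 can be reproduced from the SUMMARY rows alone. Here the null hypothesis is that the three race groups have the same mean household income, and the alternative is that at least one group mean differs. The following is a minimal sketch that rebuilds the sums of squares and the F statistic from each group's count, sum, and variance. The `one_way_anova` helper and its result keys are illustrative, and the p-value is not recomputed because it requires the F distribution, which the Python standard library does not provide.

```python
# Group summaries copied from Table 6: (count, sum, variance).
groups = {
    "Black": (29, 3794.75, 803.700),
    "Hispanic": (21, 2782.35, 523.546),
    "White": (50, 6554.2, 649.685),
}

def one_way_anova(groups):
    """One-way ANOVA table entries from per-group count, sum, and variance."""
    total_n = sum(n for n, _, _ in groups.values())
    grand_mean = sum(s for _, s, _ in groups.values()) / total_n
    # Between-groups: spread of the group means around the grand mean.
    ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s, _ in groups.values())
    # Within-groups: pooled spread inside each group.
    ss_within = sum((n - 1) * var for n, _, var in groups.values())
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "f_stat": f_stat,
    }

anova = one_way_anova(groups)
print({key: round(value, 3) for key, value in anova.items()})
```

The recomputed F statistic agrees with Table 6 to three decimals and sits far below the tabulated critical value of 3.090, which matches the decision not to reject the null hypothesis.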