We were given an Excel sheet of data and were asked to analyze it in three parts over seven weeks. The purpose of this analysis is to determine several factors, which are marked in boldface throughout the report. The following is my attempt at this analysis.
Part 1: Due Week 3
I was asked to use the survey data to calculate the descriptive statistics below.
Descriptive Statistics:
Statistic | Annual Household Income ($1,000s) | Home Price ($1,000s) | No. Bedrooms | Household size |
Count | 100 | 100 | 100 | 100 |
Mean | 131.313 | 289.991 | 3.18 | 3.63 |
Variance | 655.021 | 1,261.221 | 0.77 | 1.81 |
Standard Deviation | 25.593 | 35.514 | 0.88 | 1.35 |
Minimum | 72.4 | 210.3 | 2.00 | 1.00 |
Maximum | 261.75 | 381.5 | 5.67 | 7.00 |
Range | 189.35 | 171.2 | 3.67 | 6.00 |
1st quartile | 118.813 | 271.300 | 2.50 | 3.00 |
Median | 131.350 | 287.300 | 3.00 | 3.50 |
3rd quartile | 141.675 | 306.900 | 4.00 | 4.25 |
Interquartile Range | 22.863 | 35.600 | 1.50 | 1.25 |
Mode | 138.100 | 301.600 | 2.50 | 3.00 |
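As a check on how the measures in the table relate to one another, they can be computed for a single variable with a short Python sketch. The function name `describe` and the sample list are hypothetical and are not the survey data. The sketch assumes the sample (n - 1) variance and the inclusive quartile method, which may not be the exact settings used in the spreadsheet.

```python
import statistics


def describe(values):
    """Return the summary measures from the table for one variable."""
    # Inclusive quartiles treat the data as containing its own min and max.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # square root of the variance
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,                           # interquartile range
        "mode": statistics.mode(values),
    }


# Hypothetical household sizes, NOT the survey data.
sample = [2, 3, 3, 4, 5, 3, 1, 4]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The range is the maximum minus the minimum, the interquartile range is the 3rd quartile minus the 1st quartile, and the standard deviation is the square root of the variance, which is the same pattern the table follows.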
Race: See Appendix 1
Info:
Mean annual household income: $131,313
Mean home price: $289,991
Mean number of bedrooms: 3.18
Mean household size: 3.63
Standard deviation of annual household income: $25,593
Standard deviation of home price: $35,514
Standard deviation of number of bedrooms: 0.88
Standard deviation of household size: 1.35
Heart Disease:
Had Heart Disease | Frequency | Relative Frequency |
No | 42 | 42% |
Yes | 58 | 58% |
Grand Total | 100 | 100% |
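The frequency table can be reproduced with a small Python sketch. The list of responses is rebuilt from the reported counts (42 No, 58 Yes) rather than read from the original spreadsheet, and the variable names are my own.

```python
from collections import Counter

# Rebuild the responses from the counts in the table (42 No, 58 Yes).
responses = ["No"] * 42 + ["Yes"] * 58

counts = Counter(responses)
total = sum(counts.values())

# Relative frequency = frequency / total number of households.
relative = {answer: count / total for answer, count in counts.items()}

for answer in sorted(counts):
    print(f"{answer}: {counts[answer]} ({relative[answer]:.0%})")
print(f"Grand Total: {total}")
```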
Section 2: Frequency Chart. See the attached Excel file (I was unable to copy and paste the chart into the assignment).
Part 2: Due Week 5
- Test the hypothesis that the average household income in the township is greater than $100,000. Hint: Start by stating the null and alternative hypotheses.
The null and alternative hypotheses are:
H_{0}: The average household income in the township is equal to $100,000: μ = 100
H_{a}: The average household income in the township is greater than $100,000: μ > 100
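I chose α = 0.05 as my level of significance.
Since n > 30, I used a z-test for a single mean. The critical value for this right-tailed test is 1.645.
The rejection region for this test is z > 1.645. Using the sample mean (131.313), the standard deviation (25.593), and n = 100 from Part 1, the test statistic is z = (131.313 - 100) / (25.593 / √100) ≈ 12.24. This falls in the rejection region, so I reject H_{0} and conclude that the average household income in the township is greater than $100,000.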
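The z-test described above can be sketched in Python from the summary statistics in Part 1. The variable names are my own, and the sketch assumes that the sample standard deviation is an acceptable stand-in for the population standard deviation, which is the usual large-sample argument when n > 30.

```python
import math
from statistics import NormalDist

# Summary statistics from Part 1, in thousands of dollars.
sample_mean = 131.313
sample_sd = 25.593
n = 100

mu_0 = 100.0   # hypothesized mean under H0
alpha = 0.05   # level of significance

# Standard error of the mean and the z test statistic.
standard_error = sample_sd / math.sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Right-tailed critical value: the point with area alpha to its right.
z_critical = NormalDist().inv_cdf(1 - alpha)

reject_h0 = z > z_critical
print(f"z = {z:.3f}, critical value = {z_critical:.3f}, reject H0: {reject_h0}")
```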
- Construct a 95% CI for the proportion of households with a family history of heart disease, separately for each race. Would you say the proportion of households with a history of heart disease differs by race?
Race | Had Heart Disease: No | Had Heart Disease: Yes | Grand Total |
Black | 9 | 20 | 29 |
Hispanic | 7 | 14 | 21 |
Caucasian | 26 | 24 | 50 |
Grand Total | 42 | 58 | 100 |
The 95% CIs for the proportion of households with a family history of heart disease are as follows:
Black: sample proportion 0.69, sample size 29, lower limit 0.52, upper limit 0.86
Hispanic: sample proportion 0.67, sample size 21, lower limit 0.47, upper limit 0.87
White: sample proportion 0.48, sample size 50, lower limit 0.34, upper limit 0.62
The three intervals overlap, so these data do not show that the proportion of households with a history of heart disease differs by race.
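The interval limits can be reproduced from the counts in the table with a short Python sketch. It assumes the normal-approximation (Wald) interval, p ± z·sqrt(p(1 - p)/n), which matches the limits reported here to two decimals. The function name `wald_interval` is my own. Checking whether the intervals overlap is only an informal comparison, not a formal test of a difference between groups.

```python
import math
from statistics import NormalDist

# (households with heart disease, households surveyed), from the table.
counts = {"Black": (20, 29), "Hispanic": (14, 21), "Caucasian": (24, 50)}


def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


intervals = {race: wald_interval(*pair) for race, pair in counts.items()}
for race, (p_hat, lower, upper) in intervals.items():
    print(f"{race}: p = {p_hat:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")

# Informal check: do all three intervals share a common region?
highest_lower = max(lower for _, lower, _ in intervals.values())
lowest_upper = min(upper for _, _, upper in intervals.values())
all_overlap = highest_lower < lowest_upper
print(f"All intervals overlap: {all_overlap}")
```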
Part 3: Due Week 7
You believe that income differs by race.
- Formulate the null and alternative hypotheses you would use for this claim.
The null and alternative hypotheses are as follows:
H_{0}: There is no difference in mean household income by race: μ_{1} = μ_{2} = μ_{3}
H_{a}: Mean household income differs for at least one race
- Conduct the test and appropriately accept or reject your null hypothesis using ANOVA
I chose α = 0.05 as the level of significance.
ANOVA: Single Factor
Summary:
Race | Count | Sum | Average | Variance |
Black | 29 | 3794.75 | 130.853 | 803.700 |
Hispanic | 21 | 2782.35 | 132.493 | 523.546 |
White | 50 | 6554.2 | 131.084 | 649.685 |
ANOVA
Source of Variation | SS | df | MS | F | P-value | F crit |
Between Groups | 37.98 | 2 | 18.99 | 0.028 | 0.9720 | 3.090 |
Within Groups | 64809.09 | 97 | 668.13 |
Total | 64847.07 | 99 |
- Is there any statistically significant difference in the average household income based on race?
The table above gives the test statistic F(2, 97) = 0.028 with p = 0.972. Because F is below the critical value of 3.090 and p is greater than α = 0.05, I fail to reject the null hypothesis. The data provide insufficient evidence to support the claim that average household income differs by race.
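The ANOVA table can be rebuilt from the Summary table alone with the Python sketch below. The variable names are my own. Because the group means are rounded to three decimals, the between-groups sum of squares may differ slightly from the 37.98 in the table. The p-value line uses a closed form for the F distribution that holds only when the numerator has exactly 2 degrees of freedom, as it does here. It is not a general-purpose formula.

```python
# (count, mean, variance) per group, from the Summary table.
groups = {
    "Black": (29, 130.853, 803.700),
    "Hispanic": (21, 132.493, 523.546),
    "White": (50, 131.084, 649.685),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between groups: spread of the group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within groups: pooled spread inside each group, (n - 1) * sample variance.
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1       # 2
df_within = n_total - len(groups)  # 97

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

# Right-tail probability of F; this closed form is valid only for 2 numerator df.
assert df_between == 2
p_value = (df_within / (df_within + df_between * f_stat)) ** (df_within / 2)

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.3f}")
```

With more than three groups, the p-value would need a statistics library or an F table instead of the closed form.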
Appendix 1: Race. Use a pie chart for this.