Probability under null | 0.659 | 0.660 | 0.661 | \(\cdots\) | 0.717 | 0.718 | 0.719 |
---|---|---|---|---|---|---|---|
Two-sided p-value | 0.0388 | 0.0453 | 0.0514 | \(\cdots\) | 0.0517 | 0.0458 | 0.0365 |
Plausible? (\(\alpha = 0.05\)) | No | No | Yes | \(\cdots\) | Yes | No | No |
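A table like this can be approximated in R by testing each candidate null value in turn. The sketch below rests on an assumption about method: it uses exact binomial p-values from `binom.test()` rather than the applet's simulation, so its p-values will not match the table digit for digit. The counts (714 successes out of 1034) are the ACA data used in these notes; the grid of candidate values is an arbitrary choice.

```r
# Scan candidate values of pi and keep those NOT rejected at alpha = 0.05.
x <- 714       # observed number of successes (ACA data)
n <- 1034      # sample size
alpha <- 0.05

# Candidate null values to test (grid spacing chosen for illustration)
null_values <- seq(0.64, 0.74, by = 0.001)

# Two-sided exact binomial p-value for each candidate null value
p_values <- sapply(null_values, function(p0) binom.test(x, n, p = p0)$p.value)

# A value is "plausible" if its p-value exceeds alpha
plausible <- null_values[p_values > alpha]
plausible_range <- range(plausible)
print(plausible_range)
```

The smallest and largest plausible values play the role of the endpoints of a 95% confidence interval.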
When a null distribution is bell-shaped, about 95% of the statistics will fall within 2 standard deviations of the mean with the other 5% outside this region.
\[ \hat{p} \pm 2 \times \text{SD}_\text{null} \]
Use the One Proportion Applet to generate a null distribution for \(H_0 : \pi = 0.5\), and record \(\text{SD}_\text{null} \approx 0.016\).
Using the 2SD method on our ACA data we get a 95% confidence interval \[ 0.69 \pm 2(0.016) = 0.69 \pm 0.032 = (0.658, 0.722) \]
The 0.032 above is called the margin of error.
This interval is close to what we got using plausible values: (0.661, 0.717).
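The 2SD calculation can also be sketched in R without the applet. This is a rough stand-in for the applet's simulation, not its actual algorithm: the number of repetitions and the random seed are arbitrary assumptions, so the simulated SD will vary slightly from run to run.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n <- 1034          # sample size (ACA data)
p_hat <- 0.69      # observed sample proportion
reps <- 10000      # number of simulated samples (assumed)

# Simulate sample proportions under H0: pi = 0.5
null_props <- rbinom(reps, size = n, prob = 0.5) / n
sd_null <- sd(null_props)

# 2SD interval: statistic plus or minus 2 * SD of the null distribution
margin_of_error <- 2 * sd_null
interval_2sd <- c(p_hat - margin_of_error, p_hat + margin_of_error)

print(round(sd_null, 3))
print(round(interval_2sd, 3))
```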
\[ \hat{p} \pm \text{(multiplier)} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
1-sample proportions test with continuity correction
data: 714 out of 1034, null probability 0.5
X-squared = 149.37, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.6611898 0.7184201
sample estimates:
p
0.6905222
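R's interval differs slightly from the 2SD interval because R uses a somewhat more accurate theory-based method (a score interval with a continuity correction, as the output header indicates).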
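To connect the multiplier formula to R's output, the sketch below computes the interval by hand, using the standard error \(\sqrt{\hat{p}(1-\hat{p})/n}\) and the normal multiplier `qnorm(0.975)` (about 1.96), and then calls `prop.test()` on the same counts. The variable names are illustrative only.

```r
x <- 714     # successes (ACA data)
n <- 1034    # sample size
p_hat <- x / n

# Hand computation: statistic +/- multiplier * standard error
se <- sqrt(p_hat * (1 - p_hat) / n)
multiplier <- qnorm(0.975)            # about 1.96 for 95% confidence
wald_interval <- p_hat + c(-1, 1) * multiplier * se

# R's built-in interval (score interval with continuity correction)
r_interval <- as.numeric(prop.test(x, n)$conf.int)

print(round(wald_interval, 4))
print(round(r_interval, 4))
```

The two intervals agree to about two decimal places; the small gap comes from the different methods, not from an arithmetic mistake.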
The Gallup organization conducted a survey with a random sample of 1,019 adult Americans on December 10-12, 2010. They found that 80% of the respondents agreed with the statement that the United States has a unique character that makes it the greatest country in the world.
In the WeBWorK preview, we found that \(\pi = 0.775\) was plausible, but \(\pi = 0.5\) was not. We also recorded \(\text{SD}_\text{null} = 0.016\) using \(\pi = 0.5\) and \(\text{SD}_\text{null} = 0.013\) using \(\pi = 0.775\).
2SD method. We can construct a 95% confidence interval of plausible values for a parameter by including all values that fall within 2 standard deviations of the sample statistic.
\[ \mbox{observed statistic} \pm 2 \times (\mbox{SD of null distribution}) \]
In a random sample of 1,019 adult Americans, 80% of the respondents agreed with the statement that the United States has a unique character that makes it the greatest country in the world.
Determine a 95% confidence interval using the 2SD method. Use \(\pi = 0.5\) to simulate a null distribution to get \(\text{SD}_\text{null}\). Record the confidence interval in \(\pm\) notation.
Interpret the confidence interval in the context of this problem: We are 95% confident that _____ is between _____ and _____. Everyone should type up this sentence using this form.
An estimate of the standard deviation of a statistic, based on sample data, is called the standard error (SE) of the statistic.
\[ (\mbox{Standard Error of } \hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
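As a quick check of the formula, here it is applied to the ACA sample from earlier in these notes (\(\hat{p} \approx 0.69\), \(n = 1034\)):

\[ \text{SE} = \sqrt{\frac{0.69(1-0.69)}{1034}} = \sqrt{\frac{0.2139}{1034}} \approx 0.0144 \]

This is a little smaller than the \(\text{SD}_\text{null} \approx 0.016\) obtained with \(\pi = 0.5\), because \(p(1-p)\) is largest when \(p = 0.5\).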
Join your table group for the remaining questions.
The theory-based approach for finding a confidence interval for \(\pi\) (called a one-sample z-interval) is considered valid if there are at least 10 observational units in each category of the categorical variable (i.e., at least 10 successes and at least 10 failures).
Change the confidence level in the applet from 95% to 99% and press the Calculate CI button again. Record the 99% confidence interval given by the applet.
How does it compare to the 95% interval? (Compare both the midpoint of the interval = (lower endpoint + upper endpoint)/2 and the margin of error = (upper endpoint - lower endpoint)/2.)
1-sample proportions test with continuity correction
data: 815 out of 1019, null probability 0.5
X-squared = 365.16, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.7736183 0.8236924
sample estimates:
p
0.7998037
1-sample proportions test with continuity correction
data: 815 out of 1019, null probability 0.5
X-squared = 365.16, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
99 percent confidence interval:
0.7651051 0.8305648
sample estimates:
p
0.7998037
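The two outputs above look like the result of `prop.test()` calls with different `conf.level` values. The commands are not shown in the notes, so treat the calls below as a reconstruction. The helper `summarize_ci()` is a hypothetical function added here to compute the midpoint and margin of error from an interval's endpoints.

```r
x <- 815     # respondents who agreed (Gallup data)
n <- 1019    # sample size

ci95 <- as.numeric(prop.test(x, n)$conf.int)
ci99 <- as.numeric(prop.test(x, n, conf.level = 0.99)$conf.int)

# Midpoint and margin of error (half-width) of an interval c(lower, upper)
summarize_ci <- function(ci) {
  c(midpoint = mean(ci), margin_of_error = diff(ci) / 2)
}

print(summarize_ci(ci95))
print(summarize_ci(ci99))
```

Because R's score interval is not exactly symmetric about \(\hat{p}\), the midpoints are close to, but not exactly equal to, the sample proportion.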
Use your confidence intervals from #4 and #6 to answer the following questions.
At a significance level of \(\alpha = 0.05\), would you reject \(H_0 : \pi = 0.83\) in favor of the alternative \(H_a : \pi \neq 0.83\)?
At a significance level of \(\alpha = 0.01\), would you reject \(H_0 : \pi = 0.83\) in favor of the alternative \(H_a : \pi \neq 0.83\)?
Question: How much does a used Honda Civic cost?
The following histogram displays data for the selling price of 102 Honda Civics that were listed for sale on the Internet.
Remember the basic form of a confidence interval is:
\[
\text{statistic} \pm (\text{multiplier}) \times (\text{SD of statistic})
\]
In our case, the statistic is \(\bar{x}\), so we write our 2SD confidence interval as: \[ \bar{x} \pm 2 \times (\text{SD of } \bar{x}) \]
We need a way to estimate SD of \(\bar{x}\).
Important: The SD of \(\bar{x}\) and the SD of our sample (\(s = \$4,535\)) are not the same.
There is more variability in the data (the car-to-car variability) than in sample means.
We can approximate the variability in the sample means as \(s / \sqrt{n}\).
So we can write a 2SD confidence interval as: \[ \bar{x} \pm 2 \times \frac{s}{\sqrt{n}} \]
This method gives a rough approximation for a 95% CI.
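The claim that sample means vary less than individual observations, by a factor of about \(\sqrt{n}\), can be checked with a small simulation. The sketch below assumes a hypothetical normal population (mean 13000, SD 4500); these numbers are made up for illustration and are not the Honda Civic data.

```r
set.seed(2)          # arbitrary seed, only for reproducibility
n <- 102             # sample size, matching the used car example
reps <- 5000         # number of simulated samples (assumed)
pop_mean <- 13000    # hypothetical population mean
pop_sd <- 4500       # hypothetical population SD

# Draw many samples and record the mean of each one
sample_means <- replicate(reps, mean(rnorm(n, mean = pop_mean, sd = pop_sd)))

sd_of_means <- sd(sample_means)    # variability of the sample means
predicted <- pop_sd / sqrt(n)      # what the SD / sqrt(n) rule predicts

print(round(c(simulated = sd_of_means, predicted = predicted), 1))
```

The simulated SD of the sample means should land close to the predicted value, and both are far smaller than the SD of individual observations.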
The confidence interval for a population mean has the form: \[ \bar{x} \pm (\text{multiplier}) \times \frac{s}{\sqrt{n}} \]
Price
1 21990
2 21990
3 21987
4 20955
5 20955
6 19995
7 19990
8 19990
9 19975
10 18995
11 17995
12 17987
13 17495
14 17299
15 17200
16 16995
17 16995
18 16990
19 16988
20 16987
21 16987
22 16900
23 16495
24 16300
25 16288
26 16000
27 15995
28 15995
29 15990
30 15990
31 15500
32 15499
33 15490
34 15288
35 15225
36 15030
37 14999
38 14995
39 14995
40 14994
41 14988
42 14987
43 14987
44 14987
45 14987
46 13995
47 13995
48 13995
49 13995
50 13995
51 13994
52 13990
53 13988
54 13987
55 13987
56 13987
57 13900
58 13795
59 13649
60 13599
61 12995
62 12990
63 12988
64 12987
65 12950
66 12900
67 12585
68 12500
69 12495
70 12488
71 11988
72 11987
73 10995
74 10995
75 10992
76 10305
77 9988
78 9987
79 9950
80 9900
81 9275
82 8997
83 8992
84 8979
85 8395
86 8395
87 7995
88 7995
89 7500
90 7500
91 6995
92 6900
93 6700
94 6450
95 6200
96 4995
97 4995
98 3975
99 3975
100 3000
101 2950
102 1200
[1] 13292.33
[1] 4534.568
2SD interval: \(13292 \pm 2 \times \frac{4534.568}{\sqrt{102}} \approx (12394, 14190)\)
We can just paste the Used Car data into the Theory-Based Inference applet.
One Sample t-test
data: carPrices$Price
t = 29.605, df = 101, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
12401.66 14183.01
sample estimates:
mean of x
13292.33
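Both intervals can be reproduced from the summary statistics alone. The sketch below uses the rounded values printed above (\(n = 102\), \(\bar{x} = 13292.33\), \(s = 4534.568\)) and takes the multiplier from the t-distribution with \(n - 1\) degrees of freedom via `qt()`. Because the inputs are rounded, the last digit may differ slightly from the `t.test` output.

```r
n <- 102             # number of cars
x_bar <- 13292.33    # sample mean price
s <- 4534.568        # sample standard deviation

se <- s / sqrt(n)    # standard error of the sample mean

# 2SD interval: multiplier of 2
interval_2sd <- x_bar + c(-1, 1) * 2 * se

# Theory-based t-interval: multiplier from the t-distribution
t_multiplier <- qt(0.975, df = n - 1)
interval_t <- x_bar + c(-1, 1) * t_multiplier * se

print(round(t_multiplier, 3))
print(round(interval_2sd))
print(round(interval_t, 2))
```

The t multiplier is slightly below 2, which is why the theory-based interval is a little narrower than the 2SD interval.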
Wake up and complete the following questions on the Jamboard with your table group.
In Exploration 2.2, we collected data on how many hours of sleep we got. Use the Descriptive Statistics applet to get \(n\), \(\bar{x}\), and \(s\) for our class data. Record these values.
Based on this data, what is the best estimate for \(\mu\), the average amount of sleep Westmont students get? (Hint: it’s one of the statistics from #1.)
It was tricky to generate a null distribution to estimate \(\text{SD}_\text{null}\), the standard deviation of sample means. Instead, record the standard error, \(s/\sqrt{n}\).
The theory-based interval for a population mean (called a one-sample t-interval) is considered valid if the quantitative variable has a symmetric distribution, or if you have at least 20 observations and the sample distribution is not strongly skewed.
Compute a 95% 2SD confidence interval for \(\mu\), using the standard error you recorded in #3. Record this interval.
Complete this sentence, recording what goes in the blanks. We are 95% confident that _____ is between _____ and _____.
We can also use the Theory-Based Inference applet to get confidence intervals. Use the One Mean scenario, check the boxes for Paste Data and Includes header, Clear the sample data, and paste in our class data. Record the 95% theory-based t-interval from the applet. Compare it to the 2SD interval that you calculated in #5-#6.
Record a 90% confidence interval also.
Which interval is wider, the 95% or the 90%? Explain why it makes sense that this interval is wider.
Use your CIs from #7 and #8 to answer the following.
At a significance level of \(\alpha = 0.05\), would you reject \(H_0 : \mu = 7.8\) in favor of the alternative \(H_a : \mu \neq 7.8\)?
At a significance level of \(\alpha = 0.10\), would you reject \(H_0 : \mu = 7.8\) in favor of the alternative \(H_a : \mu \neq 7.8\)?
Record your answers. Write a sentence or diagram explaining how you can use a confidence interval to determine the result of a hypothesis test.
sleepDF <- read.table("https://math.westmont.edu/ma5/classSleep.txt", header = TRUE)
t.test(sleepDF$SleepHours)
One Sample t-test
data: sleepDF$SleepHours
t = 45.562, df = 57, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
7.17449 7.83413
sample estimates:
mean of x
7.50431
This is more accurate than the applet.
You can change the confidence level:
One Sample t-test
data: sleepDF$SleepHours
t = 45.562, df = 57, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0
90 percent confidence interval:
7.228916 7.779705
sample estimates:
mean of x
7.50431
You can set the null value for a two-sided hypothesis test:
One Sample t-test
data: sleepDF$SleepHours
t = -1.7952, df = 57, p-value = 0.07792
alternative hypothesis: true mean is not equal to 7.8
90 percent confidence interval:
7.228916 7.779705
sample estimates:
mean of x
7.50431
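The commands behind the last two outputs are not shown; they presumably pass `conf.level` and `mu` to `t.test()`, for example `t.test(sleepDF$SleepHours, mu = 7.8, conf.level = 0.90)`. The sketch below uses a small made-up vector, `sleep_hours`, instead of the class data so that it runs without a download. It also checks the link between a confidence interval and a two-sided test.

```r
# Hypothetical sleep data (NOT the class data), used only so the example runs offline
sleep_hours <- c(7.5, 8, 6.5, 7, 9, 6, 7.5, 8.5, 7, 6.5, 8, 7.5)

default_test <- t.test(sleep_hours)                              # 95% CI, H0: mu = 0
ci90_test <- t.test(sleep_hours, conf.level = 0.90)              # 90% CI
null_test <- t.test(sleep_hours, mu = 7.8, conf.level = 0.90)    # H0: mu = 7.8

# Changing mu changes the test statistic and p-value, not the interval
ci90 <- as.numeric(null_test$conf.int)

# Two-sided test at alpha = 0.10 rejects exactly when 7.8 is outside the 90% CI
null_outside_ci <- 7.8 < ci90[1] || 7.8 > ci90[2]
rejects_at_10pct <- null_test$p.value < 0.10

print(ci90)
print(null_test$p.value)
print(c(null_outside_ci = null_outside_ci, rejects_at_10pct = rejects_at_10pct))
```

In the class output above the same pattern appears: 7.8 lies outside the 90% interval and the p-value (0.07792) is below 0.10, while 7.8 lies inside the 95% interval and the p-value is above 0.05.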