Section 7.3

March 25, 2021

Theory-based approach for analyzing paired data

How Many M&M’s do you want?

Does portion size affect how much you eat?

  • Brian Wansink studied this question with college students over several days.
  • At one session, the 17 participants were randomly assigned to receive either a small bowl or a large bowl and were allowed to take as many M&Ms as they would like.
  • At the following session, the bowl sizes were switched for each participant.

How many M&M’s do you want?

  • Observational units: 17 students
  • Explanatory: small bowl/large bowl
  • Response: number of M&M’s
  • Is this an experiment or an observational study?
  • Will the resulting data be paired?

Hypotheses

  • Null: Portion size is not associated with M&M consumption.
    • On average, there is no difference in the number of M&Ms taken between the two bowl sizes.
  • Alternative: Larger portions cause you to eat more.
    • On average, fewer M&Ms are taken from the small bowl than from the large bowl.

Our parameter, \(\mu_d\), is the long-run mean difference in the number of M&Ms taken (small bowl minus large bowl). In symbols:

\[ H_0 : \mu_d = 0 \\ H_a: \mu_d < 0 \]

Data

##    Small Large diff
## 1     33    41   -8
## 2     24    92  -68
## 3     35    61  -26
## 4     24    19    5
## 5     40    21   19
## 6     33    35   -2
## 7     88    42   46
## 8     36    50  -14
## 9     65    11   54
## 10    38   104  -66
## 11    28    97  -69
## 12    50    36   14
## 13    26    43  -17
## 14    34    62  -28
## 15    51    33   18
## 16    25    62  -37
## 17    26    32   -6
cat(mean(MandMs$Small), sd(MandMs$Small))
## 38.58824 16.89696
cat(mean(MandMs$Large), sd(MandMs$Large))
## 49.47059 27.20781
cat(mean(MandMs$diff), sd(MandMs$diff))
## -10.88235 36.30062

Simulation-based approach

We can copy the data into the Matched Pairs applet to get a simulation-based p-value and a 2SD confidence interval.
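For readers working in R rather than the applet, a rough sketch of the same idea follows. It assumes the applet's simulation works by randomly swapping the two values within each pair (equivalent to randomly flipping the sign of each difference) and that the 2SD interval uses the standard deviation of the simulated null distribution. The seed and the number of repetitions are arbitrary choices, so the simulated p-value will vary slightly from run to run.

```r
# Data from the table above (number of M&Ms taken with each bowl size)
small <- c(33, 24, 35, 24, 40, 33, 88, 36, 65, 38, 28, 50, 26, 34, 51, 25, 26)
large <- c(41, 92, 61, 19, 21, 35, 42, 50, 11, 104, 97, 36, 43, 62, 33, 62, 32)
d <- small - large
obs_mean <- mean(d)

set.seed(1)      # arbitrary seed, only for reproducibility
n_reps <- 10000  # arbitrary number of simulated shuffles

# Under the null, bowl size does not matter, so swapping the two values
# within a pair (flipping the sign of that difference) is equally likely.
sim_means <- replicate(n_reps, {
  signs <- sample(c(-1, 1), length(d), replace = TRUE)
  mean(signs * d)
})

# One-sided p-value: proportion of simulated means at or below the observed mean
p_value <- mean(sim_means <= obs_mean)

# 2SD interval: observed mean plus or minus 2 SDs of the null distribution
sd_null <- sd(sim_means)
ci_2sd <- obs_mean + c(-2, 2) * sd_null

cat("observed mean difference:", obs_mean, "\n")
cat("simulation-based p-value:", p_value, "\n")
cat("2SD interval:", ci_2sd, "\n")
```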

Theory-based alternative?

  • Our null distribution was centered at zero and fairly bell-shaped.
  • This can all be predicted (along with the variability) using theory-based methods.
  • Theory-based methods should be valid if
    • the population distribution of differences is symmetric (we would have to make this assumption) or
    • our sample size is at least 20 and the sample differences are not strongly skewed.
  • Our sample size was only 17, but this distribution of differences is fairly symmetric, so we will proceed with a theory-based test.

Distribution of differences
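A quick way to look at the shape of the 17 differences in R is a stem-and-leaf display, sketched below with the data from the table. Comparing the mean and median is an informal extra check for skewness, not part of the validity conditions themselves.

```r
# Differences (small minus large) from the data table
small <- c(33, 24, 35, 24, 40, 33, 88, 36, 65, 38, 28, 50, 26, 34, 51, 25, 26)
large <- c(41, 92, 61, 19, 21, 35, 42, 50, 11, 104, 97, 36, 43, 62, 33, 62, 32)
d <- small - large

# Text-based display of the distribution; hist(d) would give a histogram instead
stem(d)

# Informal symmetry check: mean and median should be reasonably close
cat("mean:", mean(d), " median:", median(d), "\n")
```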

Paired t-test

Two options:

  1. Use the Matched Pairs applet, and calculate the t-statistic.
  2. Use the Theory Based Inference Applet, and enter all the summary statistics. Here we use a test for One Mean.

Calculating t-statistic

For paired data, the \(t\)-statistic is:

\[ t = \frac{\bar{x}_d - 0}{s_d/\sqrt{n}}\approx \frac{-10.88235 - 0}{36.30062/\sqrt{17}} \approx -1.236 \]

We can enter this \(t\)-statistic into the Matched Pairs applet to get a p-value.
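The same calculation can be sketched in R from the summary statistics alone. The p-value below assumes the one-sided alternative \(\mu_d < 0\) and a \(t\)-distribution with \(n - 1 = 16\) degrees of freedom.

```r
# Summary statistics for the differences (small minus large)
xbar_d <- -10.88235
s_d <- 36.30062
n <- 17

se <- s_d / sqrt(n)          # standard error of the mean difference
t_stat <- (xbar_d - 0) / se  # t-statistic under H0: mu_d = 0

# One-sided p-value: area to the left of t_stat under a t(16) curve
p_value <- pt(t_stat, df = n - 1)

cat("t =", round(t_stat, 3), " p-value =", round(p_value, 4), "\n")
```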

Alternative: TBIA one mean scenario

Alternatively, we can use the Theory Based Inference Applet, and enter all the summary statistics. Here we use a test for One Mean.

  • We use the “One Mean” scenario because we are looking at the mean of one column: the difference column.
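The equivalence between a paired test and a one-mean test on the difference column can be checked in R. This sketch uses the data from the table and compares `t.test` with `paired = TRUE` against a one-sample `t.test` on the differences.

```r
small <- c(33, 24, 35, 24, 40, 33, 88, 36, 65, 38, 28, 50, 26, 34, 51, 25, 26)
large <- c(41, 92, 61, 19, 21, 35, 42, 50, 11, 104, 97, 36, 43, 62, 33, 62, 32)
d <- small - large

# Paired test on the two columns
paired_test <- t.test(small, large, paired = TRUE, alternative = "less")

# One-mean test on the single column of differences
one_mean_test <- t.test(d, mu = 0, alternative = "less")

cat("paired:   t =", paired_test$statistic, " p =", paired_test$p.value, "\n")
cat("one mean: t =", one_mean_test$statistic, " p =", one_mean_test$p.value, "\n")
```

Both lines should report the same \(t\)-statistic and p-value.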

Theory-Based Paired t-test in R

MandMs <- read.table("http://www.isi-stats.com/isi/data/chap7/BowlsMMs.txt", header=TRUE)
t.test(MandMs$Small, MandMs$Large, paired = TRUE, alternative = "less")
## 
##  Paired t-test
## 
## data:  MandMs$Small and MandMs$Large
## t = -1.236, df = 16, p-value = 0.1171
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##      -Inf 4.488747
## sample estimates:
## mean of the differences 
##               -10.88235

Conclusions

  • The theory-based model gives slightly different results than simulation, but we come to the same conclusion. We don’t have strong evidence that the bowl size affects the number of M&Ms taken.
  • We can see this in the large p-value (about 0.117) and in the two-sided 95% confidence interval (-29.5, 7.8), which includes zero.
  • We are 95% confident that, on average, a person given a small bowl will take between 29.5 fewer and 7.8 more M&Ms than when given a large bowl.
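The interval (-29.5, 7.8) is a two-sided 95% interval, whereas the R output shown earlier reports a one-sided bound because the test was run with `alternative = "less"`. A minimal sketch of the two-sided calculation from the summary statistics:

```r
# Summary statistics for the differences (small minus large)
xbar_d <- -10.88235
s_d <- 36.30062
n <- 17

se <- s_d / sqrt(n)
t_star <- qt(0.975, df = n - 1)  # multiplier for a two-sided 95% interval

ci <- xbar_d + c(-1, 1) * t_star * se
cat("95% CI:", round(ci, 1), "\n")
```

Running `t.test` on the two columns with `paired = TRUE` and the default two-sided alternative should report the same interval.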

Why weren’t the results significant?

There could be a number of reasons we didn’t get significant results.

  • Maybe bowl size doesn’t matter.
  • Maybe bowl size does matter and the difference was too small to detect with our small sample size.
  • Maybe bowl size does matter with some foods, like pasta or cereal, but not with a snack food like M&Ms.

Strength of Evidence

We will have stronger evidence against the null (smaller p-value) when:

  • The sample size is increased.
  • The variability of the data is reduced.
  • The mean difference is farther from 0.

We will get a narrower confidence interval when:

  • The sample size is increased.
  • The variability of the data is reduced.
  • The confidence level is decreased.
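These relationships can be explored numerically. The sketch below defines a small helper, `paired_summary` (a name made up for this illustration), and compares the M&M summary statistics against hypothetical variations in sample size, variability, mean difference, and confidence level. The alternative values are invented purely to show the direction of each effect.

```r
# Hypothetical helper: t-statistic, two-sided p-value, and CI width
# computed from summary statistics of the differences
paired_summary <- function(xbar_d, s_d, n, conf_level = 0.95) {
  se <- s_d / sqrt(n)
  t_stat <- xbar_d / se
  p_value <- 2 * pt(-abs(t_stat), df = n - 1)
  t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  c(t = t_stat, p_value = p_value, ci_width = 2 * t_star * se)
}

base       <- paired_summary(-10.88235, 36.30062, 17)  # the M&M study
bigger_n   <- paired_summary(-10.88235, 36.30062, 68)  # hypothetical: 4 times as many pairs
less_var   <- paired_summary(-10.88235, 18, 17)        # hypothetical: smaller SD
farther    <- paired_summary(-25, 36.30062, 17)        # hypothetical: mean farther from 0
lower_conf <- paired_summary(-10.88235, 36.30062, 17, conf_level = 0.90)

print(round(rbind(base, bigger_n, less_var, farther, lower_conf), 3))
```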

Exploration 7.3

What is a Dutch Auction?

In a Dutch auction, the item for sale starts at a very high price, which is lowered gradually until someone finds the price low enough to buy. In a first-price sealed-bid auction, each bidder submits a single sealed bid before a particular deadline. After the deadline, the person with the highest bid wins.

  • See the Preview exercise on WeBWorK

Preview

The researcher placed pairs of identical cards up for auction; one would go into the Dutch auction and the other to the first-price sealed bid auction. He then looked at the difference in the prices he received on the pair. He repeated this for a total of 88 pairs.

  • Explanatory and Response?
  • Hypotheses?


Matched pairs applet

  1. Paste the data into the Matched Pairs applet and record the following.
    • \(\bar{x}_d\)
    • \(s_d\)
    • A simulation-based p-value for the test of \(H_0: \mu_d = 0\) versus \(H_a: \mu_d \neq 0\).

VALIDITY CONDITIONS

Theory-based methods of inference will work well for paired data if the population distribution of differences has a symmetric distribution, or you have at least 20 pairs (i.e., at least 20 differences) and the distribution of the sample differences is not strongly skewed. This test is known as a paired t-test.

  2. Are the validity conditions met for these data?

Compute t-statistic

  3. Use the formula \(t = \frac{\bar{x}_d - 0}{s_d/\sqrt{n}}\) and record the value of the t-statistic. Do you think this value indicates strong evidence against \(H_0\)?

  4. In the applet where you got the simulation-based p-value, choose t-statistic above the null distribution. Enter your t-statistic and click Count. Click overlay t-distribution. Record the theory-based p-value that appears in yellow. (If you do it wrong, you will get a warning.)

Alternative: TBIA Applet

  5. Open the Theory-Based Inference applet. Choose One mean from the pull-down menu.
    • Enter the sample size, sample mean, and sample standard deviation for the differences as you found in #1.
    • Do a Test of Significance.
    • Record a 99% Confidence Interval.

Write a coherent sentence interpreting the endpoints of the confidence interval in the context of the problem.

Causation and Generalization

  6. Can you conclude causation? If yes, what causes what? If not, how are you deciding?

  7. Can you extend the results of this study? Other kinds of cards? Other types of items? Anything sold in an auction format on the Internet? How are you deciding?

Effect of Sample Size

Suppose that the data only contained 22 pairs, instead of the original 88, but \(\bar{x}_d\) and \(s_d\) remained the same (as recorded in #1).

  8. Predict how the t-statistic will change. Then compute the new t-statistic and see if you are right.

  9. Predict how the p-value will change. Then use the Theory-Based Inference applet to get a new p-value.

  10. Predict how the confidence interval will change. Then use the TBIA to get a new CI.

Now do it wrong

Let’s see what happens when we incorrectly use an independent samples t-test instead of a paired \(t\)-test.

  11. Paste the data into the Theory-Based Inference applet, using the Two Mean scenario, and obtain a new p-value for the Two-Mean (independent samples) \(t\)-test. Compare with #4.

  12. What are the consequences of using the wrong statistical analysis? What type of error do you risk making? Which test is more powerful?
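The comparison can also be tried in R. The sketch below does not use the auction data; it simulates hypothetical paired prices in which both members of a pair share a common item value, so the pairs are strongly positively correlated. All numbers (means, standard deviations, seed) are invented for illustration, and results with the real data may look different.

```r
set.seed(2021)  # arbitrary seed, only for reproducibility
n_pairs <- 88

# Hypothetical shared value of each pair of identical items
item_value <- rnorm(n_pairs, mean = 50, sd = 10)

# Hypothetical prices from two auction formats: same item value,
# a small true difference of 0.5, plus independent noise
price_a <- item_value + rnorm(n_pairs, mean = 0.5, sd = 1)
price_b <- item_value + rnorm(n_pairs, mean = 0, sd = 1)

# Correct analysis: paired t-test on the differences
paired_test <- t.test(price_a, price_b, paired = TRUE)

# Incorrect analysis: independent-samples t-test ignores the pairing
indep_test <- t.test(price_a, price_b, paired = FALSE)

cat("paired:      t =", paired_test$statistic, " p =", paired_test$p.value, "\n")
cat("independent: t =", indep_test$statistic, " p =", indep_test$p.value, "\n")
```

When the pairs are positively correlated like this, the item-to-item variability cancels out of the differences, so the paired test has a smaller standard error than the independent-samples test.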