Two-sample t-test (Student’s t-test)

Here is an example of performing a two-sample t-test in R and interpreting the results.

Suppose we want to test whether the mean height of men differs from the mean height of women. We have two independent samples, the heights of 50 men and the heights of 50 women (simulated below), and we use a two-sample t-test to assess whether the difference between the sample means is statistically significant.

# Generate fake data
set.seed(123)
men_heights <- rnorm(50, mean = 175, sd = 7)
women_heights <- rnorm(50, mean = 162, sd = 6)

# Perform two-sample t-test (t.test() defaults to Welch's test, var.equal = FALSE)
t_test <- t.test(men_heights, women_heights)

# Print results
t_test

Because t.test() does not assume equal variances by default, the output is headed "Welch Two Sample t-test". It reports the test statistic, the degrees of freedom, the p-value, a confidence interval for the difference in means, and the two sample means:

Welch Two Sample t-test

data:  men_heights and women_heights
t = 16.563, df = 97.842, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 12.35252 14.92231
sample estimates:
mean of x mean of y 
 174.5499  162.0828 
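
The printed summary is convenient for reading, but the same quantities can also be pulled out of the returned object for further use. The sketch below assumes the simulated data from the example above; the variable names on the left-hand side (t_stat, df_welch and so on) are illustrative only.

```r
# Recreate the simulated data and the test object from the example above
set.seed(123)
men_heights <- rnorm(50, mean = 175, sd = 7)
women_heights <- rnorm(50, mean = 162, sd = 6)
t_test <- t.test(men_heights, women_heights)

# t.test() returns a list of class "htest"; its elements can be read directly
t_stat   <- unname(t_test$statistic)  # test statistic t
df_welch <- unname(t_test$parameter)  # approximate degrees of freedom
p_val    <- t_test$p.value            # two-sided p-value
ci       <- t_test$conf.int           # confidence interval for the difference in means
means    <- t_test$estimate           # the two sample means

# Difference in sample means (men minus women)
mean_diff <- unname(means[1] - means[2])
```

Working with the elements rather than the printed text avoids copying rounded numbers by hand, for example when reporting results in a table.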

The test statistic t is the difference between the sample means divided by the standard error of that difference. The degrees of freedom df come from the Welch-Satterthwaite approximation, which allows for unequal variances in the two groups and is why df is not a whole number. The p-value is very small (p-value < 2.2e-16), which is strong evidence against the null hypothesis that men and women have the same mean height. The 95% confidence interval for the difference in means, (12.35, 14.92), does not contain 0, which is consistent with rejecting the null hypothesis at the 5% level.
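
To make the description of t and df concrete, the quantities can be recomputed by hand and compared with what t.test() returns. This is a minimal sketch under the same simulated data; the names n1, v1, se, t_manual and so on are introduced here for illustration.

```r
# Recreate the simulated data from the example above
set.seed(123)
men_heights <- rnorm(50, mean = 175, sd = 7)
women_heights <- rnorm(50, mean = 162, sd = 6)

n1 <- length(men_heights)
n2 <- length(women_heights)

# Squared standard error of each sample mean: s^2 / n
v1 <- var(men_heights) / n1
v2 <- var(women_heights) / n2

# Standard error of the difference and the Welch t statistic
se <- sqrt(v1 + v2)
mean_diff <- mean(men_heights) - mean(women_heights)
t_manual <- mean_diff / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Two-sided p-value and 95% confidence interval from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)
ci_manual <- mean_diff + c(-1, 1) * qt(0.975, df = df_manual) * se

# Compare with the built-in test
t_test <- t.test(men_heights, women_heights)
all.equal(t_manual, unname(t_test$statistic))
all.equal(df_manual, unname(t_test$parameter))
```

Both comparisons should return TRUE, since the hand calculation follows the same formulas as the Welch test.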

In summary, the two-sample t-test gives strong evidence that the mean heights of men and women differ, with men taller on average in this simulated data set.
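
The analysis above uses Welch's test, which is what t.test() runs by default. The classical Student's t-test assumes equal variances in the two groups and can be requested with var.equal = TRUE. The sketch below, again on the simulated data, runs both versions side by side; the object names welch and student are illustrative.

```r
# Recreate the simulated data from the example above
set.seed(123)
men_heights <- rnorm(50, mean = 175, sd = 7)
women_heights <- rnorm(50, mean = 162, sd = 6)

# Welch's test (default) and Student's pooled-variance test
welch   <- t.test(men_heights, women_heights)
student <- t.test(men_heights, women_heights, var.equal = TRUE)

# Student's test uses n1 + n2 - 2 degrees of freedom
student$parameter

# With equal group sizes the two t statistics coincide; only df differs
all.equal(unname(welch$statistic), unname(student$statistic))
```

Because both groups here have 50 observations, the pooled and unpooled standard errors are identical, so the choice between the two tests only changes the degrees of freedom. With unequal group sizes and unequal variances the two tests can give noticeably different results.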

Krzysztof Banas
Principal Research Fellow

I work as a beamline scientist at the Singapore Synchrotron Light Source. My research interests include the application of advanced statistical methods to hyperspectral data processing (dimension reduction, clustering and identification).
