How to Find Confidence Intervals in R

A confidence interval is a range of values used to estimate an unknown population parameter, such as the mean or a proportion. It shows how much uncertainty is associated with a sample estimate. The interval is calculated from sample data at a chosen confidence level, usually 95%. This means that if we repeated the sampling process many times, about 95% of the calculated intervals would contain the true value.
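The repeated-sampling interpretation can be checked with a small simulation. The sketch below is illustrative only: the population (normal with mean 50 and standard deviation 10), the sample size, and the number of repetitions are assumptions chosen for the demonstration. It uses t.test(), which is covered in Method 1, to build each interval.

```r
# Simulate repeated sampling to illustrate what "95% confidence" means.
# Assumed (hypothetical) population: normal with mean 50 and sd 10.
set.seed(42)           # fixed seed so the run is reproducible
true_mean <- 50
n_reps <- 5000         # number of repeated samples
sample_size <- 30

# For each repetition: draw a sample, build a 95% interval,
# and record whether the interval contains the true mean.
covered <- replicate(n_reps, {
  x <- rnorm(sample_size, mean = true_mean, sd = 10)
  ci <- t.test(x)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

# Proportion of intervals that captured the true mean
coverage <- mean(covered)
print(coverage)
```

The printed proportion should land close to 0.95, though not exactly on it, because the simulation itself is subject to sampling variation.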

\text{Confidence Interval} = \bar{x} \pm z \cdot \left( \frac{s}{\sqrt{n}} \right)

Parameters:

\bar{x} = sample mean
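z = z-score (from the standard normal distribution for the given confidence level, e.g., 1.96 for 95%); when the population standard deviation is unknown, as in the methods below, the t-score with n - 1 degrees of freedom is used instead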
s = sample standard deviation
n = sample size
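The formula above can be applied directly in R. This is a minimal sketch on a made-up sample (the vector x is hypothetical, not data from this article), with qnorm() supplying the z-score. For a sample this small, the t-based intervals in the methods below would be somewhat wider.

```r
# Hypothetical sample, used only to illustrate the formula
x <- c(12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7)

x_bar <- mean(x)      # sample mean
s <- sd(x)            # sample standard deviation
n <- length(x)        # sample size

# z-score for a 95% confidence level (about 1.96)
z <- qnorm(0.975)

# x_bar plus or minus z * (s / sqrt(n))
z_interval <- x_bar + c(-1, 1) * z * s / sqrt(n)
print(z_interval)
```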

Method 1: Using t.test() to Find Confidence Interval

We use the t.test() function to find the confidence interval for the population mean, based on a sample, in R.

t.test(): Performs a t-test and returns confidence intervals.
$conf.int: Extracts the confidence interval from the result.

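# Sample data: ten observations
data <- c(23, 28, 32, 27, 25, 30, 31, 29, 26, 24)

# One-sample t-test; the confidence level defaults to 95%
result <- t.test(data)

# Extract the confidence interval for the mean
confidence_interval <- result$conf.int

print(confidence_interval)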

Output:

[1] 25.33415 29.66585
attr(,"conf.level")
[1] 0.95

We can say with 95% confidence that the true mean lies between 25.33415 and 29.66585.
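t.test() also accepts a conf.level argument, so the same call can produce intervals at other confidence levels. The sketch below reuses the sample from this method; the 90% and 99% levels are arbitrary choices for illustration.

```r
# Same sample as above
data <- c(23, 28, 32, 27, 25, 30, 31, 29, 26, 24)

# Request intervals at two other confidence levels
ci_90 <- t.test(data, conf.level = 0.90)$conf.int
ci_99 <- t.test(data, conf.level = 0.99)$conf.int

print(ci_90)
print(ci_99)

# A higher confidence level gives a wider interval
width_90 <- ci_90[2] - ci_90[1]
width_99 <- ci_99[2] - ci_99[1]
print(width_99 > width_90)
```

Both intervals are centered on the same sample mean. Only the width changes with the confidence level.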

Method 2: Manually Calculating Confidence Interval (Using Iris Dataset)

We manually calculate the confidence interval for the mean of Sepal.Length from the iris dataset using statistical formulas.

mean(): Computes the sample mean
length(): Finds the number of observations
sd(): Calculates sample standard deviation
sqrt(): Computes square root
qt(): Calculates t-score for a given confidence level

# Sample mean of Sepal.Length
mean_value <- mean(iris$Sepal.Length)

# Number of observations
n <- length(iris$Sepal.Length)

# Sample standard deviation and standard error of the mean
standard_deviation <- sd(iris$Sepal.Length)
standard_error <- standard_deviation / sqrt(n)

# Upper-tail t-score for a 95% confidence level
alpha <- 0.05
degrees_of_freedom <- n - 1
t_score <- qt(p = alpha / 2, df = degrees_of_freedom, lower.tail = FALSE)

# Margin of error and interval bounds
margin_error <- t_score * standard_error
lower_bound <- mean_value - margin_error
upper_bound <- mean_value + margin_error

print(c(lower_bound, upper_bound))

Output:

[1] 5.709732 5.976934

We can be 95% confident that the population mean of Sepal.Length lies between 5.709732 and 5.976934.
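The manual steps can be wrapped in a small reusable function and checked against t.test(). This is only a sketch: mean_ci is a hypothetical helper name, not a base R function, and dropping missing values is an added assumption about how such a helper might behave.

```r
# Hypothetical helper (not part of base R):
# t-based confidence interval for the mean of a numeric vector.
mean_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]                          # drop missing values
  n <- length(x)
  standard_error <- sd(x) / sqrt(n)
  alpha <- 1 - level
  t_score <- qt(1 - alpha / 2, df = n - 1)   # upper-tail critical value
  mean(x) + c(-1, 1) * t_score * standard_error
}

manual_ci <- mean_ci(iris$Sepal.Length)

# Built-in result, with the conf.level attribute stripped for comparison
builtin_ci <- as.numeric(t.test(iris$Sepal.Length)$conf.int)

print(manual_ci)
print(all.equal(manual_ci, builtin_ci))
```

Agreement between the two results confirms that the manual calculation follows the same t-based formula that t.test() applies.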

Method 3: Using confint() with Linear Model

We fit an intercept-only linear model, whose single coefficient (the intercept) equals the sample mean, and use confint() to get the confidence interval for that coefficient.

lm(): Creates a linear model
confint(): Returns confidence intervals for model parameters
level: Defines the confidence level (default is 0.95)

model <- lm(Sepal.Length ~ 1, iris)

confint(model, level = 0.95)

Output:

               2.5 %   97.5 %
(Intercept) 5.709732 5.976934

We are 95% confident that the true population mean of Sepal.Length lies between 5.709732 and 5.976934. This matches the results we got in the manual method.
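To see why this approach works, the sketch below checks that the intercept of the intercept-only model equals the sample mean, and then requests an interval at a different level. The 90% level is an arbitrary choice for illustration.

```r
# Intercept-only model: the fitted intercept is the sample mean
model <- lm(Sepal.Length ~ 1, data = iris)

intercept <- unname(coef(model)["(Intercept)"])
sample_mean <- mean(iris$Sepal.Length)
print(all.equal(intercept, sample_mean))

# confint() returns a matrix with one row per coefficient
ci_90 <- confint(model, level = 0.90)
print(ci_90)

# Pull the bounds out of the matrix as plain numbers
lower_90 <- ci_90["(Intercept)", 1]
upper_90 <- ci_90["(Intercept)", 2]
```

Because confint() returns a matrix, the same indexing carries over to models with more than one coefficient.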
