Statistical Measures in R: Average, Variance and Standard Deviation

Last Updated : 19 Feb, 2026

Statistical measures such as average, variance and standard deviation are fundamental tools in data analysis. They help summarize numerical data, understand central tendency and measure how spread out the data is. In R, these measures can be calculated easily using built-in functions.

  • Mean (Average): Measures the center of the data
  • Variance: Measures how far values spread from the mean
  • Standard Deviation: The square root of the variance; measures spread in the same units as the data

Average in R

The mean is a measure of central tendency. It is calculated by dividing the sum of all observations by the total number of observations. R provides the built-in function mean() to calculate the average of a numeric vector.

Syntax

mean(x, na.rm = FALSE)

Parameters:

  • x: numeric vector
  • na.rm: If TRUE, removes missing values (NA) before computing the mean; the default is FALSE

Example: Compute the mean of a numeric vector.

R
# Create a numeric vector
data <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Calculate mean
mean(data)

Output
[1] 5
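
The definition of the mean and the effect of na.rm can be checked directly. The following is a minimal sketch; the vector names scores and scores_with_na are made up for illustration and are not part of the example above.

```r
# Hypothetical example vectors, chosen only for illustration
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)
scores_with_na <- c(2, 4, NA, 6)

# Mean by definition: sum of the observations divided by their count
manual_mean <- sum(scores) / length(scores)
all.equal(manual_mean, mean(scores))   # TRUE

# A single NA makes the result NA unless na.rm = TRUE
mean(scores_with_na)                 # NA
mean(scores_with_na, na.rm = TRUE)   # 4
```

With na.rm = TRUE the missing value is dropped first, so the mean is taken over the three remaining values.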

Variance in R

Variance measures how far the values in a set spread out from the mean. The sample variance is the sum of the squared differences from the mean divided by n - 1, where n is the number of observations. The var() function computes it in R.

Syntax

var(x)

Parameters:

  • x: numeric vector

Example:

R
data <- c(2, 4, 4, 4, 5, 5, 7, 9)

var(data)

Output
[1] 4.571429

Note: var() calculates the sample variance (it divides by n - 1). For the population variance, multiply the result by (n - 1)/n, where n is the number of observations.
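
As a rough sketch of the difference between the two variances, the sample variance can be recomputed by hand and then rescaled. The names n, sample_var and pop_var are introduced here only for illustration.

```r
data <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(data)

# Sample variance by definition: sum of squared deviations divided by n - 1
sample_var <- sum((data - mean(data))^2) / (n - 1)
all.equal(sample_var, var(data))   # TRUE

# Population variance: rescale the sample variance by (n - 1) / n
pop_var <- var(data) * (n - 1) / n
pop_var   # 4
```

The squared deviations sum to 32, so dividing by n - 1 = 7 gives 4.571429 and dividing by n = 8 gives 4.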

Standard Deviation in R

The standard deviation is the square root of the variance. It measures how much the data vary around the mean, expressed in the same units as the data. The sd() function computes it in R.

Syntax

sd(x)

Parameters:

  • x: numeric vector

Example:

R
data <- c(2, 4, 4, 4, 5, 5, 7, 9)

sd(data)

Output
[1] 2.13809
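
The relationship between sd() and var() can be verified with a short check. This is a sketch only; pop_sd is a name introduced here for the population version, which base R does not provide as a separate function.

```r
data <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(data)

# sd() is the square root of the sample variance returned by var()
all.equal(sd(data), sqrt(var(data)))   # TRUE

# Population standard deviation: divide by n instead of n - 1
pop_sd <- sqrt(var(data) * (n - 1) / n)
pop_sd   # 2
```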

Calculating All Three Measures for a Dataset

Let’s calculate the mean, variance and standard deviation for the following dataset:

R
data <- c(12, 15, 18, 22, 30, 35)

mean_value <- mean(data)
variance_value <- var(data)
sd_value <- sd(data)

print(paste("Mean:", mean_value))
print(paste("Variance:", variance_value))
print(paste("Standard Deviation:", sd_value))

Output
[1] "Mean: 22"
[1] "Variance: 79.6"
[1] "Standard Deviation: 8.92188320927819"
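
If the three measures are needed repeatedly, one option is to wrap them in a small helper. The function below is a hypothetical sketch (summarize_spread is not a built-in R function) that returns the values as a named numeric vector.

```r
# Hypothetical helper, not part of base R: returns all three measures at once
summarize_spread <- function(x, na.rm = FALSE) {
  c(mean     = mean(x, na.rm = na.rm),
    variance = var(x, na.rm = na.rm),
    sd       = sd(x, na.rm = na.rm))
}

data <- c(12, 15, 18, 22, 30, 35)

# Round for display: mean 22, variance 79.6, sd 8.92
round(summarize_spread(data), 2)
```

Returning a named vector keeps the numbers numeric, so they can be rounded or reused in later calculations instead of being pasted into strings.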

Visualizing Mean, Variance and Standard Deviation

We can visualize these measures using a density plot with ggplot2.

R
library(ggplot2)

# Generate 100 random data points with mean=50 and sd=10
set.seed(123)
d <- rnorm(100, 50, 10)

# Calculate mean, variance, and standard deviation
m <- mean(d); v <- var(d); s <- sd(d)

# Create the plot
ggplot(data.frame(d), aes(d)) +
  geom_density(fill = "lightblue", alpha = 0.5) +
  
  # Add vertical lines for mean and standard deviation boundaries
  geom_vline(xintercept = c(m, m + s, m - s),
             color = c("red", "green", "green"),
             linetype = c("dashed", "dotted", "dotted"),
             linewidth = c(1.2, 1, 1)) +
  
  labs(title = "Visualization of Mean, Variance, and Standard Deviation",
       x = "Data Values", y = "Density") + 
  theme_minimal() +
  
  # Annotate mean, mean ± SD, and variance
  annotate("text", x= m, y = 0.03, label = paste("Mean =", round(m, 2)), color = "red", vjust = -1) +
  annotate("text", x= m + s, y = 0.02, label = paste("Mean + SD =", round(m + s, 2)), color = "green", vjust = -1) +
  annotate("text", x= m - s, y = 0.02, label = paste("Mean - SD =", round(m - s, 2)), color = "green", vjust = -1) +
  annotate("text", x= m + 20, y = 0.04, label = paste("Variance =", round(v, 2)), color = "blue", vjust = -1)

Output:

[Figure: Visualizing Mean, Variance, and Standard Deviation]

The plot shows:

  • The mean as a red dashed line in the center of the distribution.
  • The values one standard deviation below and above the mean as green dotted lines, indicating the spread of the data.
  • The variance as a blue text label, placed to the right of the mean with the annotate() function.

This visualization shows how the data are distributed around the mean and how spread out they are, as marked by the standard deviation lines. Variance is expressed in squared units, so it is not drawn on the x-axis; it appears only as a label, while the distance from the mean to each green line is its square root.
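
To put a number on the spread shown in the plot, one can count how many points fall within one standard deviation of the mean. The sketch below regenerates the same simulated data; within_one_sd is a name introduced here for illustration. For normally distributed data the proportion is expected to be roughly 68%, though a sample of 100 points will not match that exactly.

```r
# Same simulated data as in the plot
set.seed(123)
d <- rnorm(100, 50, 10)
m <- mean(d)
s <- sd(d)

# Proportion of points between mean - SD and mean + SD
within_one_sd <- mean(d >= m - s & d <= m + s)
within_one_sd

# Squaring the standard deviation recovers the variance
all.equal(s^2, var(d))   # TRUE
```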
