Mean, Median and Mode in R Programming

Statistical measures like mean, median and mode are important for summarizing and understanding the central tendency of a dataset. They help describe the typical value in the data and provide a quick overview of its distribution. In R, these measures can be calculated easily using built-in functions.

Mean in R

The mean is the arithmetic average: the sum of all values divided by the count of values.

Syntax

mean(x, na.rm = FALSE)

Parameters:

x: Numeric vector
na.rm: If TRUE, ignores NA values

Example:

x <- c(2, 4, 6, 8, 10)
mean(x)

# Handling NA
x <- c(2, 4, NA, 8)
mean(x, na.rm = TRUE)

Output

[1] 6
[1] 4.666667

Explanation:

In the first example, mean() returns the average of all values in x: (2+4+6+8+10)/5 = 6.
In the second example, na.rm = TRUE removes the NA before averaging, so the result is (2+4+8)/3 = 4.666667.
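
To make the role of na.rm concrete, the short sketch below shows what mean() returns when the NA is left in, and reproduces the na.rm = TRUE result by hand. The names x_clean and manual_mean are illustrative only.

```r
x <- c(2, 4, NA, 8)

# Without na.rm = TRUE, a single NA makes the whole result NA
mean(x)

# Manual equivalent of mean(x, na.rm = TRUE):
# drop the NA values, then divide the sum by the count
x_clean <- x[!is.na(x)]
manual_mean <- sum(x_clean) / length(x_clean)
manual_mean
```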

Median in R

The median is the middle value of a sorted dataset; it splits the data into two halves. If the number of elements is odd, the median is the central element. If the number is even, the median is the average of the two central elements.

Syntax

median(x, na.rm = FALSE)

Example:

x <- c(1, 3, 5, 7, 9)
median(x)

# With NA values
x <- c(1, NA, 5, 7)
median(x, na.rm = TRUE)

Output

[1] 5
[1] 5

Explanation:

In the first example, the sorted vector has 5 values and the middle one is 5.
In the second example, after the NA is removed the sorted values are (1, 5, 7), so the middle value is 5.
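
Both examples above have an odd number of values once the NA is dropped. As a small illustration of the even case, the sketch below uses a made-up vector y with four elements, where the median is the average of the two central values after sorting.

```r
# Even number of elements, deliberately unsorted
y <- c(7, 1, 5, 3)

# Sorted order is 1, 3, 5, 7, so the two central values are 3 and 5
sort(y)

# median() sorts internally and averages them: (3 + 5) / 2 = 4
median(y)
```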

Mode in R

The mode is the value that appears most frequently in a dataset. R has no built-in function for the statistical mode (the base function mode() returns an object's storage mode instead), but you can define one easily.

Method 1: Custom Function to Find Mode

get_mode <- function(v) {
  uniqv <- unique(v)
  uniqv[which.max(tabulate(match(v, uniqv)))]
}

x <- c(1, 2, 2, 3, 3, 3, 4)
get_mode(x)

Output

[1] 3

Explanation: The number 3 appears most frequently (3 times), so it is the mode.
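
Because which.max() returns only the position of the first maximum, get_mode() reports a single value even when several values are tied for the highest frequency. If all tied values are wanted, one possible variation is sketched below; get_modes is a hypothetical helper name, not part of base R or of the function above.

```r
# Hypothetical helper: return every value tied for the highest frequency
get_modes <- function(v) {
  uniqv <- unique(v)                    # distinct values, in order of appearance
  counts <- tabulate(match(v, uniqv))   # how often each distinct value occurs
  uniqv[counts == max(counts)]          # keep all values with the top count
}

# 1 and 2 both appear twice, so both are returned
get_modes(c(1, 1, 2, 2, 3))

# With a single most frequent value, the result matches get_mode()
get_modes(c(1, 2, 2, 3, 3, 3, 4))
```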

Method 2: Using the modeest Package

We can also use the modeest package. It provides methods for estimating the mode of univariate data, as well as the modes of common probability distributions.

# Install and load package
install.packages("modeest")
library(modeest)

x <- c(1, 2, 2, 3, 3, 3, 4)
mfv(x)

Output

[1] 3

Explanation: The mfv() function from the modeest package returns the most frequent value, which is again 3.
