A covariance matrix is a square matrix that holds the covariance between each pair of variables in a dataset. Covariance describes how two variables change together: a positive value means they tend to increase or decrease together, while a negative value means one tends to increase as the other decreases. In R, the cov() function computes this matrix.
Syntax:
cov(df)
Parameter:
- df: a data frame (or numeric matrix) whose numeric columns are the variables to compare.
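As a rough illustration of the call, the sketch below passes a small data frame to cov() and, as an alternative, passes two numeric vectors. The data frame scores and its columns math and physics are hypothetical names with made-up values, chosen only for this example.

```r
# Hypothetical data, used only to illustrate the call
scores <- data.frame(
  math    = c(70, 80, 90, 60),
  physics = c(65, 85, 95, 55)
)

# Data frame input: returns a 2 x 2 covariance matrix
cov_matrix <- cov(scores)

# Two-vector input: returns a single covariance value
cov_value <- cov(scores$math, scores$physics)

print(cov_matrix)
print(cov_value)
```

The off-diagonal entry of the matrix and the single value from the two-vector call are the same number, since both measure the covariance between math and physics.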
The sign of each entry gives the direction of the relationship: it is positive when two variables tend to move in the same direction and negative when one tends to decrease as the other increases. The magnitude depends on the units of the variables, so a covariance on its own does not measure how strong the relationship is.
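To make the definition concrete, the sketch below computes the sample covariance of two vectors by hand, as the sum of the products of deviations from the mean divided by n - 1, and compares it with cov(). The vectors x and y hold made-up illustration values.

```r
# Made-up illustration data
x <- c(2, 4, 6, 8)
y <- c(1, 3, 2, 5)

n <- length(x)

# Sample covariance: sum of products of deviations, divided by n - 1
manual_cov <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Built-in result for comparison
builtin_cov <- cov(x, y)

print(manual_cov)
print(isTRUE(all.equal(manual_cov, builtin_cov)))
```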
Example 1: Covariance Matrix from Static Data
We create a data frame with three numeric variables and pass it to the cov() function to generate the covariance matrix.
- data.frame: Creates a structured table with named numeric vectors.
- var1, var2, var3: Column names representing different numeric variables.
- cov(): Calculates pairwise covariances between columns in the data frame.
sample_data <- data.frame(
  var1 = c(86, 82, 79, 83, 66),
  var2 = c(85, 83, 80, 84, 65),
  var3 = c(107, 127, 137, 117, 170)
)
cov(sample_data)
Output:
       var1   var2   var3
var1   60.7   63.9 -185.9
var2   63.9   68.3 -192.8
var3 -185.9 -192.8  585.8

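- The diagonal values (60.7, 68.3 and 585.8) are the variances of var1, var2 and var3.
- 63.9 is the covariance between var1 and var2. It is positive and close to both variances, which corresponds to a strong positive relationship (a correlation of about 0.99).
- -185.9 and -192.8 are the covariances of var3 with var1 and var2, respectively. Both are negative, so var3 tends to decrease as var1 or var2 increases.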
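Because covariance depends on the units of each variable, the raw numbers are hard to compare directly. One way to check the interpretation above is to rescale the covariance matrix into a correlation matrix with cov2cor() from base R's stats package. The sketch below reuses the data from Example 1; the names cov_matrix, variances and cor_matrix are chosen here for illustration.

```r
sample_data <- data.frame(
  var1 = c(86, 82, 79, 83, 66),
  var2 = c(85, 83, 80, 84, 65),
  var3 = c(107, 127, 137, 117, 170)
)

cov_matrix <- cov(sample_data)

# The diagonal of the covariance matrix holds each column's variance
variances <- diag(cov_matrix)

# Rescale covariances to correlations, which always lie between -1 and 1
cor_matrix <- cov2cor(cov_matrix)

print(variances)
print(round(cor_matrix, 2))
```

For this data the correlation between var1 and var2 comes out close to 1, and the correlations of var3 with the other two columns are close to -1.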
Example 2: Covariance Matrix from Random Data
We use the rnorm() function to generate two columns of random numeric data and pass them to the cov() function to compute the matrix.
- rnorm(): Generates random numbers from a normal distribution.
- n = 20: Number of observations in each column.
- mean, sd: Mean and standard deviation of the normal distribution for each column (5 and 23 for var1, 8 and 10 for var2).
- cov(): Computes covariances between the generated variables.
sample_data <- data.frame(
  var1 = rnorm(n = 20, mean = 5, sd = 23),
  var2 = rnorm(n = 20, mean = 8, sd = 10)
)
cov(sample_data)
Output:

- Diagonal values show variances of var1 and var2.
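- -14.66 is the covariance between var1 and var2 in the run shown. It is small compared with the variances implied by the chosen standard deviations (about 529 and 100), so the relationship is weak, as expected for two independently generated columns.
- Values change on every run because the data is random. Calling set.seed() before generating the data makes the result reproducible.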
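If a repeatable result is needed, one option is to fix the random seed with set.seed() before generating the data. The sketch below wraps Example 2 in a helper function. The function name make_cov and the seed value 42 are arbitrary choices for illustration.

```r
# Hypothetical helper: regenerate the Example 2 data with a fixed seed
make_cov <- function(seed) {
  set.seed(seed)  # fix the random number generator state
  sample_data <- data.frame(
    var1 = rnorm(n = 20, mean = 5, sd = 23),
    var2 = rnorm(n = 20, mean = 8, sd = 10)
  )
  cov(sample_data)
}

first_run  <- make_cov(42)
second_run <- make_cov(42)

# Same seed, same data, same covariance matrix
print(identical(first_run, second_run))
```

Using a different seed value gives different data and therefore a different matrix, but each seed reproduces its own result across runs.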