Gradient computation is the process of calculating the gradient, that is, the vector of partial derivatives, of a function with respect to its variables. In mathematical optimization and machine learning, the gradient shows how a small change in each input variable affects the output of the function. The gradient of a function f is denoted ∇f or grad f.
For a function f(x1, x2, ..., xn), the gradient is the vector of partial derivatives:
∇f = (∂f/∂x1, ∂f/∂x2, . . . , ∂f/∂xn )
Where:
- f: A scalar function of multiple variables f(x1, x2, . . . , xn).
- ∇f: The gradient of the function f, which is a vector of partial derivatives.
- ∂f/∂xi: The partial derivative of f with respect to xi. It measures how f changes as xi changes while keeping all other variables constant.
- x1, x2,..., xn: The independent variables of the function f.
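The definition above can be checked numerically. The following is a minimal sketch, not a definitive implementation: it approximates each partial derivative with a central difference, varying one variable while holding the others fixed. The helper name numerical_gradient, the step size h, and the example function f(x1, x2) = x1² + 3·x1·x2 are all assumptions chosen for illustration.

```python
def numerical_gradient(func, x, h=1e-6):
    """Approximate the gradient of func at x with central differences."""
    grad = []
    for i in range(len(x)):
        # Perturb only the i-th variable; all others stay constant.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


def f(x):
    # Illustrative function: f(x1, x2) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3 * x[0] * x[1]


def analytic_gradient(x):
    # Partial derivatives by hand: df/dx1 = 2*x1 + 3*x2, df/dx2 = 3*x1
    return [2 * x[0] + 3 * x[1], 3 * x[0]]


point = [1.0, 2.0]
approx = numerical_gradient(f, point)
exact = analytic_gradient(point)
print("numerical:", approx)
print("analytic: ", exact)
```

The two results should agree to several decimal places. Finite differences are useful for checking a hand-derived gradient, but they cost two function evaluations per variable.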
Here are some useful identities for computing gradients. Unless stated otherwise, f and g are differentiable scalar functions of the vector x.
Gradient of a Sum
∇(f(x) + g(x)) = ∇f(x) + ∇g(x)
The gradient of a sum is the sum of the gradients.
Gradient of a Scalar Multiple
∇(c ⋅ f(x)) = c⋅∇f(x)
Multiplying a function by a constant c scales its gradient by the same constant.
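The sum and scalar-multiple identities can be illustrated with a short numerical check. This is a sketch under stated assumptions: the functions f(x1, x2) = x1² + x2 and g(x1, x2) = x1·x2, the constant c = 3, and the helper numerical_gradient are hypothetical choices made only for this example.

```python
def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient (illustrative helper)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


def f(x):
    return x[0] ** 2 + x[1]


def grad_f(x):
    # Hand-derived gradient of f
    return [2 * x[0], 1.0]


def g(x):
    return x[0] * x[1]


def grad_g(x):
    # Hand-derived gradient of g
    return [x[1], x[0]]


c = 3.0
point = [1.5, -0.5]

# Left-hand sides: differentiate the combined function numerically.
sum_numeric = numerical_gradient(lambda x: f(x) + g(x), point)
scaled_numeric = numerical_gradient(lambda x: c * f(x), point)

# Right-hand sides: combine the individual gradients as the identities state.
sum_identity = [df + dg for df, dg in zip(grad_f(point), grad_g(point))]
scaled_identity = [c * df for df in grad_f(point)]

print(sum_numeric, sum_identity)
print(scaled_numeric, scaled_identity)
```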
Gradient of a Product (Product Rule)
∇(f(x) ⋅ g(x)) = f(x)∇g(x) + g(x)∇f(x)
In the gradient of a product, each function is multiplied by the gradient of the other, and the two terms are added.
Gradient of a Quotient
∇(f(x)/g(x)) = (g(x)∇f(x) − f(x)∇g(x)) / [g(x)]²
Assuming g(x) ≠ 0, the gradient of a quotient is the denominator times the gradient of the numerator, minus the numerator times the gradient of the denominator, all divided by the square of the denominator.
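As a rough check of the product and quotient rules, the sketch below compares each identity against a central-difference approximation. The functions f(x1, x2) = x1² + x2 and g(x1, x2) = x1·x2 + 1, the evaluation point, and the helper numerical_gradient are assumptions introduced for illustration; g is nonzero at the chosen point, as the quotient rule requires.

```python
def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient (illustrative helper)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


def f(x):
    return x[0] ** 2 + x[1]


def grad_f(x):
    return [2 * x[0], 1.0]


def g(x):
    return x[0] * x[1] + 1


def grad_g(x):
    return [x[1], x[0]]


point = [2.0, 1.0]  # here f = 5 and g = 3, so g(x) != 0
fv, gv = f(point), g(point)
df, dg = grad_f(point), grad_g(point)

# Product rule: grad(f*g) = f*grad(g) + g*grad(f)
product_identity = [fv * b + gv * a for a, b in zip(df, dg)]
product_numeric = numerical_gradient(lambda x: f(x) * g(x), point)

# Quotient rule: grad(f/g) = (g*grad(f) - f*grad(g)) / g^2
quotient_identity = [(gv * a - fv * b) / gv ** 2 for a, b in zip(df, dg)]
quotient_numeric = numerical_gradient(lambda x: f(x) / g(x), point)

print(product_numeric, product_identity)
print(quotient_numeric, quotient_identity)
```

Note that the whole numerator g∇f − f∇g is divided by g²; dividing only the second term gives a different, incorrect result.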
Gradient of a Composition (Chain Rule)
If z = f(g(x)), where g is a scalar function of the vector x and f is a function of a single variable, then:
∇z = df/dg ⋅ ∇g(x)
The gradient of a composition is the derivative of the outer function, evaluated at g(x), times the gradient of the inner function.
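A small example may make the chain rule concrete. In this sketch the outer function is sin and the inner function is g(x1, x2) = x1² + x2², so the identity predicts ∇z = cos(g(x)) ⋅ (2·x1, 2·x2). These functions and the helper numerical_gradient are assumptions chosen for illustration.

```python
import math


def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient (illustrative helper)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


def g(x):
    # Inner function: scalar-valued, takes a vector
    return x[0] ** 2 + x[1] ** 2


def grad_g(x):
    return [2 * x[0], 2 * x[1]]


def z(x):
    # Composition z = sin(g(x))
    return math.sin(g(x))


point = [0.5, -1.0]

# Chain rule: derivative of the outer function at g(x), times grad g(x)
outer_derivative = math.cos(g(point))
chain_identity = [outer_derivative * dg for dg in grad_g(point)]
chain_numeric = numerical_gradient(z, point)

print(chain_numeric, chain_identity)
```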
Gradient of a Dot Product (Scalar Function of a Vector)
For a ⋅ x, where a is a constant vector:
∇(a ⋅ x) = a
The gradient of a dot product with a constant vector is that vector.
Gradient of a Squared Norm (Scalar Function of a Vector)
For ∥x∥², the squared Euclidean norm of a vector x:
∇(∥x∥²) = 2x
The gradient of the squared norm of a vector is twice the vector.
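Both vector identities can be verified the same way. The sketch below assumes an arbitrary constant vector a and point x, plus a hypothetical helper numerical_gradient; it checks that the gradient of a ⋅ x is a and that the gradient of the squared Euclidean norm is 2x.

```python
def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient (illustrative helper)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


a = [1.0, -2.0, 3.0]   # constant vector (arbitrary choice)
x = [0.5, 1.0, -1.5]   # evaluation point (arbitrary choice)


def dot_with_a(v):
    # a . v = sum of elementwise products
    return sum(ai * vi for ai, vi in zip(a, v))


def squared_norm(v):
    # ||v||^2 = sum of squared components
    return sum(vi * vi for vi in v)


dot_numeric = numerical_gradient(dot_with_a, x)        # expected: a
norm_numeric = numerical_gradient(squared_norm, x)     # expected: 2x
norm_identity = [2 * xi for xi in x]

print(dot_numeric, a)
print(norm_numeric, norm_identity)
```

The 2x result applies to the squared norm only; the gradient of the unsquared norm ∥x∥ is x/∥x∥ for x ≠ 0.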
Gradient of Logarithm
∇(log f(x)) = (1/f(x)) ⋅ ∇f(x)
Assuming f(x) > 0, the gradient of the logarithm of a function is the gradient of the function divided by the function itself. This is the chain rule applied with log as the outer function.
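The logarithm identity can be checked on a function that is strictly positive everywhere. The choice f(x1, x2) = x1² + x2² + 1 and the helper numerical_gradient are assumptions made only for this sketch; log here is the natural logarithm.

```python
import math


def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient (illustrative helper)."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


def f(x):
    # Strictly positive, so log f(x) is defined everywhere
    return x[0] ** 2 + x[1] ** 2 + 1


def grad_f(x):
    return [2 * x[0], 2 * x[1]]


point = [1.0, 2.0]  # f(point) = 6

# Identity: grad(log f) = grad(f) / f
log_identity = [df / f(point) for df in grad_f(point)]
log_numeric = numerical_gradient(lambda x: math.log(f(x)), point)

print(log_numeric, log_identity)
```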
Note: These identities let you break the gradient of a complicated function into gradients of simpler pieces.