Orthogonal Distance Regression (ODR) is a statistical technique for fitting a model to data when both the independent (X) and dependent (Y) variables are subject to error. Unlike Ordinary Least Squares (OLS), which assumes that only the dependent variable has measurement error, ODR accounts for errors in both directions. This makes it well suited to scientific and engineering data where every measurement can be noisy.

Why Use ODR Instead of OLS?
In many real-world scenarios, both the independent variable (X) and the dependent variable (Y) may be affected by measurement errors. In such cases, ODR becomes more suitable because it:
- Accounts for errors in both X and Y
- Minimizes the orthogonal (perpendicular) distance from each point to the fitted curve, rather than only the vertical distance
- Can fit non-linear models as well as linear ones
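The difference can be seen on synthetic data. The following is a minimal sketch, assuming a hypothetical true line y = 2x + 1 and equal noise added to both coordinates; the seed, sample size, and noise level are illustrative choices, not values from this article. It fits the same noisy points with OLS (via np.polyfit) and with scipy.odr.

```python
import numpy as np
from scipy import odr

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical ground truth (an assumption for this sketch): y = 2x + 1
true_slope, true_intercept = 2.0, 1.0
x_true = rng.uniform(0.0, 10.0, size=2000)
y_true = true_slope * x_true + true_intercept

# Add noise of the same size to BOTH coordinates
x_obs = x_true + rng.normal(0.0, 2.0, size=x_true.size)
y_obs = y_true + rng.normal(0.0, 2.0, size=y_true.size)

# OLS: only vertical (Y) residuals are minimised
ols_slope, ols_intercept = np.polyfit(x_obs, y_obs, 1)

# ODR: residuals in X and Y are minimised together
def line(p, x):
    m, c = p
    return m * x + c

fit = odr.ODR(odr.Data(x_obs, y_obs), odr.Model(line),
              beta0=[ols_slope, ols_intercept]).run()
odr_slope, odr_intercept = fit.beta

print(f"true slope {true_slope:.2f}, OLS {ols_slope:.2f}, ODR {odr_slope:.2f}")
```

With noise in X, the OLS slope is typically pulled toward zero, while the ODR slope tends to stay closer to the true value in this equal-noise setting.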
Mathematical Formulation
For a straight-line model, the objective function minimized in ODR is:
\sum_{i=1}^{n} \left[ \frac{(y_i - \alpha - \beta x_i)^2}{\eta} + (x_i - X_i)^2 \right]
Where:
- y_i : observed dependent variable
- x_i : true (unknown) value of the independent variable
- X_i : observed value of the independent variable
- \alpha, \beta : regression coefficients (intercept and slope)
- \eta : weighting factor between Y and X errors
The weighting factor is defined as:
\eta = \frac{\sigma_\xi^2}{\sigma_\mu^2}
Where:
- \sigma_\xi^2 : variance of the error in the dependent variable (Y-axis)
- \sigma_\mu^2 : variance of the error in the independent variable (X-axis)
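For a straight line this objective can be minimized in closed form. For fixed \alpha and \beta, the best value of each x_i can be solved for directly, which reduces the sum to \sum_i (y_i - \alpha - \beta X_i)^2 / (\eta + \beta^2). The sketch below minimizes that reduced form; the function names deming_fit and profiled_objective and the data values are hypothetical, introduced only for illustration, and the formula assumes the sample covariance of X and y is non-zero.

```python
import numpy as np

def deming_fit(X, y, eta=1.0):
    """Closed-form straight-line fit for the objective above.

    eta is the ratio var(Y error) / var(X error).
    Assumes the sample covariance of X and y is non-zero.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = X.mean(), y.mean()
    s_xx = np.mean((X - x_bar) ** 2)
    s_yy = np.mean((y - y_bar) ** 2)
    s_xy = np.mean((X - x_bar) * (y - y_bar))
    d = s_yy - eta * s_xx
    # Root of s_xy*b^2 - d*b - eta*s_xy = 0 whose sign matches s_xy
    beta = (d + np.sqrt(d ** 2 + 4.0 * eta * s_xy ** 2)) / (2.0 * s_xy)
    alpha = y_bar - beta * x_bar
    return alpha, beta

def profiled_objective(alpha, beta, X, y, eta=1.0):
    """Objective value after each x_i is set to its optimal value."""
    r = y - alpha - beta * X
    return np.sum(r ** 2) / (eta + beta ** 2)

# Hypothetical data, for illustration only
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 6.1])
eta = 1.0  # equal error variances in X and Y (an assumption)

alpha, beta = deming_fit(X, y, eta)
print("intercept:", alpha, "slope:", beta)
print("objective:", profiled_objective(alpha, beta, X, y, eta))
```

Setting eta = 1 corresponds to equal error variances in both directions, which is the purely orthogonal case; a large eta means Y is much noisier than X, and the fit approaches the OLS line.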
Implementation in SciPy
SciPy provides the scipy.odr module to implement ODR using the ODRPACK library, a well-established FORTRAN-77 based package. SciPy wraps this functionality in an object-oriented interface for ease of use.
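As a brief illustration of that interface, the sketch below passes known measurement uncertainties through odr.RealData and uses the predefined straight-line model odr.unilinear. The data values and the standard deviations sx and sy are assumptions made up for this example. With these inputs, the ratio sy**2 / sx**2 plays the role of the weighting factor \eta from the formulation above.

```python
import numpy as np
from scipy import odr

# Hypothetical measurements (illustrative values only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.9, 5.2, 6.8, 9.1, 11.2, 12.8])

# Assumed standard deviations of the measurement errors on each axis
sx = np.full_like(x, 0.1)
sy = np.full_like(y, 0.2)

# RealData converts the standard deviations into weights for the fit
data = odr.RealData(x, y, sx=sx, sy=sy)

# odr.unilinear is a predefined model: y = beta[0] * x + beta[1]
# It supplies its own starting estimate, so beta0 is not passed here
fit = odr.ODR(data, odr.unilinear).run()
slope, intercept = fit.beta

print("slope:", slope, "intercept:", intercept)
```

odr.Data takes weights directly, whereas odr.RealData accepts standard deviations, which is often the more natural form for measured data.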
Step-by-Step Approach
- Import required libraries
- Create input data arrays (feature, target)
- Define a model function (e.g., linear)
- Use odr.Model() to wrap the model function
- Wrap data using odr.Data()
- Create and configure odr.ODR() instance
- Run the regression using .run()
- Display results with .pprint()
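import numpy as np
import matplotlib.pyplot as plt
from scipy import odr

# Feature values 1..10 in random order (no seed is set, so results vary between runs)
x = np.arange(1, 11)
np.random.shuffle(x)

# Target values
y = np.array([0.65, -0.75, 0.90, -0.5, 0.14,
              0.84, 0.99, -0.95, 0.41, -0.28])

# Straight-line model; p holds the parameters [slope, intercept]
def model_fn(p, x):
    m, c = p
    return m * x + c

model = odr.Model(model_fn)   # wrap the model function
data = odr.Data(x, y)         # wrap the data
odr_run = odr.ODR(data, model, beta0=[0.2, 1.0])  # beta0 is the initial guess for [m, c]
res = odr_run.run()           # run the regression
res.pprint()                  # print a summary of the fit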
Output
Beta: [ 0.11545417 -0.48999795]
Beta Std Error: [0.07475684 0.46382517]
Beta Covariance: [[ 0.01228991 -0.06759452]
 [-0.06759452  0.4731028 ]]
Residual Variance: 0.45472947791705537
Inverse Condition #: 0.06923218954368635
Reason(s) for Halting:
  Sum of squares convergence
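The values printed by .pprint() are also available as attributes on the result object, which is convenient when they are needed in later calculations. The sketch below uses a small hypothetical data set (not the data from the example above) so that it runs on its own, and maps each printed field to its attribute.

```python
import numpy as np
from scipy import odr

# Hypothetical data, used only to have a fit result to inspect
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1, 5.0])

def model_fn(p, x):
    m, c = p
    return m * x + c

res = odr.ODR(odr.Data(x, y), odr.Model(model_fn), beta0=[1.0, 0.0]).run()

slope, intercept = res.beta            # "Beta"
slope_se, intercept_se = res.sd_beta   # "Beta Std Error"
cov = res.cov_beta                     # "Beta Covariance"
res_var = res.res_var                  # "Residual Variance"
reasons = res.stopreason               # "Reason(s) for Halting"

print(f"slope     = {slope:.3f} +/- {slope_se:.3f}")
print(f"intercept = {intercept:.3f} +/- {intercept_se:.3f}")
print("halting reasons:", reasons)

# Evaluate the fitted line at new x values by reusing the model function
x_new = np.array([6.0, 7.0])
y_new = model_fn(res.beta, x_new)
print("predictions:", y_new)
```

The order of the entries in res.beta follows the order in which model_fn unpacks its parameters, here slope first and intercept second.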