BoxMOT: Pluggable SOTA multi-object tracking modules for segmentation, object detection and pose estimation models

🚀 Key Features

Pluggable Architecture
Easily swap in/out SOTA multi-object trackers.
Universal Model Support
Integrate with any segmentation, object-detection or pose-estimation model that outputs bounding boxes
Benchmark-Ready
Local evaluation pipelines for MOT17, MOT20, and DanceTrack ablation datasets with "official" ablation detectors
Performance Modes
- Motion-only: for lightweight, CPU-efficient, high-FPS performance
- Motion + Appearance: Combines motion cues with appearance embeddings (CLIPReID, LightMBN, OSNet) to maximize identity consistency and accuracy at a higher computational cost
Reusable Detections & Embeddings
Save detections and embeddings once, then rerun evaluations quickly with no redundant preprocessing.
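
To make the "Motion + Appearance" mode above more concrete, here is a minimal sketch of how a motion cue (box overlap) and an appearance cue (embedding similarity) can be fused into one association cost. It is a generic illustration, not BoxMOT's implementation: the function names, the `appearance_weight` parameter and the toy data are all assumptions, and real trackers typically solve the assignment with the Hungarian algorithm rather than a per-row minimum.

```python
import numpy as np


def iou_matrix(tracks, dets):
    """Pairwise IoU between (T, 4) and (D, 4) boxes in (x1, y1, x2, y2) format."""
    top_left = np.maximum(tracks[:, None, :2], dets[None, :, :2])
    bottom_right = np.minimum(tracks[:, None, 2:], dets[None, :, 2:])
    wh = np.clip(bottom_right - top_left, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_t = (tracks[:, 2] - tracks[:, 0]) * (tracks[:, 3] - tracks[:, 1])
    area_d = (dets[:, 2] - dets[:, 0]) * (dets[:, 3] - dets[:, 1])
    union = area_t[:, None] + area_d[None, :] - inter
    return inter / np.maximum(union, 1e-9)


def cosine_distance(track_embs, det_embs):
    """Pairwise cosine distance between (T, E) and (D, E) embeddings."""
    a = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    b = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    return 1.0 - a @ b.T


def fused_cost(tracks, dets, track_embs, det_embs, appearance_weight=0.5):
    """Weighted sum of a motion cost (1 - IoU) and an appearance cost."""
    motion_cost = 1.0 - iou_matrix(tracks, dets)
    appearance_cost = cosine_distance(track_embs, det_embs)
    return (1.0 - appearance_weight) * motion_cost + appearance_weight * appearance_cost


# Toy data (hypothetical): two tracks, two detections listed in swapped order
tracks = np.array([[0, 0, 10, 10], [20, 20, 30, 30]], dtype=float)
dets = np.array([[21, 21, 31, 31], [1, 1, 11, 11]], dtype=float)
track_embs = np.array([[1.0, 0.0], [0.0, 1.0]])
det_embs = np.array([[0.0, 1.0], [1.0, 0.0]])

cost = fused_cost(tracks, dets, track_embs, det_embs)
print(cost.argmin(axis=1))  # cheapest detection for each track
```

Setting `appearance_weight=0` in this sketch reduces it to a motion-only cost, which is why that mode needs no ReID model and runs faster.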

📊 Benchmark Results (MOT17 ablation split)

Tracker	Status	HOTA↑	MOTA↑	IDF1↑	FPS
botsort	✅	69.418	78.232	81.812	46
boosttrack	✅	69.254	75.921	83.205	25
strongsort	✅	68.05	76.185	80.763	17
deepocsort	✅	67.796	75.868	80.514	12
bytetrack	✅	67.68	78.039	79.157	1265
hybridsort	✅	67.39	74.127	79.105	25
ocsort	✅	66.441	74.548	77.899	1483

Notes: Evaluation was conducted on the second half of the MOT17 training set, as the validation set is not publicly available and the ablation detector was trained on the first half. We employed pre-generated detections and embeddings. Each tracker was configured using the default parameters from its official repository.

🔧 Installation

Install the boxmot package, including all requirements, in a Python>=3.9 environment:

pip install boxmot

If you want to contribute to this package, see CONTRIBUTING.md for how to contribute.

💻 CLI

BoxMOT provides a unified CLI with a simple syntax:

boxmot MODE DETECTOR REID TRACKER ARGS

Where:
  MODE      (required) one of [track, eval, tune, generate, export]
  DETECTOR  (optional) YOLO model like yolov8n, yolov9c, yolo11m, yolox_x
  REID      (optional) ReID model like osnet_x0_25_msmt17, mobilenetv2_x1_4
  TRACKER   (optional) one of [deepocsort, botsort, bytetrack, strongsort, ocsort, hybridsort, boosttrack]
  ARGS      (optional) '--arg value' flags that override defaults, e.g. --source 0 or --save
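
When the CLI is driven from a script, the `MODE DETECTOR REID TRACKER ARGS` layout above can be assembled programmatically. The helper below is a hypothetical convenience, not part of BoxMOT: `build_boxmot_command` and its keyword handling are assumptions, and it only builds an argument list in the `--flag value` style used by the examples in this README.

```python
import shlex

MODES = ("track", "eval", "tune", "generate", "export")


def build_boxmot_command(mode, detector=None, reid=None, tracker=None, **options):
    """Assemble a `boxmot MODE DETECTOR REID TRACKER ARGS` argument list."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")

    # Positionals are optional but ordered, so drop trailing gaps only
    positionals = [detector, reid, tracker]
    while positionals and positionals[-1] is None:
        positionals.pop()
    if None in positionals:
        raise ValueError("positional arguments must be given in order")

    cmd = ["boxmot", mode, *positionals]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            cmd.append(flag)  # boolean switch such as --show
        elif value is not False and value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_boxmot_command(
    "track", "yolov8n", "osnet_x0_25_msmt17", "deepocsort",
    source=0, show=True, save=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `boxmot` is installed.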

Quick Examples:

# Track with webcam, save results, show basic results
boxmot track yolov8n osnet_x0_25_msmt17 deepocsort --source 0 --show --save

# Track a video file, save results, show trajectories + lost tracks
boxmot track yolov8n osnet_x0_25_msmt17 botsort --source video.mp4 --save --show-trajectories --show-lost

# Evaluate on MOT dataset
boxmot eval yolox_x_MOT17_ablation lmbn_n_duke botsort --source MOT17-ablation

# Tune ocsort's hyperparameters for dancetrack
boxmot tune yolox_x_dancetrack_ablation lmbn_n_duke ocsort --source dancetrack-ablation --n-trials 10

# Export ReID model with dynamic sized input
boxmot export --weights osnet_x0_25_msmt17.pt --include onnx --include engine --dynamic

🐍 PYTHON

Seamlessly integrate BoxMOT directly into your Python MOT applications with your custom model.

import cv2
import torch
import numpy as np
from pathlib import Path
from boxmot import BoostTrack
from torchvision.models.detection import (
    fasterrcnn_resnet50_fpn_v2,
    FasterRCNN_ResNet50_FPN_V2_Weights as Weights
)

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load detector with pretrained weights and preprocessing transforms
weights = Weights.DEFAULT
detector = fasterrcnn_resnet50_fpn_v2(weights=weights, box_score_thresh=0.5)
detector.to(device).eval()
transform = weights.transforms()

# Initialize tracker
tracker = BoostTrack(reid_weights=Path('osnet_x0_25_msmt17.pt'), device=device, half=False)

# Start video capture
cap = cv2.VideoCapture(0)

with torch.inference_mode():
    while True:
        success, frame = cap.read()
        if not success:
            break

        # Convert frame to RGB and prepare for detector
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        tensor = torch.from_numpy(rgb).permute(2, 0, 1).to(torch.uint8)
        input_tensor = transform(tensor).to(device)

        # Run detection
        output = detector([input_tensor])[0]
        scores = output['scores'].cpu().numpy()
        keep = scores >= 0.5

        # Prepare detections for tracking
        boxes = output['boxes'][keep].cpu().numpy()
        labels = output['labels'][keep].cpu().numpy()
        filtered_scores = scores[keep]
        detections = np.concatenate([boxes, filtered_scores[:, None], labels[:, None]], axis=1)

        # Update tracker and draw results
        #   INPUT:  M x (x1, y1, x2, y2, conf, cls)
        #   OUTPUT: M x (x1, y1, x2, y2, id, conf, cls, ind)
        res = tracker.update(detections, frame)
        tracker.plot_results(frame, show_trajectories=True)

        # Show output
        cv2.imshow('BoxMOT + Torchvision', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

# Clean up
cap.release()
cv2.destroyAllWindows()
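
The array layouts named in the comments of the example above are easy to get wrong when adapting another detector, so here is a small sketch of the two conversions using only NumPy. The helper names `to_detections` and `split_tracks` are hypothetical, `ind` is assumed to be the index of the matching input detection, and keeping a `(0, 6)` shape for frames without detections is assumed here to be the safe convention.

```python
import numpy as np


def to_detections(boxes, scores, labels, min_score=0.5):
    """Stack detector outputs into an (M, 6) array: x1, y1, x2, y2, conf, cls."""
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32).reshape(-1)
    labels = np.asarray(labels, dtype=np.float32).reshape(-1)
    keep = scores >= min_score
    return np.column_stack([boxes[keep], scores[keep], labels[keep]])


def split_tracks(tracks):
    """Name the columns of an (M, 8) result: x1, y1, x2, y2, id, conf, cls, ind."""
    tracks = np.asarray(tracks, dtype=np.float32).reshape(-1, 8)
    return {
        "boxes": tracks[:, :4],
        "ids": tracks[:, 4].astype(int),
        "conf": tracks[:, 5],
        "cls": tracks[:, 6].astype(int),
        "det_ind": tracks[:, 7].astype(int),
    }


dets = to_detections(
    boxes=[[10, 20, 50, 80], [5, 5, 15, 15]],
    scores=[0.9, 0.3],  # the second box falls below min_score
    labels=[1, 18],
)
empty = to_detections(boxes=[], scores=[], labels=[])

# Hand-written stand-in for a tracker result, only to show the column layout
fake_tracks = np.array([[10, 20, 50, 80, 7, 0.9, 1, 0]], dtype=np.float32)
parts = split_tracks(fake_tracks)
print(dets.shape, empty.shape, parts["ids"])
```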

📝 Code Examples & Tutorials

Tracking

# Different detector models
boxmot track rf-detr-base                        # RF-DETR
boxmot track yolox_s                             # YOLOX  
boxmot track yolo12n                             # YOLO12
boxmot track yolo11n                             # YOLO11
boxmot track yolov10n                            # YOLOv10
boxmot track yolov9c                             # YOLOv9
boxmot track yolov8n                             # YOLOv8 bboxes only
boxmot track yolov8n-seg                         # YOLOv8 + segmentation masks
boxmot track yolov8n-pose                        # YOLOv8 + pose estimation

Tracking methods

boxmot track yolov8n osnet_x0_25_msmt17 deepocsort
boxmot track yolov8n osnet_x0_25_msmt17 strongsort
boxmot track yolov8n osnet_x0_25_msmt17 ocsort
boxmot track yolov8n osnet_x0_25_msmt17 bytetrack
boxmot track yolov8n osnet_x0_25_msmt17 botsort
boxmot track yolov8n osnet_x0_25_msmt17 boosttrack
boxmot track yolov8n osnet_x0_25_msmt17 hybridsort

Tracking sources

Tracking can be run on most image, video and stream sources:

boxmot track yolov8n --source 0                               # webcam
boxmot track yolov8n --source img.jpg                         # image
boxmot track yolov8n --source vid.mp4                         # video
boxmot track yolov8n --source path/                           # directory
boxmot track yolov8n --source path/*.jpg                      # glob
boxmot track yolov8n --source 'https://youtu.be/Zgi9g1ksQHc'  # YouTube
boxmot track yolov8n --source 'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream

Select ReID model

Some tracking methods combine appearance description and motion during tracking. For those that use appearance, you can choose a ReID model from the ReID model zoo according to your needs. These models can be further optimized with the export command.

boxmot track yolov8n lmbn_n_cuhk03_d botsort --source 0           # lightweight
boxmot track yolov8n osnet_x0_25_market1501 botsort --source 0
boxmot track yolov8n mobilenetv2_x1_4_msmt17 botsort --source 0
boxmot track yolov8n resnet50_msmt17 botsort --source 0
boxmot track yolov8n osnet_x1_0_msmt17 botsort --source 0
boxmot track yolov8n clip_market1501 botsort --source 0           # heavy
boxmot track yolov8n clip_vehicleid botsort --source 0

Filter tracked classes

By default the tracker tracks all MS COCO classes.

If you want to track a subset of the classes that your model predicts, add their indices after the --classes flag:

boxmot track yolov8s --source 0 --classes 15 16  # Track cats and dogs only

The class list of a YOLOv8 model trained on MS COCO gives all the objects it can detect. Note that class indexing in this repo starts at zero.
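
When you feed your own detector through the Python API instead of the CLI, the same class filtering can be applied to the detection array before it reaches the tracker. This is a minimal sketch under the assumption that detections use the `(x1, y1, x2, y2, conf, cls)` layout from the Python example; `filter_classes` is a hypothetical helper. In the zero-based 80-class COCO list used by YOLO models, 0 is person, 15 is cat and 16 is dog.

```python
import numpy as np


def filter_classes(detections, classes):
    """Keep only rows of an (M, 6) detection array whose class id is in `classes`."""
    detections = np.asarray(detections).reshape(-1, 6)
    keep = np.isin(detections[:, 5].astype(int), list(classes))
    return detections[keep]


# Toy detections (hypothetical): person, cat, dog, car
dets = np.array([
    [0, 0, 10, 10, 0.9, 0],
    [0, 0, 10, 10, 0.8, 15],
    [0, 0, 10, 10, 0.7, 16],
    [0, 0, 10, 10, 0.6, 2],
])
pets = filter_classes(dets, classes=[15, 16])
print(pets[:, 5])
```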

Evaluation

Evaluate a combination of detector, tracking method and ReID model on a standard MOT dataset or your own custom one:

# reproduce MOT17 README results
boxmot eval yolox_x_MOT17_ablation lmbn_n_duke boosttrack --source MOT17-ablation --verbose 
# MOT20 results
boxmot eval yolox_x_MOT20_ablation lmbn_n_duke boosttrack --source MOT20-ablation --verbose 
# DanceTrack results
boxmot eval yolox_x_dancetrack_ablation lmbn_n_duke boosttrack --source dancetrack-ablation --verbose 
# metrics on custom dataset
boxmot eval yolov8n osnet_x0_25_msmt17 deepocsort --source ./assets/MOT17-mini/train --verbose

Add --gsi to your command for postprocessing the MOT results by Gaussian smoothed interpolation. Detections and embeddings are stored for the selected YOLO and ReID model respectively. They can then be loaded into any tracking algorithm, avoiding the overhead of repeatedly generating this data.
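
To give an intuition for what the --gsi post-processing mentioned above does to a trajectory, the sketch below fills missing frames of one box coordinate by linear interpolation and then smooths the result with a Gaussian kernel. This is a deliberately simplified stand-in: the GSI described in the StrongSORT paper fits a Gaussian process regression, whereas this version swaps in a fixed-width Gaussian filter. The function name, `sigma` and the toy track are assumptions, and `frames` is assumed to be sorted.

```python
import numpy as np


def interpolate_and_smooth(frames, values, sigma=2.0):
    """Fill frame gaps linearly, then smooth with a normalized Gaussian kernel."""
    frames = np.asarray(frames, dtype=int)
    values = np.asarray(values, dtype=float)

    # 1) Linear interpolation over every frame between first and last observation
    full_frames = np.arange(frames[0], frames[-1] + 1)
    filled = np.interp(full_frames, frames, values)

    # 2) Gaussian smoothing; edge padding keeps the output the same length
    radius = max(1, int(3 * sigma))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(filled, radius, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")
    return full_frames, smoothed


# Toy track (hypothetical): x-centre observed in frames 1, 2, 5, 6; frames 3-4 lost
full_frames, smoothed = interpolate_and_smooth([1, 2, 5, 6], [10.0, 12.0, 18.0, 20.0], sigma=0.5)
print(full_frames, smoothed.round(2))
```

For a full bounding box, the same function would be applied to each of the four coordinates of a track independently.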

Hyperparameter Tuning

We use a fast and elitist multiobjective genetic algorithm for tracker hyperparameter tuning. By default the objectives are: HOTA, MOTA, IDF1.

# Generate detections and embeddings (saves under ./runs/dets_n_embs)
boxmot generate yolov8n osnet_x0_25_msmt17 --source ./assets/MOT17-mini/train

# Tune parameters for specified tracking method
boxmot tune yolov8n osnet_x0_25_msmt17 botsort --source ./assets/MOT17-mini/train --n-trials 9

The set of hyperparameters leading to the best HOTA result is written to the tracker's config file.
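
With three objectives there is usually no single trial that wins on all of them, which is why multiobjective methods work with a set of non-dominated results (a Pareto front). The phrase "fast and elitist multiobjective genetic algorithm" matches the title of the NSGA-II paper; the sketch below is not that algorithm and not BoxMOT's tuner, only the non-dominated filtering step, followed by the "best HOTA" pick described above. It reuses the MOT17 ablation figures from the benchmark table as stand-in trial results; the function names are hypothetical.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better once."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(results):
    """Return the entries of `results` that no other entry dominates."""
    return {
        name: scores
        for name, scores in results.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in results.items()
            if other != name
        )
    }


# (HOTA, MOTA, IDF1) rows from the benchmark table, used as stand-in trials
results = {
    "botsort": (69.418, 78.232, 81.812),
    "boosttrack": (69.254, 75.921, 83.205),
    "strongsort": (68.05, 76.185, 80.763),
    "deepocsort": (67.796, 75.868, 80.514),
    "bytetrack": (67.68, 78.039, 79.157),
    "hybridsort": (67.39, 74.127, 79.105),
    "ocsort": (66.441, 74.548, 77.899),
}

front = pareto_front(results)
best_hota = max(front, key=lambda name: front[name][0])
print(sorted(front), best_hota)
```

On these numbers only botsort and boosttrack survive the filter: botsort leads on HOTA and MOTA, while boosttrack has the highest IDF1.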

Export

We support ReID model export to ONNX, OpenVINO, TorchScript and TensorRT:

# export to ONNX
boxmot export --weights osnet_x0_25_msmt17.pt --include onnx --device cpu
# export to OpenVINO
boxmot export --weights osnet_x0_25_msmt17.pt --include openvino --device cpu
# export to TensorRT with dynamic input
boxmot export --weights osnet_x0_25_msmt17.pt --include engine --device 0 --dynamic

Example Description	Notebook
Torchvision bounding box tracking with BoxMOT
Torchvision pose tracking with BoxMOT
Torchvision segmentation tracking with BoxMOT

Contributors

Contact

For BoxMOT bugs and feature requests please visit GitHub Issues. For business inquiries or professional support requests please send an email to: [email protected]
