NVIDIA / TensorRT-Model-Optimizer Public


Pull requests: NVIDIA/TensorRT-Model-Optimizer

38 Open 165 Closed

Pull requests list

Fix ONNX FP8 scaling

#446 opened Oct 17, 2025 by Darth-Kronos

Bump trtllm to 1.2.0rc0.post1 and pytorch to 25.08 for cuda 13

#445 opened Oct 17, 2025 by kevalmorabia97

1 task

Add SD3.5-medium quantization support in ModelOpt Diffusers example

#444 opened Oct 17, 2025 by vishalpandya1990

[OMNIML-2673]Create an example for running diffusion models using auto deploy

#443 opened Oct 16, 2025 by ajrasane • Draft

[POC] registry based interface to TensorQuantizer

#437 opened Oct 14, 2025 by realAsma • Draft

[Autocast] Add low precision autocasting support for Resize op

#436 opened Oct 14, 2025 by aboubezari

[OMNIML-2857] Support the DeepSeek V3.2 model

#435 opened Oct 14, 2025 by cjluo-nv

Cleanup mixed precision and gather node layer info mapping

#434 opened Oct 14, 2025 by ynankani

Add example for multinode calibration using FSDP2

#432 opened Oct 13, 2025 by sugunav14

2 of 5 tasks

Fix megatron distributed checkpoint metadata pass through

#431 opened Oct 13, 2025 by ChenhanYu

Yeyu/debug paralllel draft

#429 opened Oct 13, 2025 by yeyu-nvidia

[MOD1RCU9-11] add testcasefor example onnx_ptq

#427 opened Oct 13, 2025 by joe0731 • Draft

ONNX 1.19 compatibility fix for INT4 quantization

#423 opened Oct 10, 2025 by hthadicherla

Pattern-based fusion for pre_quant_scale

#421 opened Oct 9, 2025 by meenchen • Draft

Ensure that the ONNX IR version is the max supported version (10)

#416 opened Oct 9, 2025 by gcunhase

Update onnx ptq test to be single threaded and make it faster

#415 opened Oct 8, 2025 by ajrasane

[4975376][5541172]perplexity and kl-divergence benchmark metrics

#411 opened Oct 8, 2025 by ynankani

[New feature] Add Support For Sparse Attention

#408 opened Oct 7, 2025 by kaix-nv

Added support for quantizing TEGroupedMLP for megatron-lm

#403 opened Oct 7, 2025 by kinjalpatel27

Explicitly register real quant gemms

#402 opened Oct 6, 2025 by cjluo-nv

Update ReadMe for torch_quant_to_onnx.py example

#395 opened Sep 30, 2025 by ajrasane

EAGLE parallel draft with auto regression; kv cache in EAGLE training

#391 opened Sep 29, 2025 by yeyu-nvidia

[5545101]: AutoCast: Add options to force include node/op in F16

#386 opened Sep 28, 2025 by galagam

Support kv cache quantization for mcore using bmm_quantizers

#375 opened Sep 25, 2025 by kaix-nv

megatron realquant FP8 WIP

#367 opened Sep 24, 2025 by cjluo-nv • Draft
