From 9ec50c0a122715fc8dbd3c68695fb8dc3055f335 Mon Sep 17 00:00:00 2001
From: Andrew Bringaze Linux Foundation
Date: Thu, 26 Sep 2024 14:50:18 -0500
Subject: [PATCH] fix v4

---
 ...d => 2024-09-26-pytorch-native-architecture-optimization.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename _posts/{2024-09-25-pytorch-native-architecture-optimization.md => 2024-09-26-pytorch-native-architecture-optimization.md} (97%)

diff --git a/_posts/2024-09-25-pytorch-native-architecture-optimization.md b/_posts/2024-09-26-pytorch-native-architecture-optimization.md
similarity index 97%
rename from _posts/2024-09-25-pytorch-native-architecture-optimization.md
rename to _posts/2024-09-26-pytorch-native-architecture-optimization.md
index 1f219a49710d..fcf5122e970e 100644
--- a/_posts/2024-09-25-pytorch-native-architecture-optimization.md
+++ b/_posts/2024-09-26-pytorch-native-architecture-optimization.md
@@ -72,7 +72,7 @@ But also can do things like quantize weights to int4 and the kv cache to int8 to

 Post training quantization, especially at less than 4 bit can suffer from serious accuracy degradations. Using [Quantization Aware Training](https://pytorch.org/blog/quantization-aware-training/) (QAT) we’ve managed to recover up to 96% of the accuracy degradation on hellaswag. We’ve integrated this as an end to end recipe in torchtune with a minimal [tutorial](https://github.com/pytorch/ao/tree/main/torchao/quantization/prototype/qat)

-![](/assets/assets/Figure_3.png){:style="width:100%"}
+![](/assets/images/Figure_3.png){:style="width:100%"}

 # Training
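
For context on the QAT recipe mentioned in the hunk above, a minimal sketch of the prepare/train/convert flow, assuming the torchao prototype QAT quantizer at the linked path; the toy model and the elided fine-tuning loop are placeholders, not part of the patched post:

    import torch.nn as nn
    from torchao.quantization.prototype.qat import Int8DynActInt4WeightQATQuantizer

    # Toy stand-in for a real transformer; the QAT quantizer targets nn.Linear layers.
    model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))

    # Prepare: insert fake-quantization ops so training sees the effect of
    # int8 dynamic activations and int4 grouped weights.
    qat_quantizer = Int8DynActInt4WeightQATQuantizer()
    model = qat_quantizer.prepare(model)

    # ... run the usual fine-tuning loop here (e.g. the torchtune recipe) ...

    # Convert: swap fake-quantized modules for actually quantized ones.
    model = qat_quantizer.convert(model)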