diff --git a/README.md b/README.md
index 22b85b2..7306ced 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,11 @@ _Additional models and pipelines for 🤗 Diffusers created by [Lambda Labs](htt
 - [Stable Diffusion Image Variations](#stable-diffusion-image-variations)
 - [Pokemon text to image](#pokemon-text-to-image)
 
+
+<p align="center">
+🦄 Other exciting ML projects at Lambda: <a href="https://news.lambdalabs.com/news/today">ML Times</a>, <a href="https://github.com/LambdaLabsML/distributed-training-guide/tree/main">Distributed Training Guide</a>, <a href="https://lambdalabsml.github.io/Open-Sora/introduction/">Text2Video</a>, <a href="https://lambdalabs.com/gpu-benchmarks">GPU Benchmark</a>.
+</p>
+
 ## Installation
 
 ```bash
@@ -125,10 +130,22 @@ cd lambda-diffusers/scripts
 make bench
 ```
 
+Currently, `xformers` does not support H100. The "without xformers" results below were generated by running the benchmark with `--xformers no` (this can be set in `scripts/Makefile`).
+
 ### Results
 
+With [xformers](https://github.com/facebookresearch/xformers); raw data can be found [here](./benchmarks/benchmark.csv).
 ![](./docs/pictures/sd_throughput.png)
 
+Without [xformers](https://github.com/facebookresearch/xformers); raw data can be found [here](./benchmarks/benchmark_no_xformers.csv).
+![](./docs/pictures/sd_throughput_noxformer.png)
+
+H100 MIG performance; raw data can be found [here](./benchmarks/benchmark_H100_MIG.csv).
+![](./docs/pictures/sd_throughput_mig.png)
+
+Cost analysis:
+![](./docs/pictures/cost_analysis.png)
+
 ## Links
 
 - [Captioned Pokémon dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions)
diff --git a/benchmark.csv b/benchmarks/benchmark.csv
similarity index 100%
rename from benchmark.csv
rename to benchmarks/benchmark.csv
diff --git a/benchmarks/benchmark_H100_MIG.csv b/benchmarks/benchmark_H100_MIG.csv
new file mode 100644
index 0000000..87c70dd
--- /dev/null
+++ b/benchmarks/benchmark_H100_MIG.csv
@@ -0,0 +1,65 @@
+device,precision,autocast,xformers,runtime,n_samples,latency,memory,
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,1,1.73,7.7
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,1,1.06,3.46
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,2,2.66,9.79
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,2,1.73,4.57
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,4,4.47,18.49
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,4,2.63,8.91
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,8,8.16,23.86
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,8,4.97,12.57
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,16,15.98,42.38
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,16,9.61,29.01
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,32,32.04,80.51
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,32,19.07,55.57
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,1,2.3,7.74
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,1,1.52,3.45
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,2,3.95,9.48
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,2,2.42,4.57
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,4,7.12,18.2
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,4,4.17,8.9
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,8,13.91,23.75
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,8,7.91,12.49
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,16,15.73,29.01
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe MIG 4g.40gb,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,1,4.2,7.76
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,1,2.58,3.41
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,2,7.61,11.09
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,2,4.56,4.59
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,4,14.45,17.65
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,4,8.24,6.78
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,8,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,8,15.81,15.65
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe MIG 2g.20gb,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,1,9.17,7.76
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,1,5.39,3.47
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,2,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,2,9.29,4.63
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,4,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,4,17.4,6.8
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,8,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,8,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe MIG 1g.10gb,half,FALSE,FALSE,pytorch,128,-1,-1
\ No newline at end of file
diff --git a/benchmarks/benchmark_no_xformers.csv b/benchmarks/benchmark_no_xformers.csv
new file mode 100644
index 0000000..d578b6d
--- /dev/null
+++ b/benchmarks/benchmark_no_xformers.csv
@@ -0,0 +1,97 @@
+device,precision,autocast,xformers,runtime,n_samples,latency,memory,
+NVIDIA A10,single,FALSE,FALSE,pytorch,1,4.75,6.73
+NVIDIA A10,half,FALSE,FALSE,pytorch,1,2.71,3.43
+NVIDIA A10,single,FALSE,FALSE,pytorch,2,8.75,9
+NVIDIA A10,half,FALSE,FALSE,pytorch,2,4.99,5.53
+NVIDIA A10,single,FALSE,FALSE,pytorch,4,17.18,18.14
+NVIDIA A10,half,FALSE,FALSE,pytorch,4,9.65,6.84
+NVIDIA A10,single,FALSE,FALSE,pytorch,8,-1,-1
+NVIDIA A10,half,FALSE,FALSE,pytorch,8,18.58,12.66
+NVIDIA A10,single,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA A10,half,FALSE,FALSE,pytorch,16,36.32,20.64
+NVIDIA A10,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA A10,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA A10,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA A10,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA A10,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA A10,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,1,1.72,7.76
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,1,1.18,3.41
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,2,3.03,9.04
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,2,1.88,5.53
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,4,5.53,18.04
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,4,3.35,6.74
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,8,10.95,23.85
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,8,6.28,12.6
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,16,12.57,20.58
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA A100-SXM4-40GB,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA A100-SXM4-40GB,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,1,1.99,7.76
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,1,1.5,3.45
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,2,3.52,11.11
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,2,2.3,4.53
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,4,6.31,13.98
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,4,4.04,8.91
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,8,12.21,23.91
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,8,7.59,12.75
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,16,-1,-1
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,16,14.54,21.24
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA A100-PCIE-40GB,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA A100-PCIE-40GB,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,1,2.05,7.76
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,1,1.53,3.41
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,2,3.09,9.04
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,2,3.06,5.53
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,4,6.34,18.04
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,4,4.57,6.74
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,8,11.16,23.85
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,8,7.91,12.6
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,16,22.59,42.63
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,16,14.22,20.58
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,32,44.02,79.6
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,32,27.73,45.19
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,64,-1.0,-1.0
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,64,55.55,79.54
+NVIDIA A100 80GB PCIe,single,False,False,pytorch,128,-1.0,-1.0
+NVIDIA A100 80GB PCIe,half,False,False,pytorch,128,-1.0,-1.0
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,1,4.15,6.76
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,1,2.43,3.42
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,2,6,11.1
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,2,3.88,4.5
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,4,12.85,13.97
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,4,7.77,8.88
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,8,32.69,23.88
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,8,21.21,12.74
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,16,81.14,42.77
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,16,48.49,21.23
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,32,-1,-1
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA RTX A6000,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA RTX A6000,half,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,1,1.73,7.7
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,1,1.06,3.46
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,2,2.66,9.79
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,2,1.73,4.57
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,4,4.47,18.49
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,4,2.63,8.91
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,8,8.16,23.86
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,8,4.97,12.57
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,16,15.98,42.38
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,16,9.61,29.01
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,32,32.04,80.51
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,32,19.07,55.57
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,64,-1,-1
+NVIDIA H100 PCIe,single,FALSE,FALSE,pytorch,128,-1,-1
+NVIDIA H100 PCIe,half,FALSE,FALSE,pytorch,128,-1,-1
diff --git a/docs/benchmark-update.md b/docs/benchmark-update.md
index 9ef98a6..b383e01 100644
--- a/docs/benchmark-update.md
+++ b/docs/benchmark-update.md
@@ -16,7 +16,7 @@ Results will be written to `results.csv`, the benchmark will take different amou
 
 ## Results
 
-The current results for the benchmark are available in [`benchmark.csv`](../benchmark.csv). These results were run with Diffusers 0.11.0 and xformers using Ubuntu 20.04, Python 3.8, PyTorch 1.13, CUDA 11.8 ([NGC PyTorch container 22.11](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-22-11.html)).
+The current results for the benchmark are available in [`benchmark.csv`](../benchmarks/benchmark.csv). These results were run with Diffusers 0.11.0 and xformers using Ubuntu 20.04, Python 3.8, PyTorch 1.13, CUDA 11.8 ([NGC PyTorch container 22.11](https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-22-11.html)).
 
 xformers provides a significant boost in performance and memory consumption allowing large batch sizes to maximise utilisation of GPUs. Our best performance comes using NVIDIA A100-SXM4-40GB on [Lambda GPU cloud](https://cloud.lambdalabs.com), at the maximum batch size tested (128) at half precision we observe a throughput of 1.85 images/second when using DDIM 30 steps for sampling.
 
diff --git a/docs/pictures/cost_analysis.png b/docs/pictures/cost_analysis.png
new file mode 100644
index 0000000..2b5a473
Binary files /dev/null and b/docs/pictures/cost_analysis.png differ
diff --git a/docs/pictures/sd_throughput_mig.png b/docs/pictures/sd_throughput_mig.png
new file mode 100644
index 0000000..5e813a1
Binary files /dev/null and b/docs/pictures/sd_throughput_mig.png differ
diff --git a/docs/pictures/sd_throughput_noxformer.png b/docs/pictures/sd_throughput_noxformer.png
new file mode 100644
index 0000000..baae962
Binary files /dev/null and b/docs/pictures/sd_throughput_noxformer.png differ