Comparing changes
base repository: GraphAlg/DeepSpeedExamples
base: master
head repository: deepspeedai/DeepSpeedExamples
compare: master
- 17 commits
- 252 files changed
- 19 contributors
Commits on Apr 16, 2025
Commit 7b34e07
Add example of DeepCompile (deepspeedai#967)
* import files for deepcompile benchmark
* add figures
* add figures
* update document
* fix links to images
* add images
* specify deepspeed version
---------
Signed-off-by: Masahiro Tanaka <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Commit b76c7cc
Commits on Apr 17, 2025
Commit 93ebac3
Commits on Apr 18, 2025
Update description of versions for deepcompile (deepspeedai#971)
* update description of versions for deepcompile
* Update to match specific tag name
---------
Signed-off-by: Logan Adams <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Commit ce39bf0
Commits on Apr 20, 2025
Fix DeepCompile benchmark script (deepspeedai#973)
* update description of versions for deepcompile
* fix deepcompile benchmark script
* fix benchmark for z1
* add options for deepcompile bench
---------
Signed-off-by: Masahiro Tanaka <[email protected]>
Commit 65bc536
Commits on May 23, 2025
Add example for Deepspeed-AutoTP (deepspeedai#964)
* update tp example
* update
* add length bench file
---------
Signed-off-by: inkcherry <[email protected]>
Co-authored-by: Hongwei Chen <[email protected]>
Commit bd47e5b
Commits on Jun 9, 2025
fix: Fix: Correctly define choices as tuple for reward-model arg Fixes deepspeedai#941 (deepspeedai#974)
Signed-off-by: Vensenmu <[email protected]>
Commit 86aeab2
DeepNVMe update (deepspeedai#966)
* Fast model checkpointing
* Support both legacy and serialized formats
* Add io_buffer_mb option
* Bug fix
* Force flush
* More model options; Refactor common codes
* --gpu option
* --half and more flexible options
* Add deepspeed.save_checkpoint()
* Free ds memory
* Improve repro
* Double I/O buffer (deepspeedai#56)
* Double I/O buffer (deepspeedai#60)
* Add checkpoint comparison (deepspeedai#62)
* Add checkpoint comparison
* Corrected a typo
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* Perf statistics for save_checkpoint (deepspeedai#64)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* add logs for a100-80
* add torch* error log with half flag but without fused flag
* log for error
* local rank arg
* Handle local_rank arg (deepspeedai#78)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Single writer option (deepspeedai#79)
* save_checkpoint perf monitoring
* Disable checkpoint save on exit
* local rank arg
* Single writer option
* Allow missing folder
* DP writer refactor
* Update for DS; Add GDS
* Integrate GDS into deepspeed_model_save
* Rebase fast persist (deepspeedai#184)
* Move folder
* Remove folder
* More cleanup
* torch changes
* sglang+zero_inference
* Remove file
* Add offload configs
* Add pin_memory
* Cleanup scripts
* SGLang README
* Remove file
---------
Signed-off-by: Olatunji Ruwase <[email protected]>
Co-authored-by: jerryyangli <[email protected]>
Co-authored-by: Yang Li <[email protected]>
Co-authored-by: GuanhuaWang <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Co-authored-by: Hongwei Chen <[email protected]>
Co-authored-by: Zhipeng Wang <[email protected]>
Commit 207c93c
Commits on Jun 12, 2025
Update domino example (deepspeedai#976)
* remove files
* Update domino example
* apply review suggestions
---------
Signed-off-by: Hongwei Chen <[email protected]>
Commit b018de1
Commits on Jun 18, 2025
Simplify and add README (deepspeedai#978)
Signed-off-by: Olatunji Ruwase <[email protected]>
Commit 28a984e
Commits on Jun 21, 2025
Add file extension (deepspeedai#980)
Signed-off-by: Hongwei Chen <[email protected]>
Commit b99d653
Commits on Jul 4, 2025
Update submodule link to reflect https style (deepspeedai#981)
Signed-off-by: raviguptaamd <[email protected]>
Commit 4579df3
Commits on Jul 8, 2025
fix init weights issue for critic/reward model (deepspeedai#983)
* Add file extension (deepspeedai#980)
* fix init weights issue for critic/reward model
* Update submodule link to reflect https style (deepspeedai#981)
* fix formatting issue
---------
Signed-off-by: Hongwei Chen <[email protected]>
Signed-off-by: jouw <[email protected]>
Signed-off-by: raviguptaamd <[email protected]>
Co-authored-by: Hongwei Chen <[email protected]>
Co-authored-by: raviguptaamd <[email protected]>
Commit 3d83278
Commits on Aug 16, 2025
Add Benchmarking and Fine-Tuning Support for ZenFlow (deepspeedai#982)
* Add benchmark scripts and README for ZenFlow
  - Introduced `zf_benchmark.py` for model offloading benchmarking with DeepSpeed.
  - Added `output_table.py` to parse and display benchmark results in a tabular format.
  - Created `run_benchmark.sh` to automate benchmark runs with various configurations.
* Add Llama-2 fine-tuning scripts and configuration for ZenFlow
  - Introduced `finetune_llama.py` for fine-tuning the Llama-2 model using DeepSpeed and ZenFlow.
  - Added `finetune_llama.sh` for automated training setup with environment variables and DeepSpeed command.
  - Added `zf_config.json` example for DeepSpeed configuration with ZenFlow optimizations.
* Add explanation tips for interpreting benchmark results in README
* Add guidance on step/latency interpretation
---------
Signed-off-by: Tingfeng Lan <[email protected]>
Co-authored-by: Yusen Wu <[email protected]>
Commit b4385e5
Commits on Aug 20, 2025
Fix README for LLaMA-2 fine-tuning with ZenFlow. (deepspeedai#987)
Signed-off-by: Tingfeng Lan <[email protected]>
Commit 01f520e
Commits on Sep 29, 2025
Superoffload examples (deepspeedai#990)
* feat: add examples for superoffload
* fix: typo
* fix: remove hardcoded GPU bind
* feat: add requirement for superoffload
Commit ad2a4bd
Commits on Oct 15, 2025
DeepNVMe benchmarks (deepspeedai#991)
* ds_io sweep scripts
* Use accelerator pin memory
* Credit
* Add README
* Add README
---------
Signed-off-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Commit e676aa3