Comparing changes

base repository: GraphAlg/DeepSpeedExamples
base: master
head repository: deepspeedai/DeepSpeedExamples
compare: master
  • 17 commits
  • 252 files changed
  • 19 contributors

Commits on Apr 16, 2025

  1. 7b34e07
  2. Add example of DeepCompile (deepspeedai#967)

    * import files for deepcompile benchmark
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * add figures
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * add figures
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * update document
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * fix links to images
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * add images
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * specify deepspeed version
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    ---------
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    Co-authored-by: Olatunji Ruwase <[email protected]>
    tohtana and tjruwase authored Apr 16, 2025
    b76c7cc

Commits on Apr 17, 2025

  1. fix links (deepspeedai#970)

    tohtana authored Apr 17, 2025
    93ebac3

Commits on Apr 18, 2025

  1. Update description of versions for deepcompile (deepspeedai#971)

    * update description of versions for deepcompile
    
    * Update to match specific tag name
    
    Signed-off-by: Logan Adams <[email protected]>
    
    ---------
    
    Signed-off-by: Logan Adams <[email protected]>
    Co-authored-by: Logan Adams <[email protected]>
    tohtana and loadams authored Apr 18, 2025
    ce39bf0

Commits on Apr 20, 2025

  1. Fix DeepCompile benchmark script (deepspeedai#973)

    * update description of versions for deepcompile
    
    * fix deepcompile benchmark script
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * fix benchmark for z1
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    * add options for deepcompile bench
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    
    ---------
    
    Signed-off-by: Masahiro Tanaka <[email protected]>
    tohtana authored Apr 20, 2025
    65bc536

Commits on May 23, 2025

  1. Add example for Deepspeed-AutoTP (deepspeedai#964)

    * update tp example
    
    Signed-off-by: inkcherry <[email protected]>
    
    * update
    
    Signed-off-by: inkcherry <[email protected]>
    
    * add length bench file
    
    Signed-off-by: inkcherry <[email protected]>
    
    ---------
    
    Signed-off-by: inkcherry <[email protected]>
    Co-authored-by: Hongwei Chen <[email protected]>
    inkcherry and hwchen2017 authored May 23, 2025
    bd47e5b

Commits on Jun 9, 2025

  1. 86aeab2
  2. DeepNVMe update (deepspeedai#966)

    * Fast model checkpointing
    
    * Support both legacy and serialized formats
    
    * Add io_buffer_mb option
    
    * Bug fix
    
    * Force flush
    
    * More model options; Refactor common codes
    
    * --gpu option
    
    * --half and more flexible options
    
    * Add deepspeed.save_checkpoint()
    
    * Free ds memory
    
    * Improve repro
    
    * Double I/O buffer (deepspeedai#56)
    
    * Double I/O buffer (deepspeedai#60)
    
    * Add checkpoint comparison (deepspeedai#62)
    
    * Add checkpoint comparison
    
    * Corrected a typo
    
    Co-authored-by: Yang Li <[email protected]>
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * Perf statistics for save_checkpoint (deepspeedai#64)
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * add logs for a100-80
    
    * add torch* error log with half flag but without fused flag
    
    * log for error
    
    * local rank arg
    
    * Handle local_rank arg (deepspeedai#78)
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * local rank arg
    
    * Single writer option
    
    * Single writer option (deepspeedai#79)
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * local rank arg
    
    * Single writer option
    
    * Allow missing folder
    
    * DP writer refactor
    
    * Update for DS; Add GDS
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * Integrate GDS into deepspeed_model_save
    
    * Rebase fast persist (deepspeedai#184)
    
    * Fast model checkpointing
    
    * Support both legacy and serialized formats
    
    * Add io_buffer_mb option
    
    * Bug fix
    
    * Force flush
    
    * More model options; Refactor common codes
    
    * --gpu option
    
    * --half and more flexible options
    
    * Add deepspeed.save_checkpoint()
    
    * Free ds memory
    
    * Improve repro
    
    * Double I/O buffer (deepspeedai#56)
    
    * Double I/O buffer (deepspeedai#60)
    
    * Add checkpoint comparison (deepspeedai#62)
    
    * Add checkpoint comparison
    
    * Corrected a typo
    
    Co-authored-by: Yang Li <[email protected]>
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * Perf statistics for save_checkpoint (deepspeedai#64)
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * add logs for a100-80
    
    * add torch* error log with half flag but without fused flag
    
    * log for error
    
    * local rank arg
    
    * Handle local_rank arg (deepspeedai#78)
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * local rank arg
    
    * Single writer option
    
    * Single writer option (deepspeedai#79)
    
    * save_checkpoint perf monitoring
    
    * Disable checkpoint save on exit
    
    * local rank arg
    
    * Single writer option
    
    * Allow missing folder
    
    * DP writer refactor
    
    * Update for DS; Add GDS
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * Integrate GDS into deepspeed_model_save
    
    ---------
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    Co-authored-by: jerryyangli <[email protected]>
    Co-authored-by: Yang Li <[email protected]>
    Co-authored-by: GuanhuaWang <[email protected]>
    
    * Move folder
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * Remove folder
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * More cleanup
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * torch changes
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * sglang+zero_inference
    
    * Remove file
    
    * Add offload configs
    
    * Add pin_memory
    
    * Cleanup scripts
    
    * SGLang README
    
    * Remove file
    
    ---------
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    Co-authored-by: jerryyangli <[email protected]>
    Co-authored-by: Yang Li <[email protected]>
    Co-authored-by: GuanhuaWang <[email protected]>
    Co-authored-by: Logan Adams <[email protected]>
    Co-authored-by: Hongwei Chen <[email protected]>
    Co-authored-by: Zhipeng Wang <[email protected]>
    7 people authored Jun 9, 2025
    207c93c

Commits on Jun 12, 2025

  1. Update domino example (deepspeedai#976)

    * remove files
    
    Signed-off-by: Hongwei Chen <[email protected]>
    
    * Update domino example
    
    Signed-off-by: Hongwei Chen <[email protected]>
    
    * apply review suggestions
    
    Signed-off-by: Hongwei Chen <[email protected]>
    
    ---------
    
    Signed-off-by: Hongwei Chen <[email protected]>
    hwchen2017 authored Jun 12, 2025
    b018de1

Commits on Jun 18, 2025

  1. Simplify and add README (deepspeedai#978)

    Signed-off-by: Olatunji Ruwase <[email protected]>
    tjruwase authored Jun 18, 2025
    28a984e

Commits on Jun 21, 2025

  1. Add file extension (deepspeedai#980)

    Signed-off-by: Hongwei Chen <[email protected]>
    hwchen2017 authored Jun 21, 2025
    b99d653

Commits on Jul 4, 2025

  1. 4579df3

Commits on Jul 8, 2025

  1. fix init weights issue for critic/reward model (deepspeedai#983)

    * Add file extension (deepspeedai#980)
    
    Signed-off-by: Hongwei Chen <[email protected]>
    Signed-off-by: jouw <[email protected]>
    
    * fix init weights issue for critic/reward model
    
    Signed-off-by: jouw <[email protected]>
    
    * Update submodule link to reflect https style (deepspeedai#981)
    
    Signed-off-by: raviguptaamd <[email protected]>
    Signed-off-by: jouw <[email protected]>
    
    * fix formatting issue
    
    Signed-off-by: jouw <[email protected]>
    
    ---------
    
    Signed-off-by: Hongwei Chen <[email protected]>
    Signed-off-by: jouw <[email protected]>
    Signed-off-by: raviguptaamd <[email protected]>
    Co-authored-by: Hongwei Chen <[email protected]>
    Co-authored-by: raviguptaamd <[email protected]>
    3 people authored Jul 8, 2025
    3d83278

Commits on Aug 16, 2025

  1. Add Benchmarking and Fine-Tuning Support for ZenFlow (deepspeedai#982)

    * Add benchmark scripts and README for ZenFlow
    
    - Introduced `zf_benchmark.py` for model offloading benchmarking with DeepSpeed.
    - Added `output_table.py` to parse and display benchmark results in a tabular format.
    - Created `run_benchmark.sh` to automate benchmark runs with various configurations.
    
    Signed-off-by: Tingfeng Lan <[email protected]>
    
    * Add Llama-2 fine-tuning scripts and configuration for ZenFlow
    
    - Introduced `finetune_llama.py` for fine-tuning the Llama-2 model using DeepSpeed and ZenFlow.
    - Added `finetune_llama.sh` for automated training setup with environment variables and DeepSpeed command.
    - Added `zf_config.json` example for DeepSpeed configuration with ZenFlow optimizations.
    
    Signed-off-by: Tingfeng Lan <[email protected]>
    Co-authored-by: Yusen Wu <[email protected]>
    
    * Add explanation tips for interpreting benchmark results in README
    
    Signed-off-by: Tingfeng Lan <[email protected]>
    
    * Add guidance on step/latency interpretation
    
    Signed-off-by: Tingfeng Lan <[email protected]>
    
    ---------
    
    Signed-off-by: Tingfeng Lan <[email protected]>
    Co-authored-by: Yusen Wu <[email protected]>
    Antlera and JoshWoo2003 authored Aug 16, 2025
    b4385e5

Commits on Aug 20, 2025

  1. 01f520e

Commits on Sep 29, 2025

  1. Superoffload examples (deepspeedai#990)

    * feat: add examples for superoffload
    
    * fix: typo
    
    * fix: remove hardcoded GPU bind
    
    * feat: add requirement for superoffload
    xylian86 authored Sep 29, 2025
    ad2a4bd

Commits on Oct 15, 2025

  1. DeepNVMe benchmarks (deepspeedai#991)

    * ds_io sweep scripts
    
    * Use accelerator pin memory
    
    * Credit
    
    * Add README
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    * Add README
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    
    ---------
    
    Signed-off-by: Olatunji Ruwase <[email protected]>
    Co-authored-by: Olatunji Ruwase <[email protected]>
    sfc-gh-truwase and tjruwase authored Oct 15, 2025
    e676aa3