Parallelism

Parallelism strategies help speed up diffusion transformers by distributing computation across multiple devices, enabling faster inference and training. Refer to the Distributed inference guide to learn more.
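As a quick orientation, here is a minimal sketch of how the configs documented below are typically attached to a model. It assumes a `torchrun` launch on two GPUs and that the model exposes the `enable_parallelism` helper described in the Distributed inference guide; the checkpoint name is illustrative.

```python
# Launch with: torchrun --nproc-per-node=2 cp_example.py
import torch
from diffusers import AutoModel, ContextParallelConfig, ParallelConfig

torch.distributed.init_process_group("nccl")
rank = torch.distributed.get_rank()
device = torch.device("cuda", rank % torch.cuda.device_count())
torch.cuda.set_device(device)

# Illustrative checkpoint; any diffusion transformer that ships a
# context-parallel plan should work the same way.
transformer = AutoModel.from_pretrained(
    "Qwen/Qwen-Image", subfolder="transformer", torch_dtype=torch.bfloat16
).to(device)

# Shard attention across the 2 launched processes with ring attention.
transformer.enable_parallelism(
    config=ParallelConfig(context_parallel_config=ContextParallelConfig(ring_degree=2))
)
```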

ParallelConfig

class diffusers.ParallelConfig


( context_parallel_config: typing.Optional[diffusers.models._modeling_parallel.ContextParallelConfig] = None _rank: int = None _world_size: int = None _device: device = None _cp_mesh: DeviceMesh = None )

Parameters

  • context_parallel_config (ContextParallelConfig, optional) — Configuration for context parallelism.

Configuration for applying different parallelism strategies.
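Only context_parallel_config is meant to be passed by users; the underscore-prefixed fields in the signature above are populated internally once parallelism is enabled. A minimal construction sketch:

```python
from diffusers import ContextParallelConfig, ParallelConfig

# Context parallelism is currently the only strategy configurable here.
config = ParallelConfig(
    context_parallel_config=ContextParallelConfig(ulysses_degree=2)
)
```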

ContextParallelConfig

class diffusers.ContextParallelConfig


( ring_degree: typing.Optional[int] = None ulysses_degree: typing.Optional[int] = None convert_to_fp32: bool = True rotate_method: typing.Literal['allgather', 'alltoall'] = 'allgather' _rank: int = None _world_size: int = None _device: device = None _mesh: DeviceMesh = None _flattened_mesh: DeviceMesh = None _ring_mesh: DeviceMesh = None _ulysses_mesh: DeviceMesh = None _ring_local_rank: int = None _ulysses_local_rank: int = None )

Parameters

  • ring_degree (int, optional, defaults to 1) — Number of devices to use for ring attention within a context parallel region. Must be a divisor of the total number of devices in the context parallel mesh.
  • ulysses_degree (int, optional, defaults to 1) — Number of devices to use for Ulysses attention within a context parallel region. Must be a divisor of the total number of devices in the context parallel mesh.
  • convert_to_fp32 (bool, optional, defaults to True) — Whether to convert the attention output and log-sum-exp (LSE) to float32 for numerical stability in ring attention.
  • rotate_method (str, optional, defaults to "allgather") — Method used to rotate key/value states across devices in ring attention. Currently, only "allgather" is supported.

Configuration for context parallelism.
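ring_degree and ulysses_degree compose over the context-parallel device mesh, so (assuming a 2D ring-by-Ulysses mesh) their product should equal the number of participating devices. A sketch of the three common layouts for 4 devices:

```python
from diffusers import ContextParallelConfig

# Pure ring attention: key/value blocks rotate around all 4 devices.
ring_only = ContextParallelConfig(ring_degree=4)

# Pure Ulysses attention: sequence/head exchange via all-to-all-style
# collectives across all 4 devices.
ulysses_only = ContextParallelConfig(ulysses_degree=4)

# Hybrid: 2-way ring nested inside 2-way Ulysses (2 x 2 = 4 devices).
hybrid = ContextParallelConfig(ring_degree=2, ulysses_degree=2)
```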

diffusers.hooks.apply_context_parallel


( module: Module parallel_config: ContextParallelConfig plan: typing.Dict[str, typing.Dict[str, typing.Union[typing.Dict[typing.Union[str, int], typing.Union[diffusers.models._modeling_parallel.ContextParallelInput, typing.List[diffusers.models._modeling_parallel.ContextParallelInput], typing.Tuple[diffusers.models._modeling_parallel.ContextParallelInput, ...]]], diffusers.models._modeling_parallel.ContextParallelOutput, typing.List[diffusers.models._modeling_parallel.ContextParallelOutput], typing.Tuple[diffusers.models._modeling_parallel.ContextParallelOutput, ...]]]] )

Apply context parallelism to a model.
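This is the lower-level hook beneath the model-level entry point: the plan maps submodule names to sharding metadata for their inputs and outputs. A hypothetical sketch follows; the module names and dimension values are illustrative, and the distributed mesh setup normally performed by the higher-level API is omitted.

```python
from diffusers import ContextParallelConfig
from diffusers.hooks import apply_context_parallel
from diffusers.models._modeling_parallel import (
    ContextParallelInput,
    ContextParallelOutput,
)

# Hypothetical plan: "blocks.0" and "proj_out" must match real submodule
# names in your model. Inputs are split along the sequence dimension
# (dim 1 of a [batch, seq, hidden] tensor) on entry; the final output is
# gathered back along the same dimension.
plan = {
    "blocks.0": {
        "hidden_states": ContextParallelInput(split_dim=1, expected_dims=3),
    },
    "proj_out": ContextParallelOutput(gather_dim=1, expected_dims=3),
}

config = ContextParallelConfig(ring_degree=2)

# Normally invoked for you when enabling parallelism on a model; calling it
# directly requires the config's device mesh to already be initialized.
# apply_context_parallel(model, parallel_config=config, plan=plan)
```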
