@@ -1,5 +1,5 @@
- Introduction to Torchrec
- ====================================================
+ Introduction to TorchRec
+ ========================

.. tip::
   To get the most out of this tutorial, we suggest using this
@@ -12,34 +12,34 @@ AI’s `Deep learning recommendation
model <https://arxiv.org/abs/1906.00091>`__, or DLRM. As the number of
entities grows, the size of the embedding tables can exceed a single
GPU’s memory. A common practice is to shard the embedding table across
- devices, a type of model parallelism. To that end, **torchRec introduces
+ devices, a type of model parallelism. To that end, TorchRec introduces
its primary API
- called **|DistributedModelParallel|_**,
- or DMP. Like pytorch’s DistributedDataParallel, DMP wraps a model to
- enable distributed training.**
+ called |DistributedModelParallel|_,
+ or DMP. Like PyTorch’s DistributedDataParallel, DMP wraps a model to
+ enable distributed training.
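
As a rough sketch of what that wrapping looks like, the snippet below assumes
a distributed process group is already available (e.g. one started by
torchrun); the table config is illustrative, not part of the original text:

.. code:: python

   import torch
   import torch.distributed as dist
   import torchrec

   # assumes the rendezvous environment variables set by a launcher
   # such as torchrun
   dist.init_process_group(backend="nccl")

   # any module containing TorchRec embedding tables can be wrapped
   ebc = torchrec.EmbeddingBagCollection(
       device=torch.device("meta"),  # tables get materialized by DMP
       tables=[
           torchrec.EmbeddingBagConfig(
               name="product_table",
               embedding_dim=64,
               num_embeddings=4096,
               feature_names=["product"],
               pooling=torchrec.PoolingType.SUM,
           )
       ],
   )

   # like DistributedDataParallel, DMP wraps the model; unlike DDP,
   # it shards the embedding tables across the available devices
   model = torchrec.distributed.DistributedModelParallel(
       ebc, device=torch.device("cuda")
   )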

- **Installation**
- --------------------
+ Installation
+ ------------

Requirements:
- python >= 3.7

- We highly recommend CUDA when using torchRec. If using CUDA:
+ We highly recommend CUDA when using TorchRec. If using CUDA:
- cuda >= 11.0


.. code:: shell

   # install pytorch with cudatoolkit 11.3
   conda install pytorch cudatoolkit=11.3 -c pytorch-nightly -y
-    # install torchrec
+    # install TorchRec
   pip3 install torchrec-nightly
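
After installing, a quick smoke test (an added suggestion, not part of the
original tutorial) can confirm that both packages import:

.. code:: python

   # confirm PyTorch and TorchRec resolve, and whether CUDA is visible
   import torch
   import torchrec

   print(torch.__version__)
   print(torch.cuda.is_available())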


- **Overview**
- ------------
+ Overview
+ --------

- This tutorial will cover three pieces of torchRec - the ``nn.module`` |EmbeddingBagCollection|_, the |DistributedModelParallel|_ API, and
+ This tutorial will cover three pieces of TorchRec: the ``nn.Module`` |EmbeddingBagCollection|_, the |DistributedModelParallel|_ API, and
the datastructure |KeyedJaggedTensor|_.


@@ -75,7 +75,7 @@
From EmbeddingBag to EmbeddingBagCollection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Pytorch represents embeddings through |torch.nn.Embedding|_ and |torch.nn.EmbeddingBag|_.
+ PyTorch represents embeddings through |torch.nn.Embedding|_ and |torch.nn.EmbeddingBag|_.
EmbeddingBag is a pooled version of Embedding.

TorchRec extends these modules by creating collections of embeddings. We
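
To make the contrast concrete, here is a small added sketch (the table and
feature names are invented for illustration):

.. code:: python

   import torch
   import torchrec

   # plain PyTorch: one pooled embedding table
   bag = torch.nn.EmbeddingBag(num_embeddings=4096, embedding_dim=64, mode="sum")

   # TorchRec: a collection of pooled tables, each tied to the
   # input features it serves
   ebc = torchrec.EmbeddingBagCollection(
       device=torch.device("cpu"),
       tables=[
           torchrec.EmbeddingBagConfig(
               name="product_table",
               embedding_dim=64,
               num_embeddings=4096,
               feature_names=["product"],
               pooling=torchrec.PoolingType.SUM,
           ),
           torchrec.EmbeddingBagConfig(
               name="user_table",
               embedding_dim=64,
               num_embeddings=4096,
               feature_names=["user"],
               pooling=torchrec.PoolingType.SUM,
           ),
       ],
   )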
@@ -121,7 +121,7 @@ Now, we’re ready to wrap our model with |DistributedModelParallel|_ (DMP). Ins
embedding table on the appropriate device(s).

In this toy example, since we have two EmbeddingTables and one GPU,
- torchRec will place both on the single GPU.
+ TorchRec will place both on the single GPU.

.. code:: python

@@ -161,7 +161,7 @@ Representing minibatches with KeyedJaggedTensor

We need an efficient representation of multiple examples of an arbitrary
number of entity IDs per feature per example. In order to enable this
- “jagged” representation, we use the torchRec datastructure
+ “jagged” representation, we use the TorchRec datastructure
|KeyedJaggedTensor|_ (KJT).
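
As a small added illustration (feature names and IDs invented), a KJT packs
variable-length ID lists into one flat values tensor plus per-feature,
per-example lengths:

.. code:: python

   import torch
   from torchrec import KeyedJaggedTensor

   # batch of 2 examples with two features:
   #   "product": example 0 -> [101, 202], example 1 -> [303]
   #   "user":    example 0 -> [404],      example 1 -> [505, 606]
   kjt = KeyedJaggedTensor(
       keys=["product", "user"],
       values=torch.tensor([101, 202, 303, 404, 505, 606]),
       lengths=torch.tensor([2, 1, 1, 2]),
   )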
Let’s take a look at **how to lookup a collection of two embedding