
Commit 71ab29d

jeffra and samyam authored
remove distributed requirement from model building (deepspeedai#31)
* remove distributed requirement from model building; this is needed for MPI/AML support for this model

* Update modelingpreln.py: removed the print, which would otherwise fire many times per run once the rank check was removed

Co-authored-by: Samyam Rajbhandari <[email protected]>
1 parent 9e2c34e commit 71ab29d
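The print removal follows from dropping the rank check: once torch.distributed is no longer guaranteed to be initialized at model-build time, get_rank() cannot be called, and an unguarded print would fire once per process. A minimal sketch of an alternative that prints once without requiring an initialized process group (helper name is illustrative, not part of this commit):

import os

def print_rank_0(msg):
    # Read the launcher-provided RANK env var instead of calling
    # torch.distributed.get_rank(), which requires an initialized
    # process group; defaults to rank 0 for single-process runs.
    if int(os.environ.get("RANK", "0")) == 0:
        print(msg)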


2 files changed: +3 −7 lines


bing_bert/deepspeed_train.py

Lines changed: 0 additions & 3 deletions
@@ -368,9 +368,6 @@ def prepare_optimizer_parameters(args, model):
 
 
 def prepare_model_optimizer(args):
-    # Initialize torch distributed
-    torch.distributed.init_process_group(backend="nccl")
-
     # Loading Model
     model = BertMultiTask(args)
 
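With the hard-coded NCCL initialization gone, model construction no longer assumes a torch.distributed launcher, which is what allows MPI/AML-launched jobs to build the model before (or without) a process group. Callers that still need distributed training can initialize it outside model building; a minimal sketch of such a guard (helper name is hypothetical, not from this commit):

import torch

def init_distributed_if_needed(backend="nccl"):
    # Initialize torch.distributed only when it is available and no
    # process group has been created yet (e.g. by the training engine
    # or launcher); single-process runs skip this entirely.
    if torch.distributed.is_available() and not torch.distributed.is_initialized():
        torch.distributed.init_process_group(backend=backend)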

bing_bert/nvidia/modelingpreln.py

Lines changed: 3 additions & 4 deletions
@@ -739,10 +739,9 @@ def init_bert_weights(self, module):
             num_layers = self.config.num_hidden_layers
             std = self.config.initializer_range
             if hasattr(module, 'bert_output_layer'):
-                if torch.distributed.get_rank() == 0:
-                    print("Accounting for accumulation on the residual path")
-                std = self.config.initializer_range / math.sqrt(
-                    2.0 * num_layers)
+                #print("Accounting for accumulation on the residual path")
+                std = self.config.initializer_range / math.sqrt(
+                    2.0 * num_layers)
             module.weight.data.normal_(mean=0.0, std=std)
         elif isinstance(module, BertLayerNorm):
             module.bias.data.zero_()
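The surviving logic is the depth-aware initialization: modules flagged as bert_output_layer feed the residual stream, so their weights are drawn with a smaller standard deviation to keep the accumulated variance bounded as depth grows. A standalone sketch of the same computation (function name and usage are illustrative, not from the commit):

import math
import torch.nn as nn

def scaled_init_std(initializer_range, num_hidden_layers):
    # Account for accumulation on the residual path: shrink the init
    # std by sqrt(2.0 * num_layers), since each transformer layer adds
    # two sublayer outputs (attention and MLP) into the residual stream.
    return initializer_range / math.sqrt(2.0 * num_hidden_layers)

# Illustrative usage on an output projection:
proj = nn.Linear(768, 768)
proj.weight.data.normal_(mean=0.0, std=scaled_init_std(0.02, 24))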
