This repository was archived by the owner on Aug 15, 2019. It is now read-only.

Add dropout. #1343

Merged: 64 commits merged into tensorflow:master on Apr 30, 2019

Conversation

VariableVasasMT
Contributor

@VariableVasasMT VariableVasasMT commented Oct 25, 2018

Description

Added dropout to ops and moved getScalar into core's ops as well. I will open another pull request to implement a non-default noiseShape.

Fixes tensorflow/tfjs#117


For repository owners only:

Please remember to apply all applicable tags to your pull request.
Tags: FEATURE, BREAKING, BUG, PERF, DEV, DOC, SECURITY

For more info see: https://github.com/tensorflow/tfjs/blob/master/DEVELOPMENT.md



Contributor

@nsthorat nsthorat left a comment

Can you add some unit tests for this?

Thanks for the awesome PR!

Reviewable status: 0 of 1 approvals obtained (waiting on @nsthorat and @VariableVasasMT)


src/ops/dropout.ts, line 1 at r1 (raw file):

import {keep, tidy} from '../globals';

add the license at the top of this file


src/ops/dropout.ts, line 20 at r1 (raw file):

Previously, VariableVasasMT wrote…

Moved this to core ops as well; I might move it to a separate file.

We don't need a scalar cache here at all.


src/ops/dropout.ts, line 33 at r1 (raw file):

/**
 * Sets entries in `x` to zero at random, while scaling the entire tensor.
 *

a code snippet would be great :)
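
For illustration, the kind of usage snippet being requested might look like the following (the op's public name and signature here are assumptions; see the namespace discussion later in this review):

```ts
import * as tf from '@tensorflow/tfjs-core';

// Drop each element with probability `level`, rescaling the survivors so the
// expected sum of the tensor is unchanged.
const x = tf.tensor2d([[1, 2], [3, 4]]);
const level = tf.scalar(0.5);      // drop probability, passed as a Scalar per this PR
const y = tf.dropout(x, level);    // assumed export name
y.print();
```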


src/ops/dropout.ts, line 41 at r1 (raw file):

 * @returns Result of the dropout operation.
 */

remove this empty line


src/ops/dropout.ts, line 44 at r1 (raw file):

function dropout_(
    x: Tensor, level: Scalar, noiseShape?: number[], seed?: number): Tensor {
  return tidy(() => {

remove this tidy, the op() wrapper does that


src/ops/dropout.ts, line 45 at r1 (raw file):

    x: Tensor, level: Scalar, noiseShape?: number[], seed?: number): Tensor {
  return tidy(() => {
    // TODO(cais): Switch to deeplearn.js implementation of dropout when it

remove this comment


src/ops/dropout.ts, line 56 at r1 (raw file):

    }
    let multiplier = step(
        add(neg(level) as Scalar, randomUniform(x.shape, 0, 1, 'float32')));

step(sub(randomUniform(x.shape, 0, 1, 'float32'), level))


src/ops/dropout.ts, line 59 at r1 (raw file):

    // Scale the kept elements, so the expected sum is unchanged.
    multiplier =
        mul(div(getScalar(1), sub(getScalar(1), level)) as Scalar, multiplier);

multiplier = (div(1, sub(1, level)) as Scalar).mul(multiplier)


src/ops/dropout.ts, line 60 at r1 (raw file):

    multiplier =
        mul(div(getScalar(1), sub(getScalar(1), level)) as Scalar, multiplier);
    return mul(x, multiplier);

return x.mul(multiplier);


src/ops/dropout.ts, line 65 at r1 (raw file):

export const dropout = op({dropout_});
export const getScalar = op({getScalar_});

remove this

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @VariableVasasMT)


src/ops/dropout.ts, line 56 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

step(sub(randomUniform(x.shape, 0, 1, 'float32'), level))

I have been reading for two days, but I can't understand the reason for add(neg(level)) rather than just level. Could you point me to a source I can read so that I understand it better?

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

I will be adding tests soon.

Reviewable status: 0 of 1 approvals obtained (waiting on @VariableVasasMT)


src/ops/dropout.ts, line 56 at r1 (raw file):

Previously, VariableVasasMT (VariableVasas) wrote…

I have been reading for two days, but I can't understand the reason for add(neg(level)) rather than just level. Could you point me to a source I can read so that I understand it better?

I read http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf and found that the random array generation uses a Bernoulli distribution. After reading the code at https://github.com/numpy/numpy/blob/v1.13.3/numpy/random/mtrand/distributions.c#L442, I know f(X=k) = (p^k)*((1-p)^(n-k)), and I get that randomUniform generates the numbers for the multiplier, but I don't understand the sub() and step() application.

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 3 of 3 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @nsthorat)

@nsthorat changed the title from "https://github.com/tensorflow/tfjs/issues/117: added dropout to core" to "Add dropout." on Oct 29, 2018
Collaborator

@caisq caisq left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @nsthorat)


src/ops/dropout.ts, line 56 at r1 (raw file):

Previously, VariableVasasMT (VariableVasas) wrote…

I read http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf and found that the random array generation uses a Bernoulli distribution. After reading the code at https://github.com/numpy/numpy/blob/v1.13.3/numpy/random/mtrand/distributions.c#L442, I know f(X=k) = (p^k)*((1-p)^(n-k)), and I get that randomUniform generates the numbers for the multiplier, but I don't understand the sub() and step() application.

I think the math behind dropout is very simple. The level parameter is just a Bernoulli probability of an element of the tensor being set to 0. By doing sub with a [0, 1]-uniform distributed set of numbers and level, we make a level fraction of the elements negative, in expectation. Then the step sets the value of those negative elements to zero and the rest to one.

See the implementation in TensorFlow (Python), which uses floor but is mathematically equivalent:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py#L2323
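
As an illustration of the math described above, here is a minimal sketch using the tfjs-core ops mentioned in this review (not the PR's exact code):

```ts
import * as tf from '@tensorflow/tfjs-core';

const level = 0.25;                 // probability of dropping an element
const x = tf.ones([4, 4]);

// u - level is negative for roughly a `level` fraction of the elements
// (in expectation); step() maps negative values to 0 and the rest to 1,
// which gives the keep mask.
const u = tf.randomUniform(x.shape, 0, 1, 'float32');
const mask = tf.step(tf.sub(u, level));

// Scale the kept elements by 1 / (1 - level) so the expected sum is unchanged.
const y = tf.mul(tf.div(x, 1 - level), mask);
```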

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: 0 of 1 approvals obtained


src/ops/dropout.ts, line 56 at r1 (raw file):

Previously, caisq (Shanqing Cai) wrote…

I think the math behind dropout is very simple. The level parameter is just a Bernoulli probability of an element of the tensor being set to 0. By doing sub with a [0, 1]-uniform distributed set of numbers and level, we make a level fraction of the elements negative, in expectation. Then the step sets the value of those negative elements to zero and the rest to one.

See the implementation in TensorFlow (Python), which uses floor but is mathematically equivalent:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py#L2323

Thank you, I really appreciate it.

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 1 of 1 files at r1.
Reviewable status: 0 of 1 approvals obtained

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 1 of 1 files at r3.
Reviewable status: 0 of 1 approvals obtained

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 2 of 3 files at r2, 1 of 1 files at r3.
Reviewable status: 0 of 1 approvals obtained

Collaborator

@caisq caisq left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @VariableVasasMT)


src/ops/dropout.ts, line 41 at r3 (raw file):

function dropout_(

What will be the public namespace of this new function? TensorFlow (Python) has tf.nn.dropout. Should we try to match that here, @nsthorat ?


src/ops/dropout.ts, line 44 at r3 (raw file):

 TODO: implement non default noise shape (VariableVasasMT)

nit: Use TODO formatting: TODO(VariableVasasMT): ...


src/ops/dropout.ts, line 54 at r3 (raw file):

(div(1, sub(1, level)) as Scalar).mul(multiplier)

Can this be simplified as multiplier.div(sub(1, level) as Scalar)?

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @VariableVasasMT and @caisq)


src/ops/dropout.ts, line 44 at r3 (raw file):

Previously, caisq (Shanqing Cai) wrote…
 TODO: implement non default noise shape (VariableVasasMT)

nit: Use TODO formatting: TODO(VariableVasasMT): ...

Done.


src/ops/dropout.ts, line 54 at r3 (raw file):

Previously, caisq (Shanqing Cai) wrote…
(div(1, sub(1, level)) as Scalar).mul(multiplier)

Can this be simplified as multiplier.div(sub(1, level) as Scalar)?

I think multiplying a scalar across all the tensor indices is cheaper than division in terms of computational cost, so I think we should stick with multiplication, especially with floating-point calculations. I know it's an unnecessary optimization, but given how widely the library is used, I think we should be careful.

Contributor

@nsthorat nsthorat left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @VariableVasasMT and @caisq)


src/ops/dropout.ts, line 54 at r3 (raw file):

Previously, VariableVasasMT (VariableVasas) wrote…

I think multiplying a scalar across all the tensor indices is cheaper than division in terms of computational cost, so I think we should stick with multiplication, especially with floating-point calculations. I know it's an unnecessary optimization, but given how widely the library is used, I think we should be careful.

Why would this be cheaper? In this implementation we have an extra multiplication shader, which is an extra draw call. I agree with @caisq's request :)

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 1 of 1 files at r4.
Reviewable status: 0 of 1 approvals obtained (waiting on @VariableVasasMT and @caisq)


src/ops/dropout.ts, line 54 at r3 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

Why would this be cheaper? In this implementation we have an extra multiplication shader, which is an extra draw call. I agree with @caisq's request :)

Fair enough. I thought it would be cheaper because in the end we are either multiplying or dividing each index of a tensor (e.g. const res = dy.div($b.toFloat());), so dividing once and then multiplying repeatedly might be cheaper.

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @caisq)


src/ops/dropout.ts, line 41 at r3 (raw file):

Previously, caisq (Shanqing Cai) wrote…
unction dropout_(

What will be the public namespace of this new function? TensorFlow (Python) has tf.nn.dropout. Should we try to match that here, @nsthorat ?

@nsthorat I don't have any idea about this; it's your call.

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @caisq)


src/ops/dropout.ts, line 54 at r3 (raw file):

Previously, VariableVasasMT (VariableVasas) wrote…

Fair enough. I thought it would be cheaper because in the end we are either multiplying or dividing each index of a tensor (e.g. const res = dy.div($b.toFloat());), so dividing once and then multiplying repeatedly might be cheaper.

I will test it for large tensors and see.

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @caisq)


src/ops/dropout.ts, line 54 at r3 (raw file):

Previously, VariableVasasMT (VariableVasas) wrote…

I will test it for large tensors and see.


*************** with tensor x = tf.range(1, 4000001).reshape([2000000, 2]); *******************


experiment with:
multiplier = multiplier.div(sub(1, level) as Scalar);
Chrome 70.0.3538 (Linux 0.0.0) SLOW 1.043 secs: ModelManagement test-webgl2 {"WEBGL_VERSION":2} Failed copyModel
LOG: 351
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.833 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0
LOG: 333
Chrome 70.0.3538 (Linux 0.0.0) SLOW 1.472 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0.75
LOG: 201
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.306 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0
LOG: 117
Chrome 70.0.3538 (Linux 0.0.0) SLOW 1.218 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0.75
LOG: 510
Chrome 70.0.3538 (Linux 0.0.0) SLOW 10.455 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0
LOG: 1398

experiment with:
multiplier = (div(1, sub(1, level)) as Scalar).mul(multiplier);
LOG: 357
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.954 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0
LOG: 306
Chrome 70.0.3538 (Linux 0.0.0) SLOW 1.543 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0.75
LOG: 200
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.08 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0
LOG: 117
Chrome 70.0.3538 (Linux 0.0.0) SLOW 1.247 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0.75
LOG: 488
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.151 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0
LOG: 965
Chrome 70.0.3538 (Linux 0.0.0) SLOW 1.775 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0.75


***************** with tensor x = tf.range(1, 8000001).reshape([4000000, 2]); *****************


experiment with:
multiplier = multiplier.div(sub(1, level) as Scalar);

Chrome 70.0.3538 (Linux 0.0.0) SLOW 7.747 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0
LOG: 584
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.039 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0.75
LOG: 309
Chrome 70.0.3538 (Linux 0.0.0) SLOW 6.898 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0
LOG: 252
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.061 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0.75
LOG: 1542
Chrome 70.0.3538 (Linux 0.0.0) SLOW 11.635 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0
LOG: 4030
Chrome 70.0.3538 (Linux 0.0.0) SLOW 9.814 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0.75

|||||||||||||||||||||||||||||||||||||||||||||||||||||

LOG: 599
Chrome 70.0.3538 (Linux 0.0.0) SLOW 6.615 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0
LOG: 542
Chrome 70.0.3538 (Linux 0.0.0) SLOW 2.858 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0.75
LOG: 297
Chrome 70.0.3538 (Linux 0.0.0) SLOW 6.354 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0
LOG: 259
Chrome 70.0.3538 (Linux 0.0.0) SLOW 2.358 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0.75
LOG: 1122
Chrome 70.0.3538 (Linux 0.0.0) SLOW 10.62 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0
LOG: 4091
Chrome 70.0.3538 (Linux 0.0.0) SLOW 11.404 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0.75

experiment with:
multiplier = (div(1, sub(1, level)) as Scalar).mul(multiplier);

LOG: 597
Chrome 70.0.3538 (Linux 0.0.0) SLOW 6.719 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0
LOG: 554
Chrome 70.0.3538 (Linux 0.0.0) SLOW 2.892 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0.75
LOG: 311
Chrome 70.0.3538 (Linux 0.0.0) SLOW 6.256 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0
LOG: 248
Chrome 70.0.3538 (Linux 0.0.0) SLOW 3.131 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0.75
LOG: 1484
Chrome 70.0.3538 (Linux 0.0.0) SLOW 12.263 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0
LOG: 3697
Chrome 70.0.3538 (Linux 0.0.0) SLOW 9.338 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0.75

|||||||||||||||||||||||||||||||||||||||||||||||||||||

LOG: 606
Chrome 70.0.3538 (Linux 0.0.0) SLOW 7.311 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0
LOG: 537
Chrome 70.0.3538 (Linux 0.0.0) SLOW 2.881 secs: dropout test-webgl1 {"WEBGL_VERSION":1} Level = 0.75
LOG: 307
Chrome 70.0.3538 (Linux 0.0.0) SLOW 6.488 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0
LOG: 248
Chrome 70.0.3538 (Linux 0.0.0) SLOW 2.539 secs: dropout test-webgl2 {"WEBGL_VERSION":2} Level = 0.75
LOG: 1111
Chrome 70.0.3538 (Linux 0.0.0) SLOW 10.807 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0
LOG: 3456
Chrome 70.0.3538 (Linux 0.0.0) SLOW 9.003 secs: dropout test-cpu {"HAS_WEBGL":false} Level = 0.75

So for larger tensors it is equivalent to @caisq's version and improves code readability. I guess I was wrong; I will make the changes ;)

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @caisq and @nsthorat)

a discussion (no related file):
The branch is updated on GitHub, but Reviewable shows "Pull request branch is out-of-date with target branch and cannot be merged".


Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @caisq and @nsthorat)

a discussion (no related file):

Previously, VariableVasasMT (VariableVasas) wrote…

The branch is updated on GitHub, but Reviewable shows "Pull request branch is out-of-date with target branch and cannot be merged".

@dsmilkov please review; I have applied all the changes you asked for.


@VariableVasasMT
Contributor Author

@dsmilkov please review

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 1 of 1 files at r19.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @caisq and @nsthorat)

@VariableVasasMT
Contributor Author

@dsmilkov please review

Contributor

@dsmilkov dsmilkov left a comment

Thank you for the great contribution! And sorry for the delay. This is good to go! Wohoo!

Reviewed 1 of 1 files at r15, 1 of 1 files at r19.
Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @caisq and @nsthorat)


src/ops/mersenne-rand.ts, line 1 at r10 (raw file):

Previously, VariableVasasMT (VariableVasas) wrote…

This file is deleted but somehow still shows here; I think Reviewable is displaying it as deleted.

Sounds good!

@dsmilkov
Contributor

dsmilkov commented Apr 30, 2019

One quick ask: the build fails since you added a new unit test, and you now have to re-run yarn build in order to update the auto-generated src/tests.ts. Once you commit the updated src/tests.ts, reply here and we can merge this!

See log: https://console.cloud.google.com/gcr/builds/26197cc6-bcc6-4139-978d-ce62447724f6?project=834911136599

@VariableVasasMT
Contributor Author

VariableVasasMT commented Apr 30, 2019

One quick ask: the build fails since you added a new unit test, and you now have to re-run yarn build in order to update the auto-generated src/tests.ts. Once you commit the updated src/tests.ts, reply here and we can merge this!

See log: https://pantheon.corp.google.com/cloud-build/builds/26197cc6-bcc6-4139-978d-ce62447724f6?project=learnjs-174218

@dsmilkov fixed. The link you gave only accepts an @google.com address and I don't work for Google 😆, hence I have never been able to access the Google Cloud site to see what's wrong with the test; local builds were always a success.

Contributor Author

@VariableVasasMT VariableVasasMT left a comment

Reviewed 1 of 1 files at r21.
Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @caisq and @nsthorat)

@dsmilkov dsmilkov merged commit e2d24ca into tensorflow:master Apr 30, 2019
caisq pushed a commit to tensorflow/tfjs-layers that referenced this pull request Jun 8, 2019
Change

This PR cleans up a TODO. As the dropout op has been checked in through PR [tensorflow/tfjs-core#1343](tensorflow/tfjs-core#1343) and released, this PR safely switches the `K.dropout` implementation to tfjs-core's `dropout` op. The switch of the `K.dropout` implementation will benefit further dropout-related development.

Some changes made in the switch:
* As the `tfc.dropout` op asserts `NotImplementedError` for `noiseShape`, removed that assertion from `K.dropout`.
* As `seed` is already supported in `tf.randomUniform`, removed the `NotImplementedError` for it.
* The test cases for `SimpleRNN Tensor`, `GRU Tensor`, and `LSTM Tensor` previously spied on `tfc.randomUniform`, which `K.dropout` used. As the `tfc.randomUniform`-based implementation is gone, replaced the spy on `tfc.randomUniform` with a spy on `tfc.dropout`, which is called the same number of times.
* When an RNN cell calls `K.dropout`, it creates an anonymous Tensor, for example at [src/layers/recurrent.ts#L2563](https://github.com/tensorflow/tfjs-layers/blob/master/src/layers/recurrent.ts#L2563), which we cannot track, so removed the `toHaveBeenCalledWith` check.

DEV
dsmilkov pushed a commit that referenced this pull request Aug 9, 2019
This PR is a follow-up to [tfjs-core#1343](#1343), which added the `dropout` op, a feature initially requested in [tfjs#117](tensorflow/tfjs#117). This PR cleans up a TODO, `implement non default noise shape`. The implementation of noiseShape aligns with TensorFlow Python's dropout API.

The `noiseShape` feature will benefit further dropout-related development, for example making `tf.layers.dropout` support a `noiseShape` configuration; a rough sketch of the idea follows the references below.

**Relative PR:**
* Make dropout layer support non default noiseShape and seed [tensorflow/tfjs-layers#556](tensorflow/tfjs-layers#556)

**Reference:**
* [TensorFlow tf.nn.dropout documentation](https://www.tensorflow.org/api_docs/python/tf/nn/dropout)
* [TensorFlow tf.nn.dropout implementation](https://github.com/tensorflow/tensorflow/blob/r1.13/tensorflow/python/ops/nn_ops.py#L2982-L3054)
* [TensorFlow _get_noise_shape implementation](https://github.com/tensorflow/tensorflow/blob/r1.13/tensorflow/python/ops/nn_ops.py#L2903-L2925)
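
The sketch below only illustrates the noiseShape idea under the assumptions stated in its comments; it is not this PR's actual implementation:

```ts
import * as tf from '@tensorflow/tfjs-core';

// Hypothetical noiseShape handling: draw the random keep-mask with the
// (possibly smaller) noiseShape and let broadcasting apply it across the
// full tensor, mirroring TensorFlow Python's tf.nn.dropout behaviour.
function dropoutWithNoiseShape(
    x: tf.Tensor, rate: number, noiseShape?: number[], seed?: number): tf.Tensor {
  const maskShape = noiseShape != null ? noiseShape : x.shape;
  const keepProb = 1 - rate;
  // Same floor trick as TF Python: uniform values in [keepProb, 1 + keepProb)
  // floor to 0 (drop) or 1 (keep).
  const mask = tf.floor(
      tf.add(keepProb, tf.randomUniform(maskShape, 0, 1, 'float32', seed)));
  return tf.mul(tf.div(x, keepProb), mask);  // mask broadcasts over x
}
```

Used this way, a noiseShape such as [batch, 1, features] shares the same dropout mask across time steps, which is the typical recurrent-dropout pattern.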