Speedup matmul when the matrices are vectors #1379
Conversation
Hold on for a bit, I'm going to generalize this further for any matrix-vector, vector-matrix product based on the excellent observation of @DirkToewe in tensorflow/tfjs#881 (comment).
This is ready for review.
Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @nsthorat, and @annxingyuan)
src/ops/matmul_test.ts, line 293 at r2 (raw file):
}); it('batched matmul with the matrices being vectors', () => {
test for matrix x vector and vice versa?
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 1 of 1 approvals obtained (waiting on @dsmilkov and @annxingyuan)
Reviewable status: complete! 1 of 1 approvals obtained
src/ops/matmul_test.ts, line 293 at r2 (raw file):
Previously, nsthorat (Nikhil Thorat) wrote…
test for matrix x vector and vice versa?
Done.
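For context, tests along those lines might look roughly like this (a sketch, not the exact cases added in r3; `expectArraysClose` is the helper from `src/test_util.ts`):

```ts
// Sketch of matrix x vector and vector x matrix coverage for tf.matMul.
it('matmul with matrix x vector', () => {
  const a = tf.tensor2d([1, 2, 3, 4], [2, 2]);   // [[1, 2], [3, 4]]
  const v = tf.tensor2d([1, 1], [2, 1]);         // column vector
  const result = tf.matMul(a, v);                // shape [2, 1]
  expect(result.shape).toEqual([2, 1]);
  expectArraysClose(result, [3, 7]);
});

it('matmul with vector x matrix', () => {
  const v = tf.tensor2d([1, 1], [1, 2]);         // row vector
  const a = tf.tensor2d([1, 2, 3, 4], [2, 2]);
  const result = tf.matMul(v, a);                // shape [1, 2]
  expect(result.shape).toEqual([1, 2]);
  expectArraysClose(result, [4, 6]);
});
```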
This looks great! Will be cool to see whether any users of models in the wild report speedups.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @dsmilkov)
src/ops/matmul_test.ts, line 322 at r3 (raw file):
expectArraysClose(result, [100, 100, 100, 100, 100, 100]); });
This change has broken the Travis build and needs to be looked into (@dsmilkov).
The failure is specific to Safari.
Safari 11.1.2 (Mac OS X 10.13.6) matmul cpu {"HAS_WEBGL":false} batched matmul with matrix x vector FAILED
Error: Arrays differ: actual[1] = 99.89019775390625, expected[1] = 100.
Actual: 100.00019836425781,99.89019775390625,99.5501937866211,99.55020141601562,99.3102035522461,99.29019927978516.
Expected: 100,100,100,100,100,100. in src/test_util.js (line 71)
expectArraysClose@src/test_util.ts:100:10 <- src/test_util.js:71:28
src/ops/matmul_test.ts:310:22 <- src/ops/matmul_test.js:219:38
<Jasmine>
#1379 added a unit test that failed in Travis due to numerical precision: https://travis-ci.org/tensorflow/tfjs-core/jobs/451654174#L801
This PR:
- fixes that test
- also fixes `test-travis.sh` to make sure we don't forward webgl --> cpu except for the last browser run -- probably caused during conflict resolution of #1371 and #1372 (https://reviewable.io/reviews/tensorflow/tfjs-core/1371#-LQ_a2y92YYLRMiYWqhC)
DEV
In the WebGL backend, we can parallelize dot products by calling
tf.mul(x, y).sum()
-- this is because WebGL's sum() uses divide-and-conquer and runs in O(sqrt(N)) passes.
Have matmul() in WebGL call
tf.mul(x, y).sum()
when we have matrix x vector, vector x matrix, or vector x vector and the shared dimension is longer than 1000 (see benchmark below).
Fixes tensorflow/tfjs#881
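For intuition, here is a minimal userland sketch of the dispatch idea (not the actual backend change, which lives inside the WebGL matMul kernel; `SHARED_DIM_THRESHOLD` and `dotViaMulSum` are made-up names for illustration):

```ts
import * as tf from '@tensorflow/tfjs-core';

// Hypothetical cutoff; the benchmark below suggests the mul+sum path starts
// winning once the shared dimension exceeds roughly 1000.
const SHARED_DIM_THRESHOLD = 1000;

// Sketch: compute a [1, K] x [K, 1] product as an elementwise multiply
// followed by sum(), which the WebGL backend reduces divide-and-conquer
// style in O(sqrt(K)) passes instead of one O(K) loop per output element.
function dotViaMulSum(a: tf.Tensor2D /* [1, K] */, b: tf.Tensor2D /* [K, 1] */): tf.Tensor {
  const k = a.shape[1];
  if (k <= SHARED_DIM_THRESHOLD) {
    return tf.matMul(a, b);  // small K: the regular matmul kernel is fine
  }
  // Flatten both sides to [K] so mul() stays elementwise (no broadcast to
  // [K, K]), then reduce and restore the [1, 1] result shape.
  return tf.mul(a.reshape([k]), b.reshape([k])).sum().reshape([1, 1]);
}
```

The matrix x vector and vector x matrix cases follow the same pattern: broadcast-multiply the vector against the matrix and reduce along the shared axis with `sum(axis)`.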
Benchmark of [1,K] x [K,1] for different Ks:
=== OLD ===
LOG: 'K = 1, Took 0.72 ms'
LOG: 'K = 10, Took 0.67 ms'
LOG: 'K = 100, Took 0.59 ms'
LOG: 'K = 1000, Took 0.64 ms' --- slower after K>1000.
LOG: 'K = 10000, Took 1.18 ms'
LOG: 'K = 100000, Took 10.66 ms'
LOG: 'K = 1000000, Took 110.90 ms'
=== NEW ===
LOG: 'K = 1, Took 0.62 ms'
LOG: 'K = 10, Took 0.62 ms'
LOG: 'K = 100, Took 0.67 ms'
LOG: 'K = 1000, Took 0.67 ms' --- faster after K>1000.
LOG: 'K = 10000, Took 1.12 ms'
LOG: 'K = 100000, Took 0.88 ms'
LOG: 'K = 1000000, Took 0.94 ms'
Benchmarked mobilenet, coco-ssd, and posenet -- no perf impact there.
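A micro-benchmark along those lines can be reproduced with something like the following (a sketch, not the exact harness used above; `tf.time` reports wall and kernel time for the provided function):

```ts
import * as tf from '@tensorflow/tfjs-core';

// Rough sketch of the [1,K] x [K,1] timing loop above.
async function benchDot(k: number): Promise<number> {
  const a = tf.randomNormal([1, k]) as tf.Tensor2D;
  const b = tf.randomNormal([k, 1]) as tf.Tensor2D;

  // Warm up once so shader compilation and texture uploads are not timed.
  const warm = tf.matMul(a, b);
  await warm.data();
  warm.dispose();

  const info = await tf.time(() => tf.matMul(a, b).dispose());
  a.dispose();
  b.dispose();
  return info.wallMs;
}

async function run() {
  for (const k of [1, 10, 100, 1000, 10000, 100000, 1000000]) {
    console.log(`K = ${k}, Took ${(await benchDot(k)).toFixed(2)} ms`);
  }
}
```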
PERF