Commit 4364390

Ivan Bogatyy authored and calberti committed
Release DRAGNN bulk networks (tensorflow#2785)
* Release DRAGNN bulk networks
1 parent 638fd75 commit 4364390

166 files changed: +30228 -1493 lines changed

research/syntaxnet/Dockerfile

Lines changed: 7 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,4 @@
1-
# Java baseimage, for Bazel.
2-
FROM openjdk:8
1+
FROM ubuntu:16.10
32

43
ENV SYNTAXNETDIR=/opt/tensorflow PATH=$PATH:/root/bin
54

@@ -21,13 +20,15 @@ RUN mkdir -p $SYNTAXNETDIR \
2120
libopenblas-dev \
2221
libpng-dev \
2322
libxft-dev \
24-
patch \
23+
openjdk-8-jdk \
2524
python-dev \
2625
python-mock \
2726
python-pip \
2827
python2.7 \
2928
swig \
29+
unzip \
3030
vim \
31+
wget \
3132
zlib1g-dev \
3233
&& apt-get clean \
3334
&& (rm -f /var/cache/apt/archives/*.deb \
@@ -55,7 +56,7 @@ RUN python -m pip install \
5556
--py --sys-prefix widgetsnbextension \
5657
&& rm -rf /root/.cache/pip /tmp/pip*
5758

58-
# Installs the latest version of Bazel.
59+
# Installs Bazel.
5960
RUN wget --quiet https://github.com/bazelbuild/bazel/releases/download/0.5.4/bazel-0.5.4-installer-linux-x86_64.sh \
6061
&& chmod +x bazel-0.5.4-installer-linux-x86_64.sh \
6162
&& ./bazel-0.5.4-installer-linux-x86_64.sh \
@@ -65,13 +66,11 @@ COPY WORKSPACE $SYNTAXNETDIR/syntaxnet/WORKSPACE
6566
COPY tools/bazel.rc $SYNTAXNETDIR/syntaxnet/tools/bazel.rc
6667
COPY tensorflow $SYNTAXNETDIR/syntaxnet/tensorflow
6768

68-
# Workaround solving the PYTHON_BIN_PATH not found problem
69-
ENV PYTHON_BIN_PATH=/usr/bin/python
7069
# Compile common TensorFlow targets, which don't depend on DRAGNN / SyntaxNet
7170
# source. This makes it more convenient to re-compile DRAGNN / SyntaxNet for
7271
# development (though not as convenient as the docker-devel scripts).
7372
RUN cd $SYNTAXNETDIR/syntaxnet/tensorflow \
74-
&& ./configure CPU \
73+
&& tensorflow/tools/ci_build/builds/configured CPU \
7574
&& cd $SYNTAXNETDIR/syntaxnet \
7675
&& bazel build -c opt @org_tensorflow//tensorflow:tensorflow_py
7776

@@ -92,4 +91,4 @@ EXPOSE 8888
9291
COPY examples $SYNTAXNETDIR/syntaxnet/examples
9392
# Todo: Move this earlier in the file (don't want to invalidate caches for now).
9493

95-
CMD /bin/bash -c "bazel-bin/dragnn/tools/oss_notebook_launcher notebook --debug --notebook-dir=/opt/tensorflow/syntaxnet/examples --allow-root"
94+
CMD /bin/bash -c "bazel-bin/dragnn/tools/oss_notebook_launcher notebook --debug --notebook-dir=/opt/tensorflow/syntaxnet/examples"

research/syntaxnet/README.md

Lines changed: 35 additions & 7 deletions
@@ -23,8 +23,8 @@ This repository is largely divided into two sub-packages:
2323
[documentation](g3doc/DRAGNN.md),
2424
[paper](https://arxiv.org/pdf/1703.04474.pdf)** implements Dynamic Recurrent
2525
Acyclic Graphical Neural Networks (DRAGNN), a framework for building
26-
multi-task, fully dynamically constructed computation graphs. Practically, we
27-
use DRAGNN to extend our prior work from [Andor et al.
26+
multi-task, fully dynamically constructed computation graphs. Practically,
27+
we use DRAGNN to extend our prior work from [Andor et al.
2828
(2016)](http://arxiv.org/abs/1603.06042) with end-to-end, deep recurrent
2929
models and to provide a much easier to use interface to SyntaxNet. *DRAGNN
3030
is designed first and foremost as a Python library, and therefore much
@@ -54,20 +54,47 @@ There are three ways to use SyntaxNet:
5454

5555
### Docker installation
5656

57+
_This process takes ~10 minutes._
58+
5759
The simplest way to get started with DRAGNN is by loading our Docker container.
5860
[Here](g3doc/CLOUD.md) is a tutorial for running the DRAGNN container on
5961
[GCP](https://cloud.google.com) (just as applicable to your own computer).
6062

63+
### Ubuntu 16.10+ binary installation
64+
65+
_This process takes ~5 minutes, but is only compatible with Linux using GNU libc
66+
3.4.22 and above (e.g. Ubuntu 16.10)._
67+
68+
Binary wheel packages are provided for TensorFlow and SyntaxNet. If you do not
69+
need to write new binary TensorFlow ops, these should suffice.
70+
71+
* `apt-get install -y graphviz libgraphviz-dev libopenblas-base libpng16-16
72+
libxft2 python-pip python-mock`
73+
* `pip install pygraphviz
74+
--install-option="--include-path=/usr/include/graphviz"
75+
--install-option="--library-path=/usr/lib/graphviz/"`
76+
* `pip install 'ipython<6.0' protobuf numpy scipy jupyter
77+
syntaxnet-with-tensorflow`
78+
* `python -m jupyter_core.command nbextension enable --py --sys-prefix
79+
widgetsnbextension`
80+
81+
You can test that binary modules can be successfully imported by running,
82+
83+
* `python -c 'import dragnn.python.load_dragnn_cc_impl,
84+
syntaxnet.load_parser_ops'`
85+
6186
### Manual installation
6287

88+
_This process takes 1-2 hours._
89+
6390
Running and training SyntaxNet/DRAGNN models requires building this package from
6491
source. You'll need to install:
6592

6693
* python 2.7:
6794
* Python 3 support is not available yet
68-
* bazel:
95+
* bazel 0.5.4:
6996
* Follow the instructions [here](http://bazel.build/docs/install.html)
70-
* Alternately, Download bazel <.deb> from
97+
* Alternately, Download bazel 0.5.4 <.deb> from
7198
[https://github.com/bazelbuild/bazel/releases](https://github.com/bazelbuild/bazel/releases)
7299
for your system configuration.
73100
* Install it using the command: sudo dpkg -i <.deb file>
@@ -103,9 +130,12 @@ following commands:
103130
bazel test --linkopt=-headerpad_max_install_names \
104131
dragnn/... syntaxnet/... util/utf8/...
105132
```
133+
106134
Bazel should complete reporting all tests passed.
107135

108-
Now you can install the SyntaxNet and DRAGNN Python modules with the following commands:
136+
Now you can install the SyntaxNet and DRAGNN Python modules with the following
137+
commands:
138+
109139
```shell
110140
mkdir /tmp/syntaxnet_pkg
111141
bazel-bin/dragnn/tools/build_pip_package --output-dir=/tmp/syntaxnet_pkg
@@ -116,8 +146,6 @@ Now you can install the SyntaxNet and DRAGNN Python modules with the following c
116146
To build SyntaxNet with GPU support please refer to the instructions in
117147
[issues/248](https://github.com/tensorflow/models/issues/248).
118148

119-
120-
121149
**Note:** If you are running Docker on OSX, make sure that you have enough
122150
memory allocated for your Docker VM.
123151

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
1+
FROM dragnn-oss-test-base:latest
2+
3+
RUN rm -rf \
4+
$SYNTAXNETDIR/syntaxnet/dragnn \
5+
$SYNTAXNETDIR/syntaxnet/syntaxnet \
6+
$SYNTAXNETDIR/syntaxnet/third_party \
7+
$SYNTAXNETDIR/syntaxnet/util/utf8
8+
COPY dragnn $SYNTAXNETDIR/syntaxnet/dragnn
9+
COPY syntaxnet $SYNTAXNETDIR/syntaxnet/syntaxnet
10+
COPY third_party $SYNTAXNETDIR/syntaxnet/third_party
11+
COPY util/utf8 $SYNTAXNETDIR/syntaxnet/util/utf8
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
1+
FROM ubuntu:16.10
2+
3+
ENV SYNTAXNETDIR=/opt/tensorflow PATH=$PATH:/root/bin
4+
5+
# Install system packages. This doesn't include everything the TensorFlow
6+
# dockerfile specifies, so if anything goes awry, maybe install more packages
7+
# from there. Also, running apt-get clean before further commands will make the
8+
# Docker images smaller.
9+
RUN mkdir -p $SYNTAXNETDIR \
10+
&& cd $SYNTAXNETDIR \
11+
&& apt-get update \
12+
&& apt-get install -y \
13+
file \
14+
git \
15+
graphviz \
16+
libcurl3-dev \
17+
libfreetype6-dev \
18+
libgraphviz-dev \
19+
liblapack-dev \
20+
libopenblas-dev \
21+
libpng-dev \
22+
libxft-dev \
23+
openjdk-8-jdk \
24+
python-dev \
25+
python-mock \
26+
python-pip \
27+
python2.7 \
28+
swig \
29+
unzip \
30+
vim \
31+
wget \
32+
zlib1g-dev \
33+
&& apt-get clean \
34+
&& (rm -f /var/cache/apt/archives/*.deb \
35+
/var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true)
36+
37+
# Install common Python dependencies. Similar to above, remove caches
38+
# afterwards to help keep Docker images smaller.
39+
RUN pip install --ignore-installed pip \
40+
&& python -m pip install numpy \
41+
&& rm -rf /root/.cache/pip /tmp/pip*
42+
RUN python -m pip install \
43+
asciitree \
44+
ipykernel \
45+
jupyter \
46+
matplotlib \
47+
pandas \
48+
protobuf \
49+
scipy \
50+
sklearn \
51+
&& python -m ipykernel.kernelspec \
52+
&& python -m pip install pygraphviz \
53+
--install-option="--include-path=/usr/include/graphviz" \
54+
--install-option="--library-path=/usr/lib/graphviz/" \
55+
&& python -m jupyter_core.command nbextension enable \
56+
--py --sys-prefix widgetsnbextension \
57+
&& rm -rf /root/.cache/pip /tmp/pip*
58+
59+
# Installs Bazel.
60+
RUN wget --quiet https://github.com/bazelbuild/bazel/releases/download/0.5.3/bazel-0.5.3-installer-linux-x86_64.sh \
61+
&& chmod +x bazel-0.5.3-installer-linux-x86_64.sh \
62+
&& JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ ./bazel-0.5.3-installer-linux-x86_64.sh \
63+
&& rm ./bazel-0.5.3-installer-linux-x86_64.sh
64+
65+
COPY WORKSPACE $SYNTAXNETDIR/syntaxnet/WORKSPACE
66+
COPY tools/bazel.rc $SYNTAXNETDIR/syntaxnet/tools/bazel.rc
67+
68+
# Compile common TensorFlow targets, which don't depend on DRAGNN / SyntaxNet
69+
# source. This makes it more convenient to re-compile DRAGNN / SyntaxNet for
70+
# development (though not as convenient as the docker-devel scripts).
71+
RUN cd $SYNTAXNETDIR/syntaxnet \
72+
&& git clone --branch r1.3 --recurse-submodules https://github.com/tensorflow/tensorflow \
73+
&& cd tensorflow \
74+
# This line removes a bad archive target which causes Tensorflow install
75+
# to fail.
76+
&& sed -i '\@https://github.com/google/protobuf/archive/0b059a3d8a8f8aa40dde7bea55edca4ec5dfea66.tar.gz@d' tensorflow/workspace.bzl \
77+
&& tensorflow/tools/ci_build/builds/configured CPU \\
78+
&& cd $SYNTAXNETDIR/syntaxnet \
79+
&& bazel build -c opt @org_tensorflow//tensorflow:tensorflow_py
80+
81+
# Just copy the code and run tests. The build and test flags differ enough that
82+
# doing a normal build of TensorFlow targets doesn't save much test time.
83+
WORKDIR $SYNTAXNETDIR/syntaxnet
84+
COPY dragnn $SYNTAXNETDIR/syntaxnet/dragnn
85+
COPY syntaxnet $SYNTAXNETDIR/syntaxnet/syntaxnet
86+
COPY third_party $SYNTAXNETDIR/syntaxnet/third_party
87+
COPY util/utf8 $SYNTAXNETDIR/syntaxnet/util/utf8
88+
89+
# Doesn't matter if the tests pass or not, since we're going to re-copy over the
90+
# code.
91+
RUN bazel test -c opt ... || true

research/syntaxnet/docker-devel/Dockerfile.min

Lines changed: 7 additions & 7 deletions
@@ -1,11 +1,9 @@
11
# You need to build wheels before building this image. Please consult
22
# docker-devel/README.txt.
3-
4-
# This is the base of the openjdk image.
53
#
64
# It might be more efficient to use a minimal distribution, like Alpine. But
75
# the upside of this being popular is that people might already have it.
8-
FROM buildpack-deps:jessie-curl
6+
FROM ubuntu:16.10
97

108
ENV SYNTAXNETDIR=/opt/tensorflow PATH=$PATH:/root/bin
119

@@ -19,7 +17,7 @@ RUN apt-get update \
1917
libgraphviz-dev \
2018
liblapack3 \
2119
libopenblas-base \
22-
libpng12-0 \
20+
libpng16-16 \
2321
libxft2 \
2422
python-dev \
2523
python-mock \
@@ -48,11 +46,13 @@ RUN python -m pip install \
4846
&& python -m pip install pygraphviz \
4947
--install-option="--include-path=/usr/include/graphviz" \
5048
--install-option="--library-path=/usr/lib/graphviz/" \
49+
&& python -m jupyter_core.command nbextension enable \
50+
--py --sys-prefix widgetsnbextension \
5151
&& rm -rf /root/.cache/pip /tmp/pip*
5252

53-
COPY syntaxnet_with_tensorflow-0.2-cp27-none-linux_x86_64.whl $SYNTAXNETDIR/
53+
COPY syntaxnet_with_tensorflow-0.2-cp27-cp27mu-linux_x86_64.whl $SYNTAXNETDIR/
5454
RUN python -m pip install \
55-
$SYNTAXNETDIR/syntaxnet_with_tensorflow-0.2-cp27-none-linux_x86_64.whl \
55+
$SYNTAXNETDIR/syntaxnet_with_tensorflow-0.2-cp27-cp27mu-linux_x86_64.whl \
5656
&& rm -rf /root/.cache/pip /tmp/pip*
5757

5858
# This makes the IP exposed actually "*"; we'll do host restrictions by passing
@@ -63,4 +63,4 @@ EXPOSE 8888
6363
# This does not need to be compiled, only copied.
6464
COPY examples $SYNTAXNETDIR/syntaxnet/examples
6565
# For some reason, this works if we run it in a bash shell :/ :/ :/
66-
CMD /bin/bash -c "python -m jupyter_core.command notebook --debug --notebook-dir=/opt/tensorflow/syntaxnet/examples"
66+
CMD /bin/bash -c "python -m jupyter_core.command notebook --debug --notebook-dir=/opt/tensorflow/syntaxnet/examples --allow-root"

research/syntaxnet/docker-devel/README.txt

Lines changed: 2 additions & 2 deletions
@@ -43,11 +43,11 @@ Step 3: Building the development image
4343

4444
First, ensure you have the file
4545

46-
syntaxnet_with_tensorflow-0.2-cp27-none-linux_x86_64.whl
46+
syntaxnet_with_tensorflow-0.2-cp27-cp27mu-linux_x86_64.whl
4747

4848
in your working directory, from step 2. Then run,
4949

50-
docker build -t dragnn-oss:latest-minimal -f docker-devel/Dockerfile.min
50+
docker build -t dragnn-oss:latest-minimal -f docker-devel/Dockerfile.min .
5151

5252
If the filename changes (e.g. you are on a different architecture), just update
5353
Dockerfile.min.
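Reviewer note: the wheel rename in this hunk (`cp27-none` to `cp27-cp27mu`) changes only the ABI tag. Wheel filenames follow the PEP 427 layout `name-version[-build]-python-abi-platform.whl`, and `cp27mu` denotes a CPython 2.7 build with pymalloc and wide (UCS-4) unicode. The sketch below splits the two filenames from the diff to make the difference explicit; `parse_wheel_filename` is a hypothetical helper written for illustration, not something this commit provides.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(suffix)].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel filename: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# The old and new filenames from this diff.
old = parse_wheel_filename(
    "syntaxnet_with_tensorflow-0.2-cp27-none-linux_x86_64.whl")
new = parse_wheel_filename(
    "syntaxnet_with_tensorflow-0.2-cp27-cp27mu-linux_x86_64.whl")

# Only the ABI tag differs between the two.
changed = sorted(key for key in old if old[key] != new[key])
print(changed)
```

If the wheel built in step 2 carries a different ABI or platform tag, that is the filename to substitute into Dockerfile.min.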

research/syntaxnet/dragnn/__init__.py

Whitespace-only changes.

research/syntaxnet/dragnn/components/stateless/BUILD

Lines changed: 0 additions & 1 deletion
@@ -10,7 +10,6 @@ cc_library(
1010
"//dragnn/core:component_registry",
1111
"//dragnn/core/interfaces:component",
1212
"//dragnn/core/interfaces:transition_state",
13-
"//dragnn/io:sentence_input_batch",
1413
"//dragnn/protos:data_proto",
1514
"//syntaxnet:base",
1615
],

research/syntaxnet/dragnn/components/stateless/stateless_component.cc

Lines changed: 14 additions & 12 deletions
@@ -16,7 +16,6 @@
1616
#include "dragnn/core/component_registry.h"
1717
#include "dragnn/core/interfaces/component.h"
1818
#include "dragnn/core/interfaces/transition_state.h"
19-
#include "dragnn/io/sentence_input_batch.h"
2019
#include "dragnn/protos/data.pb.h"
2120
#include "syntaxnet/base.h"
2221

@@ -25,7 +24,8 @@ namespace dragnn {
2524
namespace {
2625

2726
// A component that does not create its own transition states; instead, it
28-
// simply forwards the states of the previous component. Does not support all
27+
// simply forwards the states of the previous component. Requires that some
28+
// previous component has converted the input batch. Does not support all
2929
// methods. Intended for "compute-only" bulk components that only use linked
3030
// features, which use only a small subset of DRAGNN functionality.
3131
class StatelessComponent : public Component {
@@ -38,8 +38,7 @@ class StatelessComponent : public Component {
3838
void InitializeData(
3939
const std::vector<std::vector<const TransitionState *>> &parent_states,
4040
int max_beam_size, InputBatchCache *input_data) override {
41-
// Must use SentenceInputBatch to match SyntaxNetComponent.
42-
batch_size_ = input_data->GetAs<SentenceInputBatch>()->data()->size();
41+
batch_size_ = input_data->Size();
4342
beam_size_ = max_beam_size;
4443
parent_states_ = parent_states;
4544

@@ -84,31 +83,34 @@ class StatelessComponent : public Component {
8483
LOG(FATAL) << "[" << name_ << "] Method not supported";
8584
return nullptr;
8685
}
87-
void AdvanceFromPrediction(const float transition_matrix[],
88-
int matrix_length) override {
89-
LOG(FATAL) << "[" << name_ << "] Method not supported";
86+
bool AdvanceFromPrediction(const float *transition_matrix, int num_items,
87+
int num_actions) override {
88+
LOG(FATAL) << "[" << name_ << "] AdvanceFromPrediction not supported";
9089
}
9190
void AdvanceFromOracle() override {
92-
LOG(FATAL) << "[" << name_ << "] Method not supported";
91+
LOG(FATAL) << "[" << name_ << "] AdvanceFromOracle not supported";
9392
}
9493
std::vector<std::vector<int>> GetOracleLabels() const override {
9594
LOG(FATAL) << "[" << name_ << "] Method not supported";
96-
return {};
9795
}
9896
int GetFixedFeatures(std::function<int32 *(int)> allocate_indices,
9997
std::function<int64 *(int)> allocate_ids,
10098
std::function<float *(int)> allocate_weights,
10199
int channel_id) const override {
102100
LOG(FATAL) << "[" << name_ << "] Method not supported";
103-
return 0;
104101
}
105102
int BulkGetFixedFeatures(const BulkFeatureExtractor &extractor) override {
106103
LOG(FATAL) << "[" << name_ << "] Method not supported";
107-
return 0;
108104
}
105+
void BulkEmbedFixedFeatures(
106+
int batch_size_padding, int num_steps_padding, int output_array_size,
107+
const vector<const float *> &per_channel_embeddings,
108+
float *embedding_output) override {
109+
LOG(FATAL) << "[" << name_ << "] Method not supported";
110+
}
111+
109112
std::vector<LinkFeatures> GetRawLinkFeatures(int channel_id) const override {
110113
LOG(FATAL) << "[" << name_ << "] Method not supported";
111-
return {};
112114
}
113115
void AddTranslatedLinkFeaturesToTrace(
114116
const std::vector<LinkFeatures> &features, int channel_id) override {
