This tutorial is based on TensorFlow for Mobile Poets.
The optimization consists of three steps:
- Basic optimization for inference
- Quantization (reduces the compressed size of the graph)
- Memory mapping (improves stability by reducing memory pressure)
This tutorial uses retrained_graph.pb from the TensorFlow image retraining example.
Let's start with the easy way.
Easy way
Download the TensorFlow sources:
git clone -b 0.12.1 --depth=1 https://github.com/tensorflow/tensorflow.git
Optimize the graph:
python ./tensorflow/tensorflow/python/tools/optimize_for_inference.py \
--input=retrained_graph.pb \
--output=optimized_graph.pb \
--frozen_graph=True \
--input_names=Mul \
--output_names=final_result
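Conceptually, optimize_for_inference keeps only the ops needed to get from the named input (Mul) to the named output (final_result), dropping training-only nodes such as dropout and gradient ops (it also folds some ops, e.g. batch normalization, which is not shown here). A minimal sketch of that pruning idea in plain Python — the dict-based graph and node names below are made up for illustration; the real tool operates on a GraphDef protobuf:

```python
# Toy graph: node name -> list of its input nodes (stand-in for a GraphDef).
graph = {
    "Mul": [],
    "conv": ["Mul"],
    "final_result": ["conv"],
    "dropout": ["conv"],           # training-only
    "gradients": ["final_result"], # training-only
}

def prune_for_inference(graph, output):
    """Keep only nodes reachable backwards from the output node."""
    keep, stack = set(), [output]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(graph[node])
    return {name: inputs for name, inputs in graph.items() if name in keep}

print(sorted(prune_for_inference(graph, "final_result")))
# → ['Mul', 'conv', 'final_result'] — training-only nodes are gone
```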
Quantize the graph:
python ./tensorflow/tensorflow/tools/quantization/quantize_graph.py \
--input=optimized_graph.pb \
--output=rounded_graph.pb \
--output_node_names=final_result \
--mode=weights_rounded
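Why does weights_rounded shrink the compressed graph? The file stays the same size uncompressed, but snapping each weight to one of 256 levels makes the byte stream far more repetitive, so gzip (and app-store compression) can squeeze it much harder. A self-contained sketch of the effect on simulated weights — no real graph involved:

```python
import gzip
import random
import struct

random.seed(0)

# Simulate 10,000 float32 weights.
weights = [random.gauss(0.0, 1.0) for _ in range(10000)]
raw = struct.pack(f"{len(weights)}f", *weights)

# Round each weight to one of 256 evenly spaced levels,
# mimicking what --mode=weights_rounded does.
lo, hi = min(weights), max(weights)
step = (hi - lo) / 255
rounded = [lo + round((w - lo) / step) * step for w in weights]
rounded_raw = struct.pack(f"{len(rounded)}f", *rounded)

print(len(gzip.compress(raw)), len(gzip.compress(rounded_raw)))
# Same uncompressed size, but the rounded weights compress far better.
```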
Apply memory mapping in a Docker container, using a pre-built binary:
# run docker container
docker run --rm -it -v $PWD:/tf_files ubuntu:16.04
# download pre-built binary
apt-get update && apt-get install -y wget
wget https://github.com/dato-ml/stuff/releases/download/0.0.1/convert_graphdef_memmapped_format_0.12.1.tar.gz
tar -zxf convert_graphdef_memmapped_format_0.12.1.tar.gz
# memory mapping
./convert_graphdef_memmapped_format_0.12.1 \
--in_graph=./tf_files/rounded_graph.pb \
--out_graph=./tf_files/mmapped_graph.pb
exit
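The memmapped format lets TensorFlow map the weights straight from the file instead of parsing the whole protobuf into heap memory, which is what makes low-memory mobile devices happier. The stdlib-only sketch below (a dummy file, not a real graph) shows the difference between copying a file into memory and mapping it:

```python
import mmap
import os
import tempfile

# Write a dummy "weights" file (a stand-in for mmapped_graph.pb).
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(b"\x01" * (1 << 20))  # 1 MiB of fake weight data

# Reading normally copies the entire file into process memory...
with open(path, "rb") as f:
    copied = f.read()

# ...while mmap lets the OS page data in lazily, on demand,
# and share those pages between processes.
with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    print(mapped[0], len(mapped))
    mapped.close()
```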
Hard way
Take this route only if you enjoy long builds.
Note: the result will be the same.
# run docker container
docker run -it -p 8888:8888 -v $PWD:/tf_files tensorflow/tensorflow:0.12.1-devel
cd /tensorflow/
# fix dependency url
sed -i 's_http://zlib.net/zlib-1.2.8.tar.gz_http://zlib.net/fossils/zlib-1.2.8.tar.gz_' tensorflow/workspace.bzl
# build optimization graph tool
bazel build tensorflow/python/tools:optimize_for_inference \
-c opt --copt=-mavx --verbose_failures \
--local_resources 2048,2.0,1.0 -j 1
# optimize graph
bazel-bin/tensorflow/python/tools/optimize_for_inference \
--input=/tf_files/retrained_graph.pb \
--output=/tf_files/optimized_graph.pb \
--input_names=Mul \
--output_names=final_result
# build quantization graph tool
bazel build tensorflow/tools/quantization:quantize_graph \
-c opt --copt=-mavx --verbose_failures \
--local_resources 2048,2.0,1.0 -j 1
# quantize graph
bazel-bin/tensorflow/tools/quantization/quantize_graph \
--input=/tf_files/optimized_graph.pb \
--output=/tf_files/rounded_graph.pb \
--output_node_names=final_result \
--mode=weights_rounded
# build memory mapping tool
bazel build tensorflow/contrib/util:convert_graphdef_memmapped_format \
-c opt --copt=-mavx --verbose_failures \
--local_resources 2048,2.0,1.0 -j 1
# memory mapping
bazel-bin/tensorflow/contrib/util/convert_graphdef_memmapped_format \
--in_graph=/tf_files/rounded_graph.pb \
--out_graph=/tf_files/mmapped_graph.pb
exit
The resulting mmapped_graph.pb can now be used in your mobile apps.