Running TensorFlow with Docker on GCP

Provision Virtual Machine

$ gcloud auth login

$ gcloud config set project machine-learning-000000 # Your project id

$ gcloud beta compute \
  addresses create mlvm \
  --region=us-east1

$ MLVM_IP="$(gcloud beta compute \
  addresses describe mlvm \
  --region=us-east1 \
  | head -n1 | awk '{print $2}')"

$ gcloud beta compute \
  instances create mlvm \
  --zone=us-east1-b \
  --machine-type=n1-standard-2 \
  --subnet=default \
  --network-tier=PREMIUM \
  --address="$MLVM_IP" \
  --maintenance-policy=TERMINATE \
  --no-service-account \
  --no-scopes \
  --accelerator=type=nvidia-tesla-p100,count=1 \
  --image=centos-7-v20181011 \
  --image-project=centos-cloud \
  --boot-disk-size=40GB \
  --boot-disk-type=pd-standard

Configure Virtual Machine

$ gcloud beta compute ssh user@mlvm

$ sudo su

$ cd ~/

$ curl \
  > /etc/yum.repos.d/docker-ce.repo

$ curl \
  > /etc/yum.repos.d/nvidia-docker.repo

$ yum install --assumeyes \
  "@Development Tools" \
  "kernel-devel-$(uname -r)" \
  "kernel-headers-$(uname -r)" \
  "docker-ce-18.06.1" \

$ curl \

$ sh --silent

$ systemctl enable docker

$ systemctl start docker

$ docker run \
  --runtime=nvidia \
  -it \
  --rm \
  tensorflow/tensorflow:1.11.0-devel-gpu \
  python -c "import tensorflow as tf; print(tf.contrib.eager.num_gpus())"


Destroy Virtual Machine

$ exit # Exit from 'sudo su'

$ exit # Exit from 'gcloud beta compute ssh user@mlvm'

$ gcloud beta compute \
  addresses delete mlvm \
  --region=us-east1

$ gcloud beta compute \
  instances delete mlvm \
  --zone=us-east1-b

Benchmarking Ruby with GCC (4.4, 4.7, 4.8, 4.9) and Clang (3.2, 3.3, 3.4, 3.5)

This post is partially inspired by Braulio Bhavamitra’s comments about Ruby being faster when compiled with Clang rather than GCC and partially by Brendan Gregg’s comments about compiler optimisation during his Flame Graphs talk at USENIX LISA13 (0:33:30).

In short, I wanted to look at what kind of performance we are leaving on the table by not taking advantage of 1) the newest compiler versions and 2) the most aggressive compiler optimizations. This is especially pertinent to those of us deploying applications on PaaS infrastructure, where we often have zero control over such things. Does the cost-benefit analysis still work out the same when you take into account a 10/20/30% performance hit?

All tests were run on AWS from an m3.medium EC2 instance and the AMI used was a modified copy of one of my weekly generated Gentoo Linux AMIs. The version of Ruby was 2.1 while the tests themselves are from Antonio Cangiano’s Ruby Benchmark Suite. The tooling used to run them is available on my GitHub if you want to try this out for yourself.

The full test suite was run for each of the following compiler variants; O3 was not used with Clang since it only adds a single additional flag:

  • GCC 4.4 with O2 – Ships with Ubuntu 10.04 (Lucid) & RHEL/CentOS 6
  • GCC 4.4 with O3
  • GCC 4.7 with O2 – Ships with Debian 7 (Wheezy) & Ubuntu 12.04 (Precise)
  • GCC 4.7 with O3
  • GCC 4.8 with O2 – Ships with Ubuntu 14.04 (Trusty) & RHEL/CentOS 7
  • GCC 4.8 with O3
  • GCC 4.9 with O2 – Ships with Debian 8 (Jessie)
  • GCC 4.9 with O3
  • Clang 3.2 with O2
  • Clang 3.3 with O2
  • Clang 3.4 with O2
  • Clang 3.5 with O2

Each variant was then given a number of points per test based on its ranking: 0 points to the variant which performed best, 1 to the second best, and so on, until 11 points were given to the variant which performed worst.

These scores were then added up per variant and plotted onto a bar graph to try and visualize performance per variant.
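The ranking-and-totals scheme described above can be sketched in a few lines of Python. The test names, variant names, and timings below are hypothetical, purely for illustration:

```python
# Rank-based scoring: for each test, sort variants by wall-clock time
# (fastest first) and award points equal to the rank, 0 = best.
# Lower totals therefore indicate better overall performance.
def score(results):
    """results: {test: {variant: seconds}} -> {variant: total points}"""
    totals = {}
    for timings in results.values():
        for points, variant in enumerate(sorted(timings, key=timings.get)):
            totals[variant] = totals.get(variant, 0) + points
    return totals

# Hypothetical timings for two tests across three variants.
results = {
    'test_one': {'gcc-4.9 O2': 1.10, 'clang-3.4 O2': 1.30, 'gcc-4.4 O2': 1.45},
    'test_two': {'gcc-4.9 O2': 0.80, 'clang-3.4 O2': 0.90, 'gcc-4.4 O2': 0.95},
}
print(score(results))  # {'gcc-4.9 O2': 0, 'clang-3.4 O2': 2, 'gcc-4.4 O2': 4}
```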

From this we can determine that:

  1. Your choice of compiler does have a non-negligible effect on the performance of your runtime environment.
  2. Modern versions of GCC (4.7 & 4.8) and Clang (3.2 & 3.3) have very similar performance.
  3. Clang 3.4 seems to suffer from some performance regressions in this context.
  4. The latest version of GCC (4.9) is ahead by a clear margin.
  5. All O3 variants except GCC 4.8 performed worse than their O2 counterparts. This is not that unusual; using O3 will often degrade performance or
    even break an application altogether. However, the default Makefile shipped with Ruby 1.9.3 and above uses O3, which appears to hurt performance.

Of course, the standard disclaimers apply: benchmarking correctly is hard, you may not see the same results in your specific environment, do not immediately recompile everything in prod with GCC 4.9, and so on.


Lots of people asked to see the raw data plotted as well as the relative performance, so here it is. For each test the average score across all variants was calculated and used as the baseline, marked as 0. Then for each test/variant a percentage was calculated showing how much faster or slower it was than the baseline.

For example on test eight GCC 4.9 O2 was 7% faster than the baseline while Clang 3.5 was 2% faster than the baseline. From this we can infer that GCC 4.9 O2 was 5% faster than Clang 3.5 in that test.
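That baseline arithmetic can be sketched as follows (the timings are hypothetical, chosen only to reproduce the 7%/2% figures mentioned above):

```python
# For one test: the baseline is the mean time across all variants, and
# each variant's score is how much faster (+) or slower (-) it ran
# relative to that mean, as a percentage.
def percent_vs_baseline(timings):
    baseline = sum(timings.values()) / len(timings)
    return {v: round((baseline - t) / baseline * 100, 1)
            for v, t in timings.items()}

# Hypothetical timings for a single test (seconds, lower is better).
timings = {'gcc-4.9 O2': 0.93, 'clang-3.5 O2': 0.98, 'gcc-4.4 O2': 1.09}
print(percent_vs_baseline(timings))
# {'gcc-4.9 O2': 7.0, 'clang-3.5 O2': 2.0, 'gcc-4.4 O2': -9.0}
```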

Since this makes the graph very cluttered, it is best to select only a few variants at once; you can also pan and zoom.

Listing EC2 instances in all regions

When working with EC2 instances across multiple regions I've found it's nearly impossible to get a good overview of what is running where. This can be especially annoying when you are automatically launching a number of short-lived instances.

To avoid having to go through nine different web pages to see what I currently have running, I found it easier to use the API and list active instances from the CLI.

Install dependencies:

$ gem install aws-sdk pmap



require 'aws-sdk'
require 'pmap'

def ec2(region = 'us-east-1')
    AWS::EC2.new(
        access_key_id: ENV['AWS_ACCESS_KEY'],
        secret_access_key: ENV['AWS_SECRET_KEY'],
        region: region
    )
end

def list_instances
    instances = []
    mutex = Mutex.new
    # Query every region in parallel; pmap's peach keeps this fast.
    ec2.regions.peach do |region|
        ec2(region.name).instances.each do |instance|
            next if instance.status == :terminated
            mutex.synchronize { instances << instance }
        end
    end
    instances
end

list_instances.each do |instance|
    puts "#{instance.id}\t\t#{instance.availability_zone}\t\t#{instance.status}\t\t#{instance.ip_address}"
end

Listing instances:

$ aws-list

i-16b78754  eu-west-1a          running
i-0025e1e6  eu-west-1a          running
i-3926e2df  eu-west-1a          running
i-4924e0af  eu-west-1a          running
i-c424e022  eu-west-1a          running
i-0c25e1ea  eu-west-1a          running
i-9c25e17a  eu-west-1a          running
i-4b24e0ad  eu-west-1a          running
i-33e929f2  eu-central-1b       running
i-c324e025  eu-west-1a          running
i-3f26e2d9  eu-west-1a          running
i-8027e366  eu-west-1a          running
i-0d25e1eb  eu-west-1a          running
i-d718edd9  us-west-2c          running
i-0c2028e6  us-east-1a          running
i-5b95e54e  sa-east-1a          running
i-2dad38de  ap-northeast-1a     running
i-625a80af  ap-southeast-1a     running
i-a06e006f  ap-southeast-2a     running
i-2dbce5e5  us-west-1a          running

Installing Vagrant in non-supported environments

Get sources:

$ git clone
$ cd vagrant
$ git checkout tags/v1.6.5

Install dependencies:

$ gem install bundler -v '< 1.7.0'
$ bundle install

Patch Vagrant[1][2][3]:

diff --git a/bin/vagrant b/bin/vagrant
index 21630e1..5e24279 100755
--- a/bin/vagrant
+++ b/bin/vagrant
@@ -66,6 +66,8 @@ end

 # Setup our dependencies by initializing Bundler. If we're using plugins,
 # then also initialize the paths to the plugins.
+load_path = []
+$LOAD_PATH.each { |path| load_path << path }
 require "bundler"
   Bundler.setup(:default, :plugins)
@@ -94,6 +96,7 @@ rescue Bundler::VersionConflict => e
   $stderr.puts e.message
   exit 1
+load_path.each { |path| $LOAD_PATH.push(path) unless $LOAD_PATH.include?(path) }

 # Stdout/stderr should not buffer output
 $stdout.sync = true
@@ -164,11 +167,6 @@ begin
   logger.debug("Creating Vagrant environment")
   env =

-  if !Vagrant.in_installer? && !Vagrant.very_quiet?
-    # If we're not in the installer, warn.
-    env.ui.warn(I18n.t("vagrant.general.not_in_installer") + "\n", prefix: false)
-  end
     # Execute the CLI interface, and exit with the proper error code
     exit_status = env.cli(argv)
diff --git a/lib/vagrant/bundler.rb b/lib/vagrant/bundler.rb
index 05867da..54f9fb8 100644
--- a/lib/vagrant/bundler.rb
+++ b/lib/vagrant/bundler.rb
@@ -18,8 +18,7 @@ module Vagrant

     def initialize
-      @enabled = true if ENV["VAGRANT_INSTALLER_ENV"] ||
+      @enabled  = true
       @enabled  = !::Bundler::SharedHelpers.in_bundle? if !@enabled
       @monitor  =


Test and install:

$ rake test:unit
$ rake install

[1]: Without this patch Vagrant will give the following warning:

$ vagrant status

You appear to be running Vagrant outside of the official installers.
Note that the installers are what ensure that Vagrant has all required
dependencies, and Vagrant assumes that these dependencies exist. By
running outside of the installer environment, Vagrant may not function
properly. To remove this warning, install Vagrant using one of the
official packages from

[2]: Without this patch Vagrant will give the following error:

$ vagrant plugin install vagrant-aws

Installing the 'vagrant-aws' plugin. This can take a few minutes...
Vagrant's built-in bundler management mechanism is disabled because
Vagrant is running in an external bundler environment. In these
cases, plugin management does not work with Vagrant. To install
plugins, use your own Gemfile. To load plugins, either put the
plugins in the `plugins` group in your Gemfile or manually require
them in a Vagrantfile.

[3]: Without this patch Vagrant will give an error when running in a directory containing a Gemfile.


Installing ESXi 5.5 with 4GB of RAM

VMware’s ESXi 5.5 increases the recommended memory requirement from 4GB to 8GB, with their own System Requirements document stating that:

“You have 4GB RAM. This is the minimum required to install ESXi 5.5. Provide at least 8GB of RAM to take full advantage of ESXi features and run virtual machines in typical production environments.”

However, when installing ESXi on a system with 4GB of RAM, you will receive an error along the lines of:

<MEMORY_SIZE ERROR: This host has 3.71 GiB of RAM. 3.97 GiB are needed>

You’ll notice that the people writing the System Requirements document are using the SI unit, the gigabyte (GB), while those writing the ESXi installer are using the binary unit, the gibibyte (GiB). As such, ESXi does not require 4,000,000,000 bytes of RAM but 4,294,967,296.
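A quick back-of-the-envelope check of the two units (a Python sketch, just to make the gap concrete):

```python
GB  = 1000 ** 3  # SI gigabyte, as used in the System Requirements document
GiB = 1024 ** 3  # binary gibibyte, as used by the installer's memory check

print(4 * GB)            # 4000000000 bytes
print(4 * GiB)           # 4294967296 bytes, what the installer actually wants
print(4 * GiB - 4 * GB)  # 294967296 bytes, roughly a 295MB shortfall
```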

Luckily the fix is easy: we can modify the minimum amount of RAM the installer checks for, and it will install without issue.

Switch to the virtual terminal by hitting Alt+F1 and log in as ‘root’, leaving the password field blank.

After logging in you need to tweak the permissions on

$ cd /usr/lib/vmware/weasel/util/
$ rm upgrade_precheck.pyc
$ cp
$ cp
$ chmod 777

Open up in vi and replace:

MEM_MIN_SIZE = (4 * 1024 - 32) * SIZE_MiB

with:

MEM_MIN_SIZE = (2 * 1024 - 32) * SIZE_MiB
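As a sanity check, the figures in the installer's error message fall straight out of this formula (a Python sketch mirroring the SIZE_MiB constant from the weasel script):

```python
SIZE_MiB = 1024 * 1024  # bytes per MiB, as in the weasel precheck script

original = (4 * 1024 - 32) * SIZE_MiB  # the shipped MEM_MIN_SIZE
patched  = (2 * 1024 - 32) * SIZE_MiB  # after changing the 4 to a 2

print(original / 1024 ** 3)  # 3.96875 -> the "3.97 GiB are needed" message
print(patched / 1024 ** 3)   # 1.96875 -> low enough for a 4GB host
```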

Then restart the ESXi installer by killing the weasel process.

$ ps -c | grep weasel
$ kill 12345

You will automatically get switched away from the virtual terminal and can continue the installation.